Paulo Ney de Souza on Tue, 24 Feb 2004 20:22:17 +0100



RE: Pattern Match within PARI/GP


There is a subtle difference between the two problems. The Human
Genome data is static: it is generated by sequencing machines,
written to disk once it is found, and it stays put. You can then
go at it with any tool and look for patterns. Our example problem
here is "dynamic", in the sense that I want to generate the first
1 million digits and, depending on the sequences found, go look a
bit further in the "second" million digits, and so on.

Going back and forth between PARI and Perl is exactly what is
breaking my neck in terms of speed, and it is my reason for trying
to do this inside PARI.
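
For illustration, the "dynamic" search could be sketched inside GP
roughly as follows. This is only a sketch under assumptions: Pi stands
in for whatever digits are actually being generated, findpat and
growsearch are hypothetical names, the block and limit values are
arbitrary, and digits() and the d[a..b] slice syntax need a reasonably
recent GP.

```gp
\\ Hypothetical helper: first index (1-based) at which the digit vector
\\ pat occurs in the digit vector d, scanning from index from onwards.
\\ Returns 0 if there is no match.
findpat(d, pat, from = 1) =
{
  my(m = #pat);
  for (i = from, #d - m + 1,
    if (d[i..i+m-1] == pat, return(i)));
  0
}

\\ Hypothetical driver: extend the digits of Pi block by block until
\\ pat is found or limit digits have been examined.
growsearch(pat, block = 1000, limit = 20000) =
{
  my(n = block, from = 1, d, pos);
  while (n <= limit,
    default(realprecision, n + 10);  \\ n digits plus a few guard digits
    d = digits(floor(Pi * 10^n));    \\ 3, 1, 4, 1, 5, ...
    pos = findpat(d, pat, from);
    if (pos, return(pos));
    from = max(1, #d - #pat + 2);    \\ rescan only the newly added tail
    n += block);
  0
}
```

A call such as growsearch([1,2,1,2,1,2]) would look for the digit run
121212. Note that the sketch rebuilds the whole digit vector at every
step and leaves realprecision changed on exit, so it shows the control
flow rather than an efficient implementation.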

   For what it's worth, this is precisely the sort of thing
   Perl excels at.  E.g. gp myfile.gp > out.txt; perl myperl.pl out.txt
   where myperl.pl does something like
   while (<>) { print if m/xyxyxy/; }
   Remember that Perl handles those massive datasets in the Human
   Genome Project.