Gerhard Niklasch on Mon, 27 Apr 1998 21:00:59 +0200


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

64-bit benchmarker wanted


Any kind soul listening who has access to a 64-bit machine,
and a working PARI-GP 2.0.[>=5].alpha on it, and with a few
seconds CPU time to spare?

I'd like to see timings  (GP's minutes-and-milliseconds timer
will be accurate enough)  for each of the following loops:
#
for(n=(a=3^12*35)-2^13,a,factor(n);) \\ 25 bits
for(n=(a=3^15*5^3)-2^12,a,factor(n);) \\ 31 bits
for(n=(a=3^20*35^2)-2^11,a,factor(n);) \\ 41 bits
for(n=(a=3^21*35^3)-2^10,a,factor(n);) \\ 49 bits
for(n=(a=3*7^21-10^5)-2^9,a,factor(n);) \\ 61 bits
for(n=(a=135*7^22-10^6)-2^8,a,factor(n);) \\ 69 bits
for(n=(a=7^20*11^7-10^7)-2^7,a,factor(n);) \\ 81 bits
for(n=(a=495*11^24-10^8)-2^6,a,factor(n);) \\ 93 bits

and each of the same loops with `factor' replaced by `moebius',
with the primelimit default set to
1000000, 500000 (default), 250000, 125000, 60000, 30000, 15000,
7500, 2500, 800, 0 [effectively 257].

Actually, not all these 176 combinations will be of interest.
I really only want to know which primelimit setting gives the
shortest time, depending on the operand size, so with luck,
the readings for three different primelimit settings per loop
will suffice.

As a comparison point, on my P133, the best choices turn out to be:
25 bits --    2500  (pronounced minimum, probably a little above 2500)
31 bits --   15000
41 bits --   60000  (from here on, fairly flat and broad minima)
49 bits --   60000 for factor, 125000 for moebius
61 bits --  125000
69 bits --  125000
81 bits --  125000  (from here on, flat within a few %)
93 bits --  125000
(the moebius optimum tends to be slightly higher than that for factor;
this becomes more obvious when the operands get even larger).  I expect
the results on a 64-bit machine to be similar.  Depending on the speed
of the assembler division, the optima on an Alpha CPU (say) may lie even
lower than on the Pentium architecture.

The moral of all this is that for numbers small enough to be factored
by PARI within half a minute, you _don't_ help PARI's factoring engine
at all if you increase the primelimit -- in fact, you hinder it.  At
least for single-word numbers, I'm about to hardwire a limit  (depending
on the precise size)  into the code beyond which the precomputed primes
will be ignored.  One iteration of the 25-bits factor loop on my P133
takes on average about 700 microseconds at a limit of 2500, but more
than 20 milliseconds -- nearly 30 times as long! -- at the default
setting of 500000:  we're talking about more than a few per cent at
this end of the range.  At 31 bits, it's still a factor of about 10.

Thanks in advance, Gerhard