Ruud H.G. van Tol on Sat, 26 Nov 2022 10:21:21 +0100



Re: 3^k performance




On 2022-11-26 09:29, Karim Belabas wrote:
* Ruud H.G. van Tol [2022-11-26 08:47]:

A022330_1(n)=1+n+sum(k=1,n,logint(3^k,2))

A022330_2(n)=my(t=1);1+n+sum(k=1,n,logint(t*=3,2))

? A022330_1(10^5)
cpu time = 8,322 ms, real time = 8,352 ms.
%15 = 7924941755

? A022330_2(10^5)
cpu time = 215 ms, real time = 217 ms.
%16 = 7924941755

So about a 40x difference.

Is that worthwhile to look into?

This is expected:

(09:02) gp > for(k=1,10^5,3^k)
time = 8,409 ms.
(09:02) gp > my(t=1);for(k=1,10^5,t*=3)
time = 175 ms.

Assume x is a small integer (here x = 3).

Computing x^k as x * t where t = x^(k-1) is already known involves a
single multiplication by x, which is done in time linear in the size of t,
i.e. about s = k * log2(x) bits.
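To see the linear scaling directly, one can time a single multiplication
by 3 on operands of two different sizes (a sketch, not a benchmark; the
variable names are mine):

t1 = 3^(10^5 >> 1); t2 = t1^2;   \\ t2 has about twice as many bits as t1
for(i=1, 10^4, t1*3)             \\ takes some time T
for(i=1, 10^4, t2*3)             \\ expect roughly 2*T: cost is linear in the size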

Computing x^k from scratch involves essentially log2(k) consecutive
squarings of an increasingly large integer [ and also a few
multiplications by x whose cost is negligible ]. The final squaring
alone, namely (x^r)^2 where r = k >> 1, has a cost which is superlinear
in the size of x^r, i.e. about s / 2. Concretely:

(09:18) gp > t = 3^(10^5 >> 1);
(09:18) gp > for(i=1,10^4,t^2)
time = 1,805 ms.
(09:19) gp > T = t^2;
(09:19) gp > for(i=1,10^4,T^2)
time = 4,726 ms.

So doubling the size of t multiplies the squaring time by about 2.62 in
this range, which would correspond to a complexity M(s) ~ s^1.39
(where 1.39 ~ log2(2.62)) for the cost of a multiplication of size s.
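That exponent can be recovered directly from the two timings above (a
rough estimate rather than a careful benchmark):

log(4726/1805) / log(2)   \\ ~ 1.39, hence M(s) ~ s^1.39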

Now compare s and (s / 2)^1.39 for s = log2(3^10^5), the actual size
you're interested in:

(09:20) gp > s = log(3^10^5)/log(2);
(09:20) gp > (s/2)^1.39 / s
%3 = 40.698247890865989130716669156266722278

So a 40x difference is quite expected!

How to best approach that?

Never compute from scratch when previous results can be reused :-)
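For completeness, the same idea written out for a general small base x,
keeping the running power in t (just a sketch; the function name and the
extra parameter are mine, not from the thread):

A022330_gen(n, x=3) = my(t=1); 1 + n + sum(k=1, n, logint(t *= x, 2))

Each term then reuses the previous power instead of recomputing x^k from scratch.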

Sure, and that is kind of what I imagined was going on.
(I'm going through the source now.)

In the Perl compiler, we have a peephole-optimiser phase
that runs right after the main compiler phase
(the one that builds the AST), patching the tree where feasible.
Example: https://metacpan.org/pod/Devel::GoFaster

For the case at hand, I have no idea yet
whether it is feasible to optimise this
in the pre-compilation phase,
inside the compilation phase,
or right after it
(or indeed whether it is best left to the user).

Even an optional (!) warning
along the lines of "sub-optimal code, stupid"
would already help. :)
(A linter could already do that.)

-- Ruud