hermann on Sun, 06 Aug 2023 21:28:56 +0200


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: efficient transformations between "sum of squares" and "sqrt(-1) (mod p)" for 1M-digit prime


Thanks for the caculations.

In addition to the GP script I made libgmpxx with libpari C++ version available:
https://gmplib.org/list-archives/gmp-discuss/2023-August/006908.html
https://github.com/Hermann-SW/RSA_numbers_factored/blob/main/c%2B%2B/sqrtm1.9383761_digit.largest_known_1mod4_prime.cc

From sum of squares to sqrtm1 C++ and GP runtimes are the same.
From sqrtm1 to sum of squares GP is faster, likely due to overhead of calling "halfgcdii()" from C++.

Regards,

Hermann.

On 2023-08-03 16:53, Joerg Arndt wrote:
Btw. if you happen to know the decomposition p = A^2 + B^2
then sqrt(-1) = +- A/B ( = -+ B/A )

Here is why, setting i = sqrt(-1):
p = A^2 + B^2 = A^2 - (i^2) * B^2 == 0
so (i^2) = A^2 / B^2, so i = +- A/B

This is especially nice if B = 1 or 2,
1^(-1) = 1 (of course) and 2^(-1) = (p+1)/2.

Best regards,   jj


* hermann@stamm-wilbrandt.de <hermann@stamm-wilbrandt.de> [Aug 03. 2023 16:10]:
On 2023-06-20 02:02, hermann@stamm-wilbrandt.de wrote:
> I learned on this mailing list about "halfgcd()" allowing to
> efficiently compute sum of squares given "sqrtm1 = sqrt(-1) (mod p)",
> and for computing "sqrtm1" from sum of squares as "x*y^(-1) (mod p)".
>
I just finished big project to determine "sqrt(-1) (mod p)" for largest
known 9,383,761-digit prime p =1 (mod 4):
https://github.com/Hermann-SW/9383761-digit-prime#readme

I stopped the 75.4 days expected runtime run after 9d 20.9h, because I
learned about llr tool based on gwnum library, and computed "sqrt(-1) (mod p)" in 13.2h plus 10s post processing. That is 137x faster than with my
libgmpxx based code.

gwnum library is x86 specific, but allows for huge speedups using AVX512 instructions, and multi-core processing a single square or square+mult (mod
p). And 31,172,177 such operations had to be done ...

New PARI/GP script with that sqrtm1 constant:
https://github.com/Hermann-SW/RSA_numbers_factored/blob/main/pari/sqrtm1.9383761_digit.largest_known_1mod4_prime.gp

Nice, computing sum of squares for that huge prime from sqrtm1 in 2.9s only,
and computing sqrtm1 given x and y in 4.2s:

hermann@7600x:~/RSA_numbers_factored/pari$ gp -q
? \r sqrtm1.9383761_digit.largest_known_1mod4_prime
9383761-digit prime p (31172179 bits)
[M,V] = halfgcd(sqrtm1, p)
  ***   last result computed in 2,854 ms.
[x,y] = [V[2], M[2,1]]
  ***   last result computed in 1 ms.
sqrtm1 = lift(Mod(x/y, p))
  ***   last result computed in 5,929 ms.
sqrtm1 = lift(x*Mod(1/y, p))
  ***   last result computed in 4,531 ms.
sqrtm1 = lift(Mod(x, p)/y)
  ***   last result computed in 4,203 ms.
done, all asserts OK
?


Reards,

Hermann.