Bill Allombert on Tue, 10 Jul 2018 23:59:43 +0200
Re: SIGSEGV on isprime
On Tue, Jul 10, 2018 at 06:35:40PM +0200, Ján Jančár wrote:
> Hi all,
>
> While running pari on some grid computing machines I keep encountering
> this mysterious error, which only happens when working on certain
> machines and not on others.
>
> To reproduce, I compiled the following:
>
> on:
> Linux 3.16.0-5-amd64 #1 SMP Debian
> 3.16.51-3+deb8u1+zs2 (2018-01-12) x86_64 GNU/Linux
>
> where it ran just fine.
> However when run (and compiled) on:
> Linux 4.9.0-6-amd64 #1 SMP Debian
> 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux
>
> It SIGSEGVs in isprime():
>
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x00002aaaab1f771c in red_montgomery (T=0x2aaae879e6a0, N=0x2aaae879efe0, inv=2796584439883844767) at ../src/kernel/gmp/mp.c:1013
> > 1013 while (Td < (GEN)av) { t = subllx(*++Td, *++Nd); *Td = t; }
> > (gdb) bt
> > #0 0x00002aaaab1f771c in red_montgomery (T=0x2aaae879e6a0, N=0x2aaae879efe0, inv=2796584439883844767) at ../src/kernel/gmp/mp.c:1013
> > #1 0x00002aaaab33a0ac in _sqr_montred (E=0x2aaae879ef38, x=0x2aaae879e6d0) at ../src/basemath/arith1.c:3311
> > #2 0x00002aaaab33a138 in _mul2_montred (E=0x2aaae879ef38, x=0x2aaae879e6d0) at ../src/basemath/arith1.c:3326
> > #3 0x00002aaaab39d641 in gen_pow_fold_i (x=0x2aaae879eef0, N=0x2aaae879efa0, E=0x2aaae879ef38, sqr=0x2aaaab33a068 <_sqr_montred>, msqr=0x2aaaab33a10d <_mul2_montred>)
> > at ../src/basemath/bb_group.c:254
> > #4 0x00002aaaab33abf3 in Fp_pow (A=0x2aaaabaa3998 <readonly_constants+56>, K=0x2aaae879efa0, N=0x2aaae879efe0) at ../src/basemath/arith1.c:3508
> > #5 0x00002aaaab5ca498 in bad_for_base (S=0x7fffffffe210, a=0x2aaaabaa3998 <readonly_constants+56>) at ../src/basemath/prime.c:95
> > #6 0x00002aaaab5cbaf2 in BPSW_psp (N=0x2aaae879efe0) at ../src/basemath/prime.c:570
> > #7 0x00002aaaab5cca58 in isprime (x=0x2aaae879efe0) at ../src/basemath/prime.c:846
> > #8 0x0000000000400788 in main ()
> > (gdb) info locals
> > __value = 8722076062158158581
> > __arg1 = 144115188075855878
> > __arg2 = 10180160215563133591
> > __temp = 18446744073709551615
> > av = 46913531930272
> > Te = 0x2aaae867e688
> > Td = 0x2aaae867e6a0
> > Ne = 0x2aaae867efe8
> > Nd = 0x2aaae867f000
> > scratch = 0x2aaae867e680
> > i = 2
> > j = 2
> > m = 9020403664262637533
> > t = 8722076062158158581
> > d = 4
> > k = 2
> > carry = 1
> > hiremainder = 4978068440930021014
> > overflow = 1
> > (gdb) info args
> > T = 0x2aaae867e6a0
> > N = 0x2aaae867efe0
> > inv = 2796584439883844767
> > (gdb) quit
>
> I compiled pari 2.9.5 / 2.10.1 / current git master, with
> ./Configure --enable-tls -g
> and the error happens in all of the versions.
>
> Any ideas on what might be causing this? ldd of the binary on both
> machines shows the same libraries are used, so it is very mysterious to
> me that it works on one and not on the other.

Why are you using --enable-tls? Does it make a difference?
Are you using the same compiler? The same processor?

This code has not changed between 2.9.5 and 2.10.1, however it is rather
messy, so maybe it is not compiled correctly.

You can also try
./Configure --kernel=none

Cheers,
Bill