Bill Allombert on Fri, 01 Nov 2024 18:22:30 +0100
Re: Why 70% / 51% slowdown of PARI/GP for pthread over single for 7950X / 7600X AMD CPUs?

On Fri, Nov 01, 2024 at 05:49:05PM +0100, hermann@stamm-wilbrandt.de wrote:
> It took a while to determine pthread versus single to cause slower
> computation times on faster AMD 7950X CPU over slower AMD 7600X CPU.
>
> With same /etc/gprc, both on Ubuntu 22.04, GP 2.16.1 alpha:
>
> ________pthread single PARI/GP
> 7950X___0:34:51 0:20:33 1.70
> 7600X___0:36:15 0:24:04 1.51
>
> The h:mm:ss runtimes are for 729 million evaluations of
> Cayley-Menger determinant for computing volume of
> tetrahedron given 6 edge distances:
>
> ? T=128;L=30;forvec (X=[[1,L],[1,L],[1,L],[1,L],[1,L],[1,L]], M=[0,1,1,1,1;1,0,X[1]^2,X[2]^2,X[3]^2;1,X[1]^2,0,X[4]^2,X[5]^2;1,X[2]^2,X[4]^2,0,X[6]^2;1,X[3]^2,X[5]^2,X[6]^2,0];if(matdet(M)==2*T,print(X)));
> [2, 2, 2, 2, 2, 2]
> cpu time = 34min, 50,094 ms, real time = 34min, 50,196 ms.
> ?
>
> 1) What are explanations for the massive slowdowns for pthread?

This is a known issue, caused by the use of thread-local variables.
There are some ways to reduce the slowdown:
- use gp-sta instead of gp-dyn
- use the compiler flag --flto=auto

Alternatively, you can avoid pthread and use MPI, which is more annoying to use
but does not require thread-local variables.

Using this benchmark:

T=128;L=12;forvec (X=[[1,L],[1,L],[1,L],[1,L],[1,L],[1,L]], M=[0,1,1,1,1;1,0,X[1]^2,X[2]^2,X[3]^2;1,X[1]^2,0,X[4]^2,X[5]^2;1,X[2]^2,X[4]^2,0,X[6]^2;1,X[3]^2,X[5]^2,X[6]^2,0];if(matdet(M)==2*T,print(X)));
##

I get:

% Olinux-x86_64/gp-dyn -q < ben
  *** last result computed in 13,326 ms.
% Olinux-x86_64/gp-sta -q < ben
  *** last result computed in 13,144 ms.
% Olinux-x86_64-pthread/gp-dyn -q < ben
  *** last result: cpu time 22,510 ms, real time 22,511 ms.
% Olinux-x86_64-pthread/gp-sta -q < ben
  *** last result: cpu time 13,675 ms, real time 13,677 ms.

Olinux-x86_64-pthread/gp-sta + --flto:
  *** last result: cpu time 13,145 ms, real time 13,147 ms.

Cheers,
Bill.
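
For readers who want to check what each iteration of the benchmark above evaluates, here is a minimal Python sketch of the same test. It is only an illustration: the names cayley_menger, det and search are hypothetical and not part of PARI/GP, the determinant is computed by exact rational elimination rather than by matdet, and nothing here says anything about the timings reported in this thread.

```python
from fractions import Fraction
from itertools import product


def cayley_menger(x):
    """5x5 Cayley-Menger matrix for six edge lengths, laid out as in the GP code."""
    a, b, c, d, e, f = (v * v for v in x)
    return [
        [0, 1, 1, 1, 1],
        [1, 0, a, b, c],
        [1, a, 0, d, e],
        [1, b, d, 0, f],
        [1, c, e, f, 0],
    ]


def det(m):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in m]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def search(L, T):
    """All edge tuples in [1, L]^6 whose determinant equals 2*T."""
    return [x for x in product(range(1, L + 1), repeat=6)
            if det(cayley_menger(x)) == 2 * T]
```

The Cayley-Menger determinant of a tetrahedron equals 288 V^2. For a regular tetrahedron with edge a this is 4 a^6, so a = 2 gives 256 = 2*T with T = 128, which is consistent with [2, 2, 2, 2, 2, 2] being printed in the quoted run.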