hermann on Fri, 30 Jun 2023 15:12:01 +0200
Re: Can PARI code be forced to run in 32MB L3 cache, mostly without RAM?
So does the execution of the PARI code below say that the data cache was accessed 3,372,491,167 times, and that L1 had a cache miss 37,845,884 times?

How can L2 misses and L3 cache misses be measured? How can one see how much RAM access was done?

What about the parisize and parisizemax settings for making PARI code run inside the cache?
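To put the two counters from the question in relation, here is a minimal C++ sketch. It assumes that l2_cache_accesses_from_dc_misses really counts L1 data cache misses that went on to L2, and that all_data_cache_accesses counts every data cache access, as the event names suggest. The function name miss_ratio is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Fraction of data cache accesses that missed L1 and went on to L2.
// Assumption: "misses" and "accesses" are the two perf counters named
// in the question, taken from the same run.
double miss_ratio(std::uint64_t misses, std::uint64_t accesses) {
    assert(accesses > 0);  // a ratio is meaningless without any accesses
    return static_cast<double>(misses) / static_cast<double>(accesses);
}
```

Calling miss_ratio(37845884, 3372491167) gives roughly 0.0112, so under the above assumption about 1.1% of the data cache accesses missed L1. This says nothing yet about L2 or L3 misses, or about RAM traffic.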
hermann@7600x:~/RSA_numbers_factored/c++$ LD_LIBRARY_PATH=/home/hermann/Downloads/pari-2.15.3/GPDIR/lib \
perf stat -e l2_cache_accesses_from_dc_misses,all_data_cache_accesses,cycles,task-clock \
./sqrtm1.smallest_known_1million_digit_prime
a = y^(-1) (mod p) [powm]; a *= x; a %= p
0.195755s
[M,V] = halfgcdii(sqrtm1, p)
0.217144s
[x,y] = [V[2], M[2,1]]
1e-06s
done

 Performance counter stats for './sqrtm1.smallest_known_1million_digit_prime':

        37,845,884      l2_cache_accesses_from_dc_misses #   79.826 M/sec
     3,372,491,167      all_data_cache_accesses          #    7.113 G/sec
     2,572,672,013      cycles                           #    5.426 GHz
            474.10 msec task-clock                       #    0.999 CPUs utilized

       0.474457023 seconds time elapsed

       0.466410000 seconds user
       0.008041000 seconds sys

hermann@7600x:~/RSA_numbers_factored/c++$

Regards,

Hermann.

On 2023-06-29 19:58, hermann@stamm-wilbrandt.de wrote:
There are 31 hits for "cache", but none helped me to answer the subject question:
https://pari.math.u-bordeaux.fr/pub/pari/manuals/2.15.1/libpari.pdf

If it is possible to restrict most of the computation to L3 cache, does one need to restrict parisize and parisizemax to that value as well? Or lower?

My 7600X CPU has these caches:
https://github.com/Hermann-SW/7600X#details-of-pc
L1 Cache 384KB
L2 Cache 6MB
L3 Cache 32MB

Under Linux, how can I tell whether a program runs mostly from cache and not from RAM?

My only C++ PARI code so far:
https://github.com/Hermann-SW/RSA_numbers_factored/blob/main/c%2B%2B/sqrtm1.smallest_known_1million_digit_prime.cc#L68-L93

I could reduce the input prime size down from 1 million digits for testing to fit into L3 cache.

Regards,

Hermann.
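As a rough size estimate for the idea of shrinking the input until it fits into L3, here is a small C++ sketch. It only counts the raw binary size of one operand. The function names are made up for illustration, and reading "32MB" as 32 MiB is an assumption. PARI's temporaries for powm and halfgcd will need more than this, so the result is at best a lower bound on the working set.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Approximate number of bytes needed to store an integer with the given
// number of decimal digits in binary (digits * log2(10) bits, rounded up).
std::size_t bytes_for_decimal_digits(std::size_t digits) {
    const double bits = std::ceil(static_cast<double>(digits) * std::log2(10.0));
    return static_cast<std::size_t>(std::ceil(bits / 8.0));
}

// How many operands of operand_bytes fit into cache_bytes, ignoring
// everything else that competes for the cache (code, stack, temporaries).
std::size_t operands_fitting(std::size_t cache_bytes, std::size_t operand_bytes) {
    assert(operand_bytes > 0);
    return cache_bytes / operand_bytes;
}
```

With these, bytes_for_decimal_digits(1000000) comes out at a little over 415,000 bytes, and operands_fitting(32u * 1024 * 1024, bytes_for_decimal_digits(1000000)) gives 80. So a single 1-million-digit number is already far smaller than L3; whether the whole computation stays in cache depends on how much scratch space PARI uses, which this sketch does not model.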