Parallel programming

These function are only available if PARI was configured using Configure --mt = .... Two multithread interfaces are supported:

* POSIX threads

* Message passing interface (MPI)

As a rule, POSIX threads are well-suited for single systems, while MPI is used by most clusters. However the parallel GP interface does not depend on the chosen multithread interface: a properly written GP program will work identically with both.


parapply(f, x)

Parallel evaluation of f on the elements of x. The function f must not access global variables or variables declared with local(), and must be free of side effects.

  parapply(factor,[2^256 + 1, 2^193 - 1])

factors 2256 + 1 and 2193 - 1 in parallel.

  {
    my(E = ellinit([1,3]), V = vector(12,i,randomprime(2^200)));
    parapply(p->ellcard(E,p), V)
  }

computes the order of E(𝔽p) for 12 random primes of 200 bits.

The library syntax is GEN parapply(GEN f, GEN x).


pareval(x)

Parallel evaluation of the elements of x, where x is a vector of closures. The closures must be of arity 0, must not access global variables or variables declared with local and must be free of side effects.

Here is an artificial example explaining the MOV attack on the elliptic discrete log problem (by reducing it to a standard discrete log over a finite field):

  {
    my(q = 2^30 + 3, m = 40 * q, p = 1 + m^2); \\ p, q are primes
    my(E = ellinit([0,0,0,1,0] * Mod(1,p)));
    my([P, Q] = ellgenerators(E));
    \\ E(Fp) ~ Z/m P + Z/m Q and the order of the
    \\ Weil pairing <P,Q> in (Z/p)* is m
    my(F = [m,factor(m)], e = random(m), R, wR, wQ);
    R = ellpow(E, Q, e);
    wR = ellweilpairing(E,P,R,m);
    wQ = ellweilpairing(E,P,Q,m); \\ wR = wQ^e
    pareval([()->znlog(wR,wQ,F), ()->elllog(E,R,Q), ()->e])
  }

Note the use of my to pass "arguments" to the functions we need to evaluate while satisfying the listed requirements: closures of arity 0 and no global variables (another possibility would be to use export). As a result, the final three statements satisfy all the listed requirements and are run in parallel. (Which is silly for this computation but illustrates the use of pareval.) The function parfor is more powerful but harder to use.

The library syntax is GEN pareval(GEN x).


parfor(i = a, {b}, expr1, {r}, {expr2})

Evaluates in parallel the expression expr1 in the formal argument i running from a to b. If b is set to +oo, the loop runs indefinitely. If r and expr2 are present, the expression expr2 in the formal variables r and i is evaluated with r running through all the different results obtained for expr1 and i takes the corresponding argument.

The computations of expr1 are started in increasing order of i; otherwise said, the computation for i = c is started after those for i = 1,..., c-1 have been started, but before the computation for i = c+1 is started. Notice that the order of completion, that is, the order in which the different r become available, may be different; expr2 is evaluated sequentially on each r as it appears.

The following example computes the sum of the squares of the integers from 1 to 10 by computing the squares in parallel and is equivalent to parsum (i = 1, 10, i^2):

  ? s=0;
  ? parfor (i=1, 10, i^2, r, s=s+r)
  ? s
  %3 = 385

More precisely, apart from a potentially different order of evaluation due to the parallelism, the line containing parfor is equivalent to

  ? my (r); for (i=1, 10, r=i^2; s=s+r)

The sequentiality of the evaluation of expr2 ensures that the variable s is not modified concurrently by two different additions, although the order in which the terms are added is nondeterministic.

It is allowed for expr2 to exit the loop using break/next/return. If that happens for i = c, then the evaluation of expr1 and expr2 is continued for all values i < c, and the return value is the one obtained for the smallest i causing an interruption in expr2 (it may be undefined if this is a break/next). In that case, using side-effects in expr2 may lead to undefined behavior, as the exact number of values of i for which it is executed is nondeterministic. The following example computes nextprime(1000) in parallel:

  ? parfor (i=1000, , isprime (i), r, if (r, return (i)))
  %1 = 1009


parforeach(V, x, expr1, {r}, {expr2})

Evaluates in parallel the expression expr1 in the formal argument x, where x runs through all components of V. If r and expr2 are present, evaluate sequentially the expression expr2, in which the formal variables x and r are replaced by the successive arguments and corresponding values. The sequential evaluation ordering is not specified:

  ? parforeach([50..100], x,isprime(x), r, if(r,print(x)))
  53
  67
  71
  79
  83
  89
  97
  73
  59
  61


parforprime(p = a, {b}, expr1, {r}, {expr2})

Behaves exactly as parfor, but loops only over prime values p. Precisely, the functions evaluates in parallel the expression expr1 in the formal argument p running through the primes from a to b. If b is set to +oo, the loop runs indefinitely. If r and expr2 are present, the expression expr2 in the formal variables r and p is evaluated with r running through all the different results obtained for expr1 and p takes the corresponding argument.

It is allowed fo expr2 to exit the loop using break/next/return; see the remarks in the documentation of parfor for details.


parforprimestep(p = a, {b}, q, expr1, {r}, {expr2})

Behaves exactly as parfor, but loops only over prime values p in an arithmetic progression Precisely, the functions evaluates in parallel the expression expr1 in the formal argument p running through the primes from a to b in an arithmetic progression of the form a + k q. (p = a (mod q)) or an intmod Mod(c,N). If b is set to +oo, the loop runs indefinitely. If r and expr2 are present, the expression expr2 in the formal variables r and p is evaluated with r running through all the different results obtained for expr1 and p takes the corresponding argument.

It is allowed fo expr2 to exit the loop using break/next/return; see the remarks in the documentation of parfor for details.


parforstep(i = a, {b}, s, expr1, {r}, {expr2})

Evaluates in parallel the expression expr1 in the formal argument i running from a to b in steps of s (can be a positive real number, an intmod for an arithmetic progression, or finally a vector of steps, see forstep). If r and expr2 are present, the expression expr2 in the formal variables r and i is evaluated with r running through all the different results obtained for expr1 and i takes the corresponding argument.

  ? parforstep(i=3,8,2,2*i,x,print([i,x]))
  [3, 6]
  [5, 10]
  [7, 14]
  ? parforstep(i=3,8,Mod(1,3),2*i,x,print([i,x]))
  [4, 8]
  [7, 14]
  ? parforstep(i=3,10,[1,3],2*i,x,print([i,x]))
  [3, 6]
  [4, 8]
  [7, 14]
  [8, 16]

The library syntax is void parforstep0(GEN i, GEN b = NULL, GEN s, GEN expr1, GEN r = NULL).


parforvec(X = v, expr1, {j}, {expr2}, {flag})

Evaluates the sequence expr2 (dependent on X and j) for X as generated by forvec, in random order, computed in parallel. Substitute for j the value of expr1 (dependent on X).

It is allowed fo expr2 to exit the loop using break/next/return, however in that case, expr2 will still be evaluated for all remaining value of p less than the current one, unless a subsequent break/next/return happens.


parselect(f, A, {flag = 0})

Selects elements of A according to the selection function f, done in parallel. If flag is 1, return the indices of those elements (indirect selection) The function f must not access global variables or variables declared with local(), and must be free of side effects.

The library syntax is GEN parselect(GEN f, GEN A, long flag).


parsum(i = a, b, expr)

Sum of expression expr, the formal parameter going from a to b, evaluated in parallel in random order. The expression expr must not access global variables or variables declared with local(), and must be free of side effects.

  ? parsum(i=1,1000,ispseudoprime(2^prime(i)-1))
  cpu time = 1min, 26,776 ms, real time = 5,854 ms.
  %1 = 20

returns the number of prime numbers among the first 1000 Mersenne numbers. Note. This function is only useful when summing entries that are too large to fit in memory simultaneously. To sum a small number of values, using vecsum(parvector()) is likely to be more efficient; the sumation order also becomes deterministic.


parvector(N, i, expr)

As vector(N,i,expr) but the evaluations of expr are done in parallel. The expression expr must not access global variables or variables declared with local(), and must be free of side effects.

  parvector(10,i,quadclassunit(2^(100+i)+1).no)

computes the class numbers in parallel.