J E Cremona on Wed, 21 Nov 2018 12:10:22 +0100


Re: some very slow mf* functions

Thanks a lot for the detailed, informative and fast reply!

I'll come up with some specific examples in due course.  Some highlights:  I have run all weight 1 spaces for levels up to 4000 (all characters);  some of these take a large percentage of the RAM on a 512GB machine.  The weight 2 trivial character case is in a different set of computations, where I expect to find (when it finishes) a form with inner twist by -7 whose base-change to Q(sqrt(-7)) is the twist of a Bianchi form in my database.  But that is perhaps too mathematical for this mailing list ;)

Since I am already running many level/weight/character spaces in parallel, as many as my machines (and colleagues) will tolerate, using multiple threads is not normally something that would help me.  But in some cases it might, such as when a few isolated hard cases remain.


On Wed, 21 Nov 2018 at 10:50, Karim Belabas <Karim.Belabas@math.u-bordeaux.fr> wrote:
* Karim Belabas [2018-11-21 11:23]:
> * J E Cremona [2018-11-21 10:39]:
> > If I create the cuspidal newspace of level 11025, weight 2 trivial
> > character using Snew=mfinit([11025,2],0) , which has dimension 317, and
> > then compute some Hecke matrices, they take an enormous amount of time:
> > several days for mfheckemat(Snew,29).  Why?
> >
> > My own C++ modular symbols code can compute all Tp for p<100 in a few
> > seconds, admittedly on the whole space (dimension 1684) not the newspace.

> 1) the complexity of mfinit is dominated by the cost of computing the base
> change matrix between Miller's basis and the horrible "random" basis afforded
> by trace forms, which is O~(d^(\omega+1)), with \omega = log_2 7 in PARI's
> implementation.
> 2) the cost of computing T_p is O(d^3 p^{3/2})  [ this is really a 3 here,
> not an \omega ]
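
For reference, the example under discussion can be set up in a gp session as follows (the dimension 317 is taken from the message above; timings are illustrative and depend on the machine and cache policy):

```
\\ cuspidal new subspace of level 11025, weight 2, trivial character
Snew = mfinit([11025,2], 0);
mfdim(Snew)                  \\ 317, as stated above
T29 = mfheckemat(Snew, 29);  \\ very slow with the default cache policy
```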

Addendum intentionally omitted from the first post: the above cost 2) is for a
one-shot computation. We have a caching mechanism that in principle would
allow computing all T_q for q < p within the same time bound, but it is
unfortunately essentially *disabled* by default.

Once a computation is over [ or in the middle of a computation
temporarily interrupted with <C-C>, if you have 'breakloop' enabled ],
try getcache() and check the third column (cache misses). If this
number is huge [ e.g. 10^6 or more ], then the cache policy is
inadequate for your example.
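
Concretely, the diagnostic suggested above looks like this (the interpretation of the third column is as described in the message; the exact row contents depend on the PARI version):

```
\\ after the computation finishes, or during a <C-C> break loop:
? getcache()
\\ one row per internal cache; the third column counts cache misses.
\\ A miss count around 10^6 or more means the default cache policy is
\\ too small for the example at hand.
```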

Compared to early versions of the package, the cache has been tuned down
to use very little memory (a few hundred MB) and, most importantly,
there is no way for the user to change it without editing the default
values in src/basemat/mftrace.c and recompiling.

We used to allow users to change the cache limits, but I was talked out
of it because it was "too complicated/delicate to use properly".
The hardcoded cache policy is suitable for levels up to 2000 or so. It is
completely inadequate for levels above 4000; meaning you can probably
gain a 10-fold factor in timings at the expense of letting the memory
use grow to a few tens of GB...


Karim Belabas, IMB (UMR 5251)  Tel: (+33) (0)5 40 00 26 17
Universite de Bordeaux         Fax: (+33) (0)5 40 00 21 23
351, cours de la Liberation    http://www.math.u-bordeaux.fr/~kbelabas/
Talence (France)       http://pari.math.u-bordeaux.fr/  [PARI/GP]

Prof J E Cremona
Warwick Mathematics Institute
University of Warwick