Karim BELABAS on Fri, 4 Apr 2003 00:18:34 +0200 (MEST)

 Re: polredabs() again

```On Thu, 3 Apr 2003, Bill Allombert wrote:
> On Wed, Mar 12, 2003 at 03:09:12PM -0500, Igor Schein wrote:
>> Hi,
>>
>> I know there are too many things going on at the same time right now,
>> but I'm still gonna mention this example of polredabs() inefficiency:
>>
>> polredabs(x^8-97418273277417164826810*x^4-97418273278115084139045*x^2+97418273278115084139045);
>>
>> It takes an extremely long time.  Basically, I'd like to see either a
>> quick fix ( if possible ) or an acknowledgement of the issue in the TODO
>> file :).  I know we've discussed this with Karim in private emails,
>> but it never made it into TODO.
>
> [I was expecting Karim to comment but...]

Hiya...

> Well, the underlying algorithm has a high complexity, so it is not clear
> it is a deficiency in PARI, or even that it is fixable, if we insist
> on having the 'absolutely smallest polynomial'.

Let me describe the current algorithm precisely (for future reference, and to
spur ideas :-).

1) Compute a (T2-)LLL-reduced basis for the maximal order
[ fast and reliable nowadays, even compared to the stable version. This is
especially efficient when the field is totally real: in that case, the
T2 matrix is integral, and the Schnorr-Euchner strategy is used on an
_exact_ matrix, at the lowest possible accuracy. So one sees many
"precision increase" messages at \g4, but it's a feature of the algorithm, and
the sign that we are making good progress: whenever precision remains
"stuck", we are in a quandary ]
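
# To make the T2 form in step 1 concrete, here is a toy Python sketch (not
# PARI code: the real routine is C and works with the actual embeddings of
# the field).  For the totally real field Q(sqrt(2)) with basis {1, sqrt(2)},
# the Gram matrix of T2(x) = sum |sigma_i(x)|^2 comes out integral, as
# claimed above for totally real fields.
import math

embeddings = [math.sqrt(2), -math.sqrt(2)]   # the two real embeddings of sqrt(2)

# images of the basis {1, sqrt(2)} under each embedding
basis_images = [[1.0, 1.0], [embeddings[0], embeddings[1]]]

# Gram matrix of T2:  G[i][j] = sum_k sigma_k(b_i) * sigma_k(b_j)
gram = [[sum(basis_images[i][k] * basis_images[j][k] for k in range(2))
         for j in range(2)] for i in range(2)]

print([[round(v) for v in row] for row in gram])   # [[2, 0], [0, 4]]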

2) Enumerate [Fincke-Pohst] all vectors smaller than the _best bound found so
far_ (that's an ellipsoid, for a fixed bound). The latter is defined as
the T2-norm of the smallest element defining the original field
[ checked by explicitly computing the characteristic polynomial C, in
terms of the (known) embeddings, then gcd(C, C'): this is fast, and much
more reliable than checking whether the roots of C are close to one another
(which would be faster) ]
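
# A Python sketch of that suitability test (illustration only; helper names
# are mine, and the real check is C code working from the embeddings): an
# element generates the whole field iff its characteristic polynomial C is
# squarefree, i.e. gcd(C, C') is a nonzero constant.  Polynomials are
# coefficient lists over Q, lowest degree first.
from fractions import Fraction

def deriv(c):
    return [Fraction(i) * c[i] for i in range(1, len(c))]

def trim(c):
    while c and c[-1] == 0:
        c = c[:-1]
    return c

def polrem(a, b):
    # remainder of a divided by b, over Q
    a, b = trim(list(a)), trim(list(b))
    while len(a) >= len(b):
        q, d = a[-1] / b[-1], len(a) - len(b)
        a = trim([a[i] - q * b[i - d] if i >= d else a[i] for i in range(len(a))])
    return a

def polgcd(a, b):
    a, b = trim(list(a)), trim(list(b))
    while b:
        a, b = b, polrem(a, b)
    return a

def generates_field(C):
    C = [Fraction(c) for c in C]
    return len(polgcd(C, deriv(C))) == 1   # gcd is a nonzero constant

print(generates_field([-2, 0, 1]))   # True:  x^2 - 2 is squarefree
print(generates_field([1, -2, 1]))   # False: (x - 1)^2 is not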

The _best bound_ is initialized using the norm of the smallest suitable
LLL-basis vectors [ it's a famous "conjecture" that one of them will
be suitable, cf. basemath/base1.c:1378:

  if (!_polred(x, a, NULL, &chk))
    err(talker, "you found a counter-example to a conjecture, please report!");
]

3) To avoid computing too many characteristic polynomials, this check is
switched off after a number of consecutive failures (currently 5): there are
presumably many subfields, and we expect many further failures. In that
case, the small vectors are cached until a buffer is filled. Then they
are sorted wrt T2-norm, and the characteristic polynomials are computed
(smallest elements first) [ you get the dreaded "sorting..." message at
\g4. If we're unlucky, we compute all the characteristic polynomials, find
nothing, and discard the lot ].

[ the implementation is generic, smallvectors() actually takes as arguments
a lattice, a bound, the size of the small vectors cache, and pointers to
"check routines" which define what is a "suitable vector" ]
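
# A minimal Python sketch of that generic interface (hypothetical names; the
# real smallvectors() is C and does proper Fincke-Pohst pruning, whereas this
# sketch brute-forces a small coefficient box): enumerate vectors below the
# bound, cache them up to a buffer size, sort the buffer by norm, and run
# the check routine smallest-first.
import itertools

def qf_eval(gram, v):
    # value of the quadratic form v^t * G * v  (the T2-norm in our setting)
    n = len(v)
    return sum(gram[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

def smallvectors(gram, bound, check, box=3, buffer_size=64):
    # yield (norm, v) for vectors with 0 < norm <= bound passing check(v),
    # processed in buffered batches sorted by increasing norm
    n = len(gram)
    cache = []

    def flush(batch):
        batch.sort()                      # the "sorting..." step at \g4
        for q, v in batch:
            if check(v):
                yield q, v

    for coeffs in itertools.product(range(-box, box + 1), repeat=n):
        q = qf_eval(gram, coeffs)
        if 0 < q <= bound:
            cache.append((q, coeffs))
            if len(cache) == buffer_size:
                yield from flush(cache)
                cache = []
    yield from flush(cache)

# toy run: Z^2 with the identity Gram matrix, "suitable" = even coordinate sum
hits = list(smallvectors([[1, 0], [0, 1]], 2, lambda v: sum(v) % 2 == 0))
print(len(hits), hits[0][0])   # 4 2  (four suitable vectors, all of norm 2)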

4) If we are lucky, the _best bound so far_ decreases fast, and we avoid
most of the enumeration.

5) When computing the initial _best norm_, I compute the number field
generated by the initial LLL-basis vectors [ successive compositum()s,
which is when you get the "subfields" messages. This requires factoring
huge polynomials over Z. That was my original motivation for studying
van Hoeij's algorithm: it is relatively cheap now ].

The enumeration stops when this tells us that the linear span of
remaining basis vectors is contained in a strict subfield. If we're
lucky, this may prune out a huge part of the enumeration tree.

6) I have neglected the very important '16' flag: replace "maximal order"
by "some order which is cheap to compute" in the description above
[ factor the discriminant up to 'primelimit' only: the order is maximal
at all primes up to primelimit, but possibly not at larger primes occurring
to an odd power in the discriminant of the original generating
polynomial (it is in general maximal when the valuation is even, though
without guarantee) ].

Three obvious (trivial) improvements:

*  For some crazy reason, the enumeration backtracking is done in _decreasing_
(lexicographic) order, so the vectors most likely to have a large
norm are considered first. It is easy to change, but I never took the time
to do it [ have a look at the smallvectors() code to understand why ]

*  Order the LLL-basis vectors, so as to maximize the length of the initial
segment of the basis generating a strict subfield as in 5)  [ currently, we
use Fincke-Pohst ordering, which minimizes the number of enumerated vectors
for a _fixed_ bound ]. Doing this efficiently is an interesting problem.

*  Use random samples to try and improve quickly on the _best bound_.

Now the bad news: if there are many subfields, providing lots of small
elements, this will still take a lot of time. It can't be helped.

On the other hand, the current specification of polredabs is quite useless.
There's no application whatsoever for a "polynomial of absolutely smallest
T2-norm". It's not even guaranteed to have minimal discriminant, or to
yield the smallest coefficients. The only use I can see is to provide a
pseudo-canonical representative for the field (this helps table builders:
fewer isomorphism tests...)

In any case, there's an obvious solution: add a further optional argument to
polredabs, to sample only up to a certain recursion depth (e.g. [N, B] would
include at most N vectors in any linear combination, with coefficients < B in
absolute value, or the standard computed bound, whichever is lower), and
return the best polynomial found [ possibly the original polynomial ].
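
# A Python sketch of that proposed [N, B] mode (purely hypothetical, not an
# existing PARI interface): try every combination of at most N basis vectors
# with integer coefficients of absolute value < B, keep the best suitable
# element found, and fall back to the original one otherwise.
import itertools

def norm2(v):
    return sum(x * x for x in v)

def sample_best(basis, N, B, is_suitable, original):
    best, best_q = original, norm2(original)
    n, dim = len(basis), len(basis[0])
    for k in range(1, min(N, n) + 1):
        for idx in itertools.combinations(range(n), k):
            for coeffs in itertools.product(range(-(B - 1), B), repeat=k):
                if all(c == 0 for c in coeffs):
                    continue
                v = [sum(c * basis[i][d] for c, i in zip(coeffs, idx))
                     for d in range(dim)]
                q = norm2(v)
                if q < best_q and is_suitable(v):
                    best, best_q = v, q
    return best

# toy run in Z^2: with N = 2 and B = 2 we find basis[1] - basis[0] = [-1, 0]
best = sample_best([[3, 1], [2, 1]], 2, 2, lambda v: any(v), [3, 1])
print(best, norm2(best))   # [-1, 0] 1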

Then, we'd have a guaranteed time limit on polredabs() operation,
and a guarantee to get a "sensible" polynomial, if not a "canonical" one.

Cheers,

Karim.
--
Karim Belabas                     Tel: (+33) (0)1 69 15 57 48
Dép. de Mathématiques, Bât. 425   Fax: (+33) (0)1 69 15 60 19
Université Paris-Sud              http://www.math.u-psud.fr/~belabas/
F-91405 Orsay (France)            http://www.parigp-home.de/  [PARI/GP]
```