Karim BELABAS on Tue, 1 Oct 2002 17:05:27 +0200 (MEST)

On Thu, 26 Sep 2002, Karim BELABAS wrote:
[...Following up to my own post...]
> On the other hand, currently:
>
> 1) t_REAL are constantly shifted back and forth (on average 0.5 shifts per
> multiplication, 1.25 shifts per addition, 1.5 shifts per subtraction)

Sorry, this is not true: it is ~1.5 shifts per addition/subtraction on average.

> 2) The code in addrr() becomes quite involved (read: inefficient) to prevent
> uncontrollable precision increase like the above. A spurious new word of
> precision is added each time we shift even a single bit (provided the shift
> is not a multiple of the word length). We have to remember this, and possibly
> compensate before returning (if the final normalizing shift offsets the
> initial one). For short inputs (say 2 or 3 words), shifts (and auxiliary
> code) might even dominate the addition time.

3) Switching to base 2^BITS_IN_LONG exponents would enable us to reuse the
integer multiplication code for floating-point operations [ currently, they
are disjoint sets of routines ]. In particular, fast multiplication would at
last be enabled for t_REALs.
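
To make 3) concrete, here is a rough C sketch (purely illustrative, all names
hypothetical, not PARI code) of a float whose exponent counts whole words:
multiplying two of them is just an integer product of the mantissas plus an
addition of exponents, so any integer multiplication routine (schoolbook
below, Karatsuba or FFT just as well) could be plugged in unchanged.

#include <stdint.h>
#include <stdlib.h>

/* Hypothetical word-based float:
 * value = (sum_{i<n} m[i] * 2^(32*i)) * 2^(32*e), i.e. the mantissa is an
 * ordinary multi-word integer and the exponent e counts WHOLE words.
 * 32-bit words are used only to keep the sketch in portable C; real code
 * would use full BITS_IN_LONG words.                                      */
typedef struct { uint32_t *m; long n; long e; } wfloat;

/* Schoolbook integer multiplication on word arrays (least significant word
 * first).  This is exactly the integer kernel: a faster routine could be
 * dropped in here without touching the float layer at all.               */
static void mul_words(const uint32_t *a, long na,
                      const uint32_t *b, long nb, uint32_t *r)
{
    for (long i = 0; i < na + nb; i++) r[i] = 0;
    for (long i = 0; i < na; i++) {
        uint64_t carry = 0;
        for (long j = 0; j < nb; j++) {
            uint64_t t = (uint64_t)a[i] * b[j] + r[i + j] + carry;
            r[i + j] = (uint32_t)t;
            carry = t >> 32;
        }
        r[i + nb] = (uint32_t)carry;
    }
}

/* Multiplication of the floats themselves: exponents add, mantissas are
 * multiplied as plain integers, and since exponents only count whole words
 * no bit-level shifting or rounding bookkeeping is needed anywhere.       */
static wfloat wfloat_mul(const wfloat *x, const wfloat *y)
{
    wfloat z;
    long n = x->n + y->n;
    uint32_t *prod = malloc((size_t)n * sizeof *prod);
    mul_words(x->m, x->n, y->m, y->n, prod);
    if (prod[n - 1] == 0) n--;   /* drop a leading zero word; value unchanged */
    z.m = prod; z.n = n; z.e = x->e + y->e;
    return z;
}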

4) The current representation causes different behaviour on 32-bit and 64-bit
machines: on the latter, shifting (even by 1 bit) introduces 64 extra bits of
(mostly) bogus accuracy, not 32.

Consequence of 4): routines trying to heuristically decide whether something
is non-zero (by comparing bit_accuracy(lg(x)) with the relative exponent of x
compared to the input size) make weird mistakes at low precision. An example
from the 'linear' bench:

matker(matrix(4,4,x,y,sin(x+y)))

had rank 2 on 32-bit machines (correct) and rank 1 (wrong) on 64-bit ones.
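
To illustrate the mechanism behind 4), here is a generic C sketch (not the
actual PARI shifting code) of a right shift on a word-based mantissa: as soon
as the shift is not a multiple of the word size, the spilled bits force a
whole extra word, hence 64 bits of mostly bogus accuracy on 64-bit machines
against 32 on 32-bit ones.

#include <stdint.h>
#include <string.h>

#define BITS_IN_LONG 64   /* the sketch hard-codes 64-bit words */

/* Shift an n-word mantissa (most significant word first) right by s bits
 * into 'out', returning the new length in words.  Whenever s is not a
 * multiple of BITS_IN_LONG, the spilled low bits need a home, so the
 * result grows by one FULL word of (mostly meaningless) trailing bits.   */
static long shift_right(const uint64_t *in, long n, long s, uint64_t *out)
{
    long w = s / BITS_IN_LONG;        /* whole-word part of the shift      */
    long b = s % BITS_IN_LONG;        /* bit part: 0 means nothing spills  */
    long nout = n + w + (b ? 1 : 0);  /* one extra word iff b != 0         */

    memset(out, 0, (size_t)nout * sizeof *out);
    for (long i = 0; i < n; i++) {
        if (b == 0) { out[i + w] = in[i]; continue; }
        out[i + w]     |= in[i] >> b;
        out[i + w + 1] |= in[i] << (BITS_IN_LONG - b);
    }
    return nout;
}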

===========================================================================

The current algorithm for subtraction of real numbers is:

* if exponents differ, shift one of the input t_REALs by the difference. It
now has a certain number of significant bits, split into a number of entire
words, plus an extra word which has 0 <= m < BITS_IN_LONG significant bits.

* at the end, if there were cancellations, normalize the result by shifting it
to the left by a number of bits, which splits into a whole number of words
+ sh bits (0 <= sh < BITS_IN_LONG).

* So the number of significant bits remaining in the extra word we added is
at most m - sh. If this is <= 2, I cancel the extra word and round
appropriately.

The threshold '2' is arbitrary. '0' is completely safe (we cancel nothing but
bogus bits). It could be made 1, 2, 3 ... [ here we do cancel a few correct
bits ].
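
For illustration, the decision above might be sketched as follows (hypothetical
code, assuming the inputs start with a whole number of significant words; this
is not the actual PARI addition/subtraction code):

#define BITS_IN_LONG 64   /* 32 on 32-bit machines */

/* align_shift: bits by which one input was shifted to match exponents;
 * norm_shift : bits by which the result was shifted back when normalizing
 * after cancellation.  Returns 1 if the extra word should be dropped.    */
static int drop_extra_word(long align_shift, long norm_shift)
{
    long m  = align_shift % BITS_IN_LONG; /* significant bits in the extra word */
    long sh = norm_shift  % BITS_IN_LONG; /* bits of it eaten by normalization  */
    long remaining = m - sh;              /* at most this many genuine bits left */

    /* remaining <= 0: the extra word holds only bogus bits, dropping it is
     * always safe (the '0' threshold of the text).  With the threshold at 2
     * we may knowingly discard one or two correct bits as well.            */
    return remaining <= 2;
}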

As it stands, some routines lose precision (a little) faster. But this doesn't
seem to break anything, and it fixes a few problems like the one above.

Of course, modifying the representation of t_REALs (to either allow
specifying the precision in bits, or prevent shifting by a non-integral
number of words) would be a much better fix.

Karim.
--
Karim Belabas                    Tel: (+33) (0)1 69 15 57 48
Dép. de Mathematiques, Bat. 425  Fax: (+33) (0)1 69 15 60 19
Université Paris-Sud             Email: Karim.Belabas@math.u-psud.fr
F-91405 Orsay (France)           http://www.math.u-psud.fr/~belabas/
--