List Info

Thread: precomputed modular reductions in GMP?




precomputed modular reductions in GMP?
user name
2006-08-26 22:25:26
Dear GMP developers,

I'm working on a fairly new computer algebra package called
SAGE that  
uses GMP for the underlying arithmetic.

http://modul
ar.math.washington.edu/sage/

We have a class that represents the integers mod N. Whenever
two  
elements of this ring are multiplied, the result must be
reduced mod  
N. Currently this is implemented using mpz_fdiv_q().

However it occurred to me that if we are reducing modulo N
many  
times, perhaps we should do some precomputation to speed up
all the  
subsequent modular reductions. I wrote some code to try this
out (see  
below): it basically precomputes an approximate inverse for
N, and  
then does two multiplications and a few
bitshifts/subtractions to  
extract the remainder.

For very large input (say over 150,000 bits), this speeds up
the  
reductions by a factor of about two. The cross-over point
between  
mpz_fdiv_q() and this algorithm is somewhere around the
10,000 -  
40,000 bit range on my machine. For smaller inputs my
algorithm is  
maybe twice as slow as just using mpz_fdiv_q() every time.

This cross-over point is quite disappointing. I had been
hoping to  
find something more around the 1,000 bit mark.

I haven't been able to push the cross-over point down at
all from  
this, even by diving into the mpn* functions. The difficulty
seems to  
be the following. The algorithm would be most efficient if
it could  
extract the high N bits of an NxN multiplication, and the
low N bits  
of an NxN multiplication. When N is very large, it's just
as fast to  
compute all the digits as it is to compute half the digits.
But for N  
small, when GMP is using basecase multiplication, it should
be about  
twice as fast to compute half the digits.

I'm wondering if there is any way to do these
"extract-half-the- 
answer" multiplications in GMP? Or are there any plans
to implement it?

Or even better:

Are there any plans to add to GMP an interface for doing
modular  
arithmetic with respect to a given base, where some kind of 

precomputation is performed to speed up the reductions, like
I  
suggest above? If there aren't any such plans, pretty
pretty please  
add this to the to-do list .... 

Thanks



/* Given A with N bits (i.e. 2^(N-1) <= A < 2^N),
computes Z such
that 2^(2N+1) = AZ + B, where 0 <= B < A. So Z is an
approximate
inverse for A. */
void prepare_preconditioned_reduction(mpz_t Z, mpz_t A,
unsigned long N)
{
     mpz_set_ui(Z, 1);
     mpz_mul_2exp(Z, Z, 2*N + 1);   /* Z = 2^(2N+1) */
     mpz_fdiv_q(Z, Z, A);
}


/* Computes X mod A, using some precomputed information.

Assumes:
* 2^(N-1) <= A < 2^N
* X is in the range 0 <= X < 2^(2N). For example X
could be the product
   of two integers each bounded by A.
* Z is the value computed by
prepare_preconditioned_reduction.
*/
void preconditioned_reduction(mpz_t output, mpz_t X, mpz_t
A, mpz_t  
Z, unsigned long N)
{
     /*  output = X >> (N-2)  */
     mpz_fdiv_q_2exp(output, X, N-2);

     /*  output = output * Z  */
     mpz_mul(output, output, Z);

     /*  output = output >> (N+3)  */
     mpz_fdiv_q_2exp(output, output, N+3);

     /* output now holds the estimated quotient X / A. It
could
     be too small by at most 1. */

     /*  output = output * A  */
     mpz_mul(output, output, A);

     /*  output = X - output  */
     mpz_sub(output, X, output);

     /* subtract one last multiple of A if necessary */
     if (mpz_cmp(output, A) >= 0)
         mpz_sub(output, output, A);
}




David Harvey


_______________________________________________
gmp-discuss mailing list
gmp-discussswox.com
http://s
wox.com/mailman/listinfo/gmp-discuss
precomputed modular reductions in GMP?
user name
2006-08-28 15:35:05
David Harvey <dmharveymath.harvard.edu>
writes:

  However it occurred to me that if we are reducing modulo N
many  
  times, perhaps we should do some precomputation to speed
up all the  
  subsequent modular reductions. I wrote some code to try
this out (see  
  below): it basically precomputes an approximate inverse
for N, and  
  then does two multiplications and a few
bitshifts/subtractions to  
  extract the remainder.
  
That's one way of doing it.  But using Montgomery
representation
(REDC) is usually better.

You should take a look at the the GMP development pages,
http://swox.com/
gmp/development.html, where we have some code for
inversion (plain truncating and 2-adic).

  For very large input (say over 150,000 bits), this speeds
up the  
  reductions by a factor of about two. The cross-over point
between  
  mpz_fdiv_q() and this algorithm is somewhere around the
10,000 -  
  40,000 bit range on my machine. For smaller inputs my
algorithm is  
  maybe twice as slow as just using mpz_fdiv_q() every time.
  
  This cross-over point is quite disappointing. I had been
hoping to  
  find something more around the 1,000 bit mark.
  
With plain "truncating" inverse and Barrett's
algorithm, that
might be difficult to achieve.

Even using REDC + Barrett will probably not help at 1000
bits.

  I haven't been able to push the cross-over point down at
all from  
  this, even by diving into the mpn* functions. The
difficulty seems to  
  be the following. The algorithm would be most efficient if
it could  
  extract the high N bits of an NxN multiplication, and the
low N bits  
  of an NxN multiplication. When N is very large, it's just
as fast to  
  compute all the digits as it is to compute half the
digits. But for N  
  small, when GMP is using basecase multiplication, it
should be about  
  twice as fast to compute half the digits.
  
On the money.  You should look for references to
"Mulder's
algorithm" for computing short products.

  I'm wondering if there is any way to do these
"extract-half-the- 
  answer" multiplications in GMP? Or are there any
plans to implement it?
  
In current GMP. there is an *internal* mpn function called
mpn_mullow_n.  The present code doesn't use Mulder's
algorithm, as the
cutoff point needs more thought than we had time for.

There isn't any corresponding mpn_mulhigh_n.  Here, one
probably wants
several functions, one returning the exact answer, and one
which is
allowed to return a too low, but bounded, value. 
mpn_mulhigh_n might
actually want to be mpn_mulhigh_mn, ie alowing operands of
different
sizes.

GMP 5 should have all these functions, but the exact
interface is not
known.  And mpn_mullow_n will probably be renamed.

You're welcome to contribute in this area, but please start
with what
is already at the GMP development pages.

  Are there any plans to add to GMP an interface for doing
modular  
  arithmetic with respect to a given base, where some kind
of  
  precomputation is performed to speed up the reductions,
like I  
  suggest above? If there aren't any such plans, pretty
pretty please  
  add this to the to-do list .... 
  
Plans, but vague.

There will be functions at the mpn level with all needed
support, but
not sure about any higher level stuff.

-- 
Torbjörn
_______________________________________________
gmp-discuss mailing list
gmp-discussswox.com
http://s
wox.com/mailman/listinfo/gmp-discuss
precomputed modular reductions in GMP?
user name
2006-08-28 19:54:53
On Aug 28, 2006, at 11:35 AM, Torbjorn Granlund wrote:

> You're welcome to contribute in this area, but please
start with what
> is already at the GMP development pages.

Wow.... after following up some of the links and algorithms
you  
mentioned, I can see that an awful lot of clever people have
thought  
about this question a LOT.

I would love to be able to contribute, but time is a bit
short right  
now. I'll get back to you if that changes.


>>   Are there any plans to add to GMP an interface
for doing modular
>>   arithmetic with respect to a given base, where
some kind of
>>   precomputation is performed to speed up the
reductions, like I
>>   suggest above? If there aren't any such plans,
pretty pretty please
>>   add this to the to-do list .... 
>
> Plans, but vague.
>
> There will be functions at the mpn level with all
needed support, but
> not sure about any higher level stuff.

I did notice the following thread from about two years ago:

http://swox.com/list-archives/gmp-devel/2004-Fe
bruary/000382.html

Did these "mpz_prediv_t" structs get any
further?

It does seem sensible for GMP to include some kind of
higher-level  
interface like this. I for one would be extremely happy if
an  
interface was available in GMP 5, even if the code backing
it was not  
that fantastic yet. It sounds like you guys have already
thought  
quite a lot about the basic algorithms that could be used,
and it's  
just a question of implementing and tuning them. In other
words, it  
sounds like you already know enough about the structural
requirements  
to design a good interface.

I think for now what I will do in SAGE is ensure that the
classes/ 
interfaces we design are compatible with introducing some  
precomputation later on. I might throw in my simple version
for very  
large integers just to get us started, since I've already
got the  
code written, and then if something better comes along I can
throw  
that in too.

We at SAGE are big fans of GMP. Keep up the good work!

David

_______________________________________________
gmp-discuss mailing list
gmp-discussswox.com
http://s
wox.com/mailman/listinfo/gmp-discuss
[1-3]

about | contact  Other archives ( Real Estate discussion Medical topics )