|
List Info
Thread: precomputed modular reductions in GMP?
|
|
| precomputed modular reductions in GMP? |

|
2006-08-26 22:25:26 |
Dear GMP developers,
I'm working on a fairly new computer algebra package called
SAGE that
uses GMP for the underlying arithmetic.
http://modul
ar.math.washington.edu/sage/
We have a class that represents the integers mod N. Whenever
two
elements of this ring are multiplied, the result must be
reduced mod
N. Currently this is implemented using mpz_fdiv_q().
However it occurred to me that if we are reducing modulo N
many
times, perhaps we should do some precomputation to speed up
all the
subsequent modular reductions. I wrote some code to try this
out (see
below): it basically precomputes an approximate inverse for
N, and
then does two multiplications and a few
bitshifts/subtractions to
extract the remainder.
For very large input (say over 150,000 bits), this speeds up
the
reductions by a factor of about two. The cross-over point
between
mpz_fdiv_q() and this algorithm is somewhere around the
10,000 -
40,000 bit range on my machine. For smaller inputs my
algorithm is
maybe twice as slow as just using mpz_fdiv_q() every time.
This cross-over point is quite disappointing. I had been
hoping to
find something more around the 1,000 bit mark.
I haven't been able to push the cross-over point down at
all from
this, even by diving into the mpn* functions. The difficulty
seems to
be the following. The algorithm would be most efficient if
it could
extract the high N bits of an NxN multiplication, and the
low N bits
of an NxN multiplication. When N is very large, it's just
as fast to
compute all the digits as it is to compute half the digits.
But for N
small, when GMP is using basecase multiplication, it should
be about
twice as fast to compute half the digits.
I'm wondering if there is any way to do these
"extract-half-the-
answer" multiplications in GMP? Or are there any plans
to implement it?
Or even better:
Are there any plans to add to GMP an interface for doing
modular
arithmetic with respect to a given base, where some kind of
precomputation is performed to speed up the reductions, like
I
suggest above? If there aren't any such plans, pretty
pretty please
add this to the to-do list ....
Thanks
/* Given A with N bits (i.e. 2^(N-1) <= A < 2^N),
computes Z such
that 2^(2N+1) = AZ + B, where 0 <= B < A. So Z is an
approximate
inverse for A. */
void prepare_preconditioned_reduction(mpz_t Z, mpz_t A,
unsigned long N)
{
mpz_set_ui(Z, 1);
mpz_mul_2exp(Z, Z, 2*N + 1); /* Z = 2^(2N+1) */
mpz_fdiv_q(Z, Z, A);
}
/* Computes X mod A, using some precomputed information.
Assumes:
* 2^(N-1) <= A < 2^N
* X is in the range 0 <= X < 2^(2N). For example X
could be the product
of two integers each bounded by A.
* Z is the value computed by
prepare_preconditioned_reduction.
*/
void preconditioned_reduction(mpz_t output, mpz_t X, mpz_t
A, mpz_t
Z, unsigned long N)
{
/* output = X >> (N-2) */
mpz_fdiv_q_2exp(output, X, N-2);
/* output = output * Z */
mpz_mul(output, output, Z);
/* output = output >> (N+3) */
mpz_fdiv_q_2exp(output, output, N+3);
/* output now holds the estimated quotient X / A. It
could
be too small by at most 1. */
/* output = output * A */
mpz_mul(output, output, A);
/* output = X - output */
mpz_sub(output, X, output);
/* subtract one last multiple of A if necessary */
if (mpz_cmp(output, A) >= 0)
mpz_sub(output, output, A);
}
David Harvey
_______________________________________________
gmp-discuss mailing list
gmp-discuss swox.com
http://s
wox.com/mailman/listinfo/gmp-discuss
|
|
| precomputed modular reductions in GMP? |

|
2006-08-28 15:35:05 |
David Harvey <dmharvey math.harvard.edu>
writes:
However it occurred to me that if we are reducing modulo N
many
times, perhaps we should do some precomputation to speed
up all the
subsequent modular reductions. I wrote some code to try
this out (see
below): it basically precomputes an approximate inverse
for N, and
then does two multiplications and a few
bitshifts/subtractions to
extract the remainder.
That's one way of doing it. But using Montgomery
representation
(REDC) is usually better.
You should take a look at the the GMP development pages,
http://swox.com/
gmp/development.html, where we have some code for
inversion (plain truncating and 2-adic).
For very large input (say over 150,000 bits), this speeds
up the
reductions by a factor of about two. The cross-over point
between
mpz_fdiv_q() and this algorithm is somewhere around the
10,000 -
40,000 bit range on my machine. For smaller inputs my
algorithm is
maybe twice as slow as just using mpz_fdiv_q() every time.
This cross-over point is quite disappointing. I had been
hoping to
find something more around the 1,000 bit mark.
With plain "truncating" inverse and Barrett's
algorithm, that
might be difficult to achieve.
Even using REDC + Barrett will probably not help at 1000
bits.
I haven't been able to push the cross-over point down at
all from
this, even by diving into the mpn* functions. The
difficulty seems to
be the following. The algorithm would be most efficient if
it could
extract the high N bits of an NxN multiplication, and the
low N bits
of an NxN multiplication. When N is very large, it's just
as fast to
compute all the digits as it is to compute half the
digits. But for N
small, when GMP is using basecase multiplication, it
should be about
twice as fast to compute half the digits.
On the money. You should look for references to
"Mulder's
algorithm" for computing short products.
I'm wondering if there is any way to do these
"extract-half-the-
answer" multiplications in GMP? Or are there any
plans to implement it?
In current GMP. there is an *internal* mpn function called
mpn_mullow_n. The present code doesn't use Mulder's
algorithm, as the
cutoff point needs more thought than we had time for.
There isn't any corresponding mpn_mulhigh_n. Here, one
probably wants
several functions, one returning the exact answer, and one
which is
allowed to return a too low, but bounded, value.
mpn_mulhigh_n might
actually want to be mpn_mulhigh_mn, ie alowing operands of
different
sizes.
GMP 5 should have all these functions, but the exact
interface is not
known. And mpn_mullow_n will probably be renamed.
You're welcome to contribute in this area, but please start
with what
is already at the GMP development pages.
Are there any plans to add to GMP an interface for doing
modular
arithmetic with respect to a given base, where some kind
of
precomputation is performed to speed up the reductions,
like I
suggest above? If there aren't any such plans, pretty
pretty please
add this to the to-do list ....
Plans, but vague.
There will be functions at the mpn level with all needed
support, but
not sure about any higher level stuff.
--
Torbjörn
_______________________________________________
gmp-discuss mailing list
gmp-discuss swox.com
http://s
wox.com/mailman/listinfo/gmp-discuss
|
|
| precomputed modular reductions in GMP? |

|
2006-08-28 19:54:53 |
On Aug 28, 2006, at 11:35 AM, Torbjorn Granlund wrote:
> You're welcome to contribute in this area, but please
start with what
> is already at the GMP development pages.
Wow.... after following up some of the links and algorithms
you
mentioned, I can see that an awful lot of clever people have
thought
about this question a LOT.
I would love to be able to contribute, but time is a bit
short right
now. I'll get back to you if that changes.
>> Are there any plans to add to GMP an interface
for doing modular
>> arithmetic with respect to a given base, where
some kind of
>> precomputation is performed to speed up the
reductions, like I
>> suggest above? If there aren't any such plans,
pretty pretty please
>> add this to the to-do list ....
>
> Plans, but vague.
>
> There will be functions at the mpn level with all
needed support, but
> not sure about any higher level stuff.
I did notice the following thread from about two years ago:
http://swox.com/list-archives/gmp-devel/2004-Fe
bruary/000382.html
Did these "mpz_prediv_t" structs get any
further?
It does seem sensible for GMP to include some kind of
higher-level
interface like this. I for one would be extremely happy if
an
interface was available in GMP 5, even if the code backing
it was not
that fantastic yet. It sounds like you guys have already
thought
quite a lot about the basic algorithms that could be used,
and it's
just a question of implementing and tuning them. In other
words, it
sounds like you already know enough about the structural
requirements
to design a good interface.
I think for now what I will do in SAGE is ensure that the
classes/
interfaces we design are compatible with introducing some
precomputation later on. I might throw in my simple version
for very
large integers just to get us started, since I've already
got the
code written, and then if something better comes along I can
throw
that in too.
We at SAGE are big fans of GMP. Keep up the good work!
David
_______________________________________________
gmp-discuss mailing list
gmp-discuss swox.com
http://s
wox.com/mailman/listinfo/gmp-discuss
|
|
[1-3]
|
|