Thread: GMP on the Cell processor




GMP on the Cell processor
user name
2007-04-17 14:15:26
  Has anyone on this list thought of porting GMP to the Cell processor?
  Would anyone be interested in using GMP if it was on the Cell?

I think it would be tricky to make a port that would do justice to the
Cell.  The parallelism in GMP is fine-grained, at least as long as the
operands are not really huge.

If you want to work on a GMP Cell port, please start by studying how
to write a plain basecase multiplication (mpn_mul_basecase) for the
processor.  The divide-and-conquer structure of most algorithms in GMP
will make use of mpn_mul_basecase.
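
For readers unfamiliar with the term, a basecase multiplication is the
plain schoolbook O(n*m) product of two limb arrays. The following is a
minimal portable sketch in C, not GMP's actual code: the names limb_t,
addmul_1 and mul_basecase and the fixed 32-bit limb size are assumptions
made for illustration. GMP's real routine is typically hand-written
assembly tuned per CPU, which is exactly the part a Cell port would have
to redo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch; GMP's mp_limb_t depends on the platform. */
typedef uint32_t limb_t;

/* rp[0..n-1] += up[0..n-1] * v, returning the carry-out limb. */
limb_t addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* A 32x32 product plus two 32-bit addends always fits in 64 bits. */
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* rp[0..un+vn-1] = up[0..un-1] * vp[0..vn-1], least significant limb first.
   rp must not overlap the inputs. */
void mul_basecase(limb_t *rp, const limb_t *up, size_t un,
                  const limb_t *vp, size_t vn)
{
    assert(un >= 1 && vn >= 1);
    for (size_t i = 0; i < un + vn; i++)
        rp[i] = 0;
    /* One row of the schoolbook product per limb of vp. */
    for (size_t j = 0; j < vn; j++)
        rp[un + j] = addmul_1(rp + j, up, un, vp[j]);
}
```

The inner loop carries a dependency from one limb to the next through
the carry, which is one way to see why the parallelism here is
fine-grained.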

  On a similar note, is there a version of GMP for Altivec, the
  vector extension found on some (but not all) PowerPC CPUs?

GMP 4.2 makes use of Altivec when it is available.  (There are some
flaws in configure that made the Altivec recognition unreliable,
though.)

-- 
Torbjörn
_______________________________________________
gmp-discuss mailing list
gmp-discuss@swox.com
http://swox.com/mailman/listinfo/gmp-discuss

Re: GMP on the Cell processor
user name
2007-04-17 15:16:07
Torbjorn Granlund wrote:
>   On a similar note, is there a version of GMP for Altivec, the
>   vector extension found on some (but not all) PowerPC CPUs?
>
> GMP 4.2 makes use of Altivec when it is available.  (There are some
> flaws in configure that made the Altivec recognition unreliable,
> though.)
>

The Cell is a hybrid of a dual-threaded Power core (in-order) and up to
8 SPEs (also in-order). The great performance for numerical applications
comes from the fact that you can achieve up to 200 GFlops in *single*
precision, and about 1/14 of that in *double* precision, with hand-tuned
code that uses all the SPEs in parallel. The Power core just feeds the
SPEs. The SPEs also supposedly provide integer operations, and because
each SPE has 128 128-bit registers there is certainly potential.

Cell blades are rather costly (a dual-Cell blade from IBM is around
$15K), but you can get a PCI-Express card with a single Cell for around
$4K.

To play around there is also an emulator (free as in beer) and a
toolchain based on gcc (free as in freedom ;) that you can use provided
you have some decent hardware (1.5 GHz PPC, x86 or x86-64, 1+ GB RAM).
The emulator supposedly gives you something that is very close (~2%) to
the physical CPU because it also takes into consideration DMA transfers,
cache misses and all the other fun stuff.

I am currently starting to play around with the emulator for a
numerical project group I am working with, but provided somebody does
start a port I would certainly be willing to test & debug and maybe
code a little. I will also have (limited) access to some real hardware
in the future. Because I also work on other open source projects I do
not have a whole lot of time to do so.

If you own a PS3 you can install Linux and have access to 6 SPEs,
because the hypervisor of the PS3 only gives you access to those 6
while the 7th is reserved for the hypervisor. It is a relatively cheap
way to get one's hands on some real Cell hardware.

Hope that helps,

Michael


Re: GMP on the Cell processor
user name
2007-04-17 15:18:22
On Tue, 2007-04-17 at 20:15, Torbjorn Granlund wrote:
>   Has anyone on this list thought of porting GMP to the Cell processor?
>   Would anyone be interested in using GMP if it was on the Cell?
>
> I think it would be tricky to make a port that would do justice to the
> Cell.  The parallelism in GMP is fine-grained, at least as long as the
> operands are not really huge.

I've been giving it some thought recently, but the pressure of Real Life
(TM) and, especially, Real Work has been such that the thought hasn't
got very far yet.

Your (Torbjorn) analysis is accurate but not the whole story, IMO.

My interest is primarily in integer factorization, several algorithms
for which are trivially parallelizable and computationally demanding (as
opposed to memory-intensive).  ECM is probably the best example but
there are others, including some which are subroutines for other
algorithms such as NFS.  Other algorithms are not entirely trivial to
parallelize but the Chinese Remainder Theorem provides an obvious entry
point into their parallelization.
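
To illustrate the Chinese Remainder Theorem remark: a computation can be
carried out independently modulo several pairwise coprime moduli (for
example, one modulus per SPE) and the partial results recombined
afterwards. The sketch below shows only the recombination step in C. The
function names mod_inverse, crt_pair and crt are hypothetical, and it
assumes small moduli whose product fits in a signed 64-bit integer; a
real implementation would use multi-precision arithmetic instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inverse of a modulo m via the extended Euclidean algorithm.
   Assumes 0 < a < m and gcd(a, m) == 1. */
int64_t mod_inverse(int64_t a, int64_t m)
{
    int64_t old_r = a % m, r = m;
    int64_t old_s = 1, s = 0;
    while (r != 0) {
        int64_t q = old_r / r;
        int64_t t = old_r - q * r; old_r = r; r = t;
        t = old_s - q * s; old_s = s; s = t;
    }
    assert(old_r == 1);            /* a and m must be coprime */
    return ((old_s % m) + m) % m;  /* normalise into [0, m) */
}

/* Solve x = r1 (mod m1), x = r2 (mod m2) for 0 <= x < m1*m2.
   Assumes 0 <= r1 < m1 and gcd(m1, m2) == 1. */
int64_t crt_pair(int64_t r1, int64_t m1, int64_t r2, int64_t m2)
{
    int64_t diff = ((r2 - r1) % m2 + m2) % m2;
    int64_t k = (diff * mod_inverse(m1 % m2, m2)) % m2;
    return r1 + m1 * k;
}

/* Combine n residues res[i] modulo pairwise coprime mod[i], n >= 1. */
int64_t crt(const int64_t *res, const int64_t *mod, size_t n)
{
    int64_t x = res[0] % mod[0], m = mod[0];
    for (size_t i = 1; i < n; i++) {
        x = crt_pair(x, m, res[i], mod[i]);
        m *= mod[i];
    }
    return x;
}
```

For instance, the residues 2, 3, 2 modulo 3, 5, 7 recombine to 23. Only
this final step is sequential; the per-modulus computations that feed it
do not depend on each other.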

Running 7 copies of stage 1 of ECM simultaneously, each with the same N
but different curves, on the SPUs of a PS3 using their local memory is
*very* attractive.  Their second stages, which are very memory-hungry,
would then be farmed out either to the main PPC or to other more
conventional machines.

GMP for the SPU, a stripped down version if necessary to get it to fit
in the limited memory, would be very welcome indeed.


Paul
