I'm experiencing a problem with LAM/MPI. It occurs both with the 7.1.1
package in Debian unstable and with 7.1.3 compiled from source on the
64-bit version of Kubuntu Edgy. The configure switches I use are:
./configure --enable-shared --disable-static --with-modules --with-trillium
Then I'm using MPITB for GNU Octave, compiled against either of the two
versions of LAM/MPI. I run a script that performs a task on increasingly
large data sets, using from 1 to 4 nodes (the cluster is made of 2
machines, each of which has two 64-bit Xeon processors running at
3.6 GHz). The output I get is as follows, with the error at the end.
########################################################################
kernel regression example with several sample sizes
serial/parallel timings
4000 data points and 1 compute nodes: 2.359059
4000 data points and 2 compute nodes: 1.804235
4000 data points and 3 compute nodes: 1.578466
4000 data points and 4 compute nodes: 1.815341
8000 data points and 1 compute nodes: 8.486310
8000 data points and 2 compute nodes: 4.810935
8000 data points and 3 compute nodes: 3.553705
8000 data points and 4 compute nodes: 3.292904
10000 data points and 1 compute nodes: 12.804898
10000 data points and 2 compute nodes: 7.040623
10000 data points and 3 compute nodes: 5.084657
10000 data points and 4 compute nodes: 4.369931
12000 data points and 1 compute nodes: 18.254475
12000 data points and 2 compute nodes: 10.608161
12000 data points and 3 compute nodes: 6.967206
12000 data points and 4 compute nodes: 5.812930
16000 data points and 1 compute nodes: 34.901709
16000 data points and 2 compute nodes: 18.662503
16000 data points and 3 compute nodes: 13.253614
16000 data points and 4 compute nodes: 10.133724
20000 data points and 1 compute nodes: 60.665225
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Intercomm_merge()
Rank (0, MPI_COMM_WORLD): - MPI_Comm_spawn()
Rank (0, MPI_COMM_WORLD): - main()
MPI_Intercomm_merge: internal MPI error: out of descriptors (rank 0, comm 82)
MPI_Intercomm_merge: internal MPI error: out of descriptors (rank 0, MPI_COMM_PARENT)
michael parallelknoppix1:~/Octave/Econometrics/Parallel/kernel$
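As an aside, the speedups implied by the timings above can be worked out
with a few lines of Python. This is only a sketch for reading the numbers:
the script and its names (timings, speedups) are illustrative and are not
part of the MPITB example, and the values are copied by hand from the
output. The 20000-point run is left out because only its 1-node timing
completed.

```python
# Timings in seconds, copied from the output above, keyed by sample size
# and listed for 1, 2, 3 and 4 compute nodes.
timings = {
    4000:  [2.359059, 1.804235, 1.578466, 1.815341],
    8000:  [8.486310, 4.810935, 3.553705, 3.292904],
    10000: [12.804898, 7.040623, 5.084657, 4.369931],
    12000: [18.254475, 10.608161, 6.967206, 5.812930],
    16000: [34.901709, 18.662503, 13.253614, 10.133724],
}


def speedups(times):
    """Return T(1 node) / T(n nodes) for each entry in times."""
    serial = times[0]
    return [serial / t for t in times]


for n, times in timings.items():
    row = "  ".join(f"{s:5.2f}" for s in speedups(times))
    print(f"{n:6d} points: {row}")
```

Speedup here is simply the 1-node time divided by the n-node time for the
same sample size. At 4000 points the 4-node run is slower than the 3-node
run, so the extra node does not pay off for the smallest sample.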
So the script works OK up to a point. This is reproducible: it always
happens when the problem gets large enough. Any ideas what the problem
might be? Thanks,
Michael
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/