
Thread: LAM: Problem with LSDyna and LAM




LAM: Problem with LSDyna and LAM
2008-03-29 12:17:32
Hello
 
I have a problem running LSDyna with LAM-MPI 7.0.3. I am using precompiled LSDyna binaries built for LAM 7.0.3. When I run the job on just one node, it runs fine. But if I run the job over the network on 2 machines, it fails with the following error:

"It seems that [at least] one of the processes that was started with mpirun did not invoke MPI_INIT before quitting
  (it is possible that more than one process did not invoke MPI_INIT -- mpirun was only notified of the first one, which was on node n0)"

Could you please let me know what the problem is?
 
Thanks in advance,
Regards,
Jigar 


Re: LAM: Problem with LSDyna and LAM
2008-03-29 13:05:00
Hello Jigar

- Does each cluster node have one or two network interfaces?

Keep in mind that LAM/MPI on both nodes must agree on which
interface to use for the connection.

Translated: the hostnames in the lamhosts file must resolve
            to the same network!

Are the names in your hostfile generated from the exec host
list of a batch system like PBS / SGE / LSF?
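
One way to check the "same network" point is sketched below. This is only a rough sketch under assumptions that are not from this thread: it uses getent for name lookups, treats the first field of each non-comment line in the lamhosts file as the hostname, and compares only the first three octets of each address (a crude /24 test). The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: getent lookups and the /24 comparison are assumptions.

# Print the hostname (first field) of every non-comment line of a boot schema.
hosts_from_lamhosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# Print the first three octets of an IPv4 address (a crude /24 network id).
subnet24() {
    echo "$1" | cut -d. -f1-3
}

# Resolve each host locally; succeed only if every host resolves and all
# resolved addresses share the same first three octets.
check_same_network() {
    nets=""
    for h in $(hosts_from_lamhosts "$1"); do
        addr=$(getent hosts "$h" | awk '{ print $1; exit }')
        [ -n "$addr" ] || { echo "$h does not resolve" >&2; return 1; }
        echo "$h -> $addr"
        nets="$nets $(subnet24 "$addr")"
    done
    [ "$(printf '%s\n' $nets | sort -u | wc -l | tr -d ' ')" -eq 1 ]
}

# Usage: check_same_network lamhosts || echo "hosts span several networks"
```

Run it on every node and compare the printed addresses: if a name resolves to a different address on one of the nodes, the nodes do not agree on the interface.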


- 2nd trap: in case you do not have an NFS-shared directory
            as the common working directory for the calculation,
            the following recipe will make it easier to find
            the real problem:

mkdir -p /scratch/mydynajob on both nodes!
Copy all input, including the lamhosts file, to both nodes.
Then start the job.
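
The recipe above could be scripted roughly as follows. The node names node1/node2, the input deck job.k and the use of ssh/scp are placeholders and assumptions, not details from this thread. The function only prints the commands, so they can be inspected before anything is run.

```shell
#!/bin/sh
# Sketch only: node names, input file name and ssh/scp access are assumptions.
NODES="${NODES:-node1 node2}"          # hypothetical host names
JOBDIR="${JOBDIR:-/scratch/mydynajob}" # directory from the recipe
INPUT="${INPUT:-job.k}"                # hypothetical LS-DYNA input deck

# Print, for each node, the commands that create the job directory and
# copy the input deck and the lamhosts file into it.
stage_commands() {
    for n in $NODES; do
        echo "ssh $n mkdir -p $JOBDIR"
        echo "scp $INPUT lamhosts $n:$JOBDIR/"
    done
}

# Show the plan; nothing is executed here.
stage_commands
```

Once the printed commands look right, "stage_commands | sh" executes them. After that, change into the job directory on the first node, boot LAM with "lamboot -v lamhosts" and start the solver with mpirun from there.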

dyna + lam7.0.3 works perfectly well and easily ...
so your problem is probably a network / routing problem,
with the two processes not talking over the same interface.

If problems remain, also verify that the 2-CPU / single-node
job runs on each of the two nodes, so that both nodes are
per se configured OK.
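
A minimal sketch of that single-node check, to be repeated on each node in turn. The solver binary name ./mpp-dyna and the input deck job.k are placeholders, and the one-line boot schema is an assumption about the setup. As before, the commands are only printed.

```shell
#!/bin/sh
# Sketch only: solver binary and input deck names are placeholders.
SOLVER="${SOLVER:-./mpp-dyna}"  # hypothetical name of the LS-DYNA MPP binary
INPUT="${INPUT:-job.k}"         # hypothetical input deck

# Write a one-host boot schema so both processes stay on the local machine.
write_local_schema() {
    printf 'localhost cpu=2\n' > "$1"
}

# Print the commands for a local 2-process run against that schema.
local_run_commands() {
    echo "lamboot -v $1"
    echo "mpirun -np 2 $SOLVER i=$INPUT"
    echo "lamhalt"
}

# Usage: write_local_schema lamhosts.local && local_run_commands lamhosts.local
```

If this local run fails on one of the nodes, the problem lies in that node's installation and not in the network between the nodes.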

hth
Micha








_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/
