Thread: LAM: lamexec and lam-7.1.2-8




LAM: lamexec and lam-7.1.2-8
2007-09-07 14:27:09
Hello,
    We have a group of users that uses "lamexec" along with an application
schema file to execute MPMD jobs. Since we upgraded to "lam-7.1.2.8",
"lamexec" fails with the messages shown below. Interesting, the users guide
for "lam 7.1.2" has the following text:

 "The lamexec command is similar to mpirun but is used for non-MPI programs"

 So, the question is this: "What version(s) of LAM support "lamexec"?  Our
 earlier version of LAM, "lam-6.5.8-4", worked just fine using "lamexec".

         Thanks,
          Pat O'Bryant



Code that generated Error Messages
*********************************************
.......
lamboot -v /tmp/lam_boot.$PBS_JOBID
lamexec -w -v schema1

Error Messages
*********************
n-1<24782> ssi:boot:base:linear: booting n0 (xxxxxxxxxx)
n-1<24782> ssi:boot:base:linear: booting n1 (yyyyyyyyy)
n-1<24782> ssi:boot:base:linear: finished
-----------------------------------------------------------------------------
It seems that [at least] one of the processes that was started with
mpirun did not invoke MPI_INIT before quitting (it is possible that
more than one process did not invoke MPI_INIT -- mpirun was only
notified of the first one, which was on node n0).

mpirun can *only* be used with MPI programs (i.e., programs that
invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
to run non-MPI programs over the lambooted nodes.
-----------------------------------------------------------------------------


J.W. (Pat) O'Bryant,Jr.
Business Line Infrastructure
Technical Systems, HPC
Office: 713-431-7022

_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/

Re: LAM: lamexec and lam-7.1.2-8
2007-09-11 07:51:02
On Sep 7, 2007, at 3:27 PM, pat.o'bryantexxonmobil.com wrote:

>     We have a group of users that uses "lamexec" along with an application
> schema file to execute MPMD jobs. Since we upgraded to "lam-7.1.2.8",
> "lamexec" fails with the messages shown below. Interesting, the users guide
> for "lam 7.1.2" has the following text:
>
>  "The lamexec command is similar to mpirun but is used for non-MPI programs"
>
>  So, the question is this: "What version(s) of LAM support "lamexec"?  Our
>  earlier version of LAM, "lam-6.5.8-4", worked just fine using "lamexec".

Well that's quite odd -- lamexec should work for *all* versions of LAM.

> Code that generated Error Messages
> *********************************************
> .......
> lamboot -v /tmp/lam_boot.$PBS_JOBID

Note that in the 7.x series, you shouldn't need the boot schema file
(indeed, it's ignored).  LAM will directly obtain the list of hosts
to use from PBS/Torque.
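
As a rough illustration of that point, a job script could choose its lamboot command depending on whether it is running inside a PBS/Torque job. This is only a sketch: it assumes the LAM installation was built with PBS/Torque support, `choose_boot_cmd` is a hypothetical helper, and the host file path is a placeholder.

```shell
#!/bin/sh
# Sketch only: pick the lamboot invocation depending on whether we are
# inside a PBS/Torque job. choose_boot_cmd is a hypothetical helper.

choose_boot_cmd() {
    # $1: boot schema (host file) to use when NOT running under PBS/Torque
    if [ -n "${PBS_JOBID:-}" ]; then
        # Inside a PBS/Torque job: no boot schema, LAM asks the batch system.
        echo "lamboot -v"
    else
        # Outside a batch job: pass the boot schema explicitly.
        echo "lamboot -v $1"
    fi
}

# Placeholder path, for illustration only.
choose_boot_cmd "/tmp/lam_boot.hosts"
```

The function only prints the command it would use, so it can be checked without a running LAM universe.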

> lamexec -w -v schema1

What are the contents of the schema1 file?
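
Since the banner in the error output is mpirun's text even though the command was lamexec, two quick checks might be worth doing before digging further: which binaries the job's PATH resolves to, and whether the schema itself mentions mpirun. The sketch below is not a diagnosis; `schema_mentions_mpirun` is a hypothetical helper, and it assumes the schema file is named `schema1` in the current directory.

```shell
#!/bin/sh
# Sketch only: two quick sanity checks.
# schema_mentions_mpirun is a hypothetical helper, not part of LAM.

schema_mentions_mpirun() {
    # $1: application schema file; succeeds if any line mentions mpirun
    grep -q 'mpirun' "$1"
}

# Show which lamexec and mpirun the current PATH resolves to (if any).
for tool in lamexec mpirun; do
    command -v "$tool" || echo "$tool: not found in PATH"
done

# Assumes the schema is called schema1 and lives in the current directory.
if [ -f schema1 ] && schema_mentions_mpirun schema1; then
    echo "schema1 refers to mpirun"
fi
```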

FWIW: I just downloaded and installed LAM 7.1.4 and lamexec seemed to
work for me:

-----
[5:49] svbu-mpi:/home/jsquyres/lam-7.1.4 % lamboot

LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University

[5:49] svbu-mpi:/home/jsquyres/lam-7.1.4 % lamexec N hostname
svbu-mpi.cisco.com
[5:49] svbu-mpi:/home/jsquyres/lam-7.1.4 % cat schema
N hostname
[5:50] svbu-mpi:/home/jsquyres/lam-7.1.4 % lamexec -w -v schema
6442 hostname running on n0 (o)
svbu-mpi.cisco.com
[5:50] svbu-mpi:/home/jsquyres/lam-7.1.4 %
-----
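
The session above can be condensed into a small script. This is a minimal sketch, assuming a LAM universe has already been booted with lamboot; the temporary file name and the `write_schema` helper are made up for illustration. In the schema, `N hostname` requests one copy of `hostname` on every booted node.

```shell
#!/bin/sh
# Sketch only: build a one-line application schema and, if LAM is installed,
# run it with lamexec. The file name and helper name are hypothetical.

write_schema() {
    # $1: path of the schema file to create
    # "N hostname": run one copy of `hostname` on every booted node.
    printf '%s\n' 'N hostname' > "$1"
}

schema="${TMPDIR:-/tmp}/schema.$$"
write_schema "$schema"

if command -v lamexec >/dev/null 2>&1; then
    # Same flags as used in this thread; needs a prior successful lamboot.
    lamexec -w -v "$schema"
else
    echo "lamexec not found in PATH; schema written to $schema"
fi
```

If this trivial schema works but schema1 does not, the difference between the two files is the next place to look.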

> Error Messages
> *********************
> n-1<24782> ssi:boot:base:linear: booting n0 (xxxxxxxxxx)
> n-1<24782> ssi:boot:base:linear: booting n1 (yyyyyyyyy)
> n-1<24782> ssi:boot:base:linear: finished
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE).  You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
>
>
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022


-- 
Jeff Squyres
Cisco Systems

