On Sep 7, 2007, at 3:27 PM, pat.o'bryant exxonmobil.com wrote:
> We have a group of users that uses "lamexec" along with an application
> schema file to execute MPMD jobs. Since we upgraded to "lam-7.1.2.8",
> "lamexec" fails with the messages shown below. Interestingly, the user's
> guide for "lam 7.1.2" has the following text:
>
> "The lamexec command is similar to mpirun but is used for non-MPI
> programs"
>
> So, the question is this: What version(s) of LAM support "lamexec"? Our
> earlier version of LAM, "lam-6.5.8-4", worked just fine using "lamexec".
Well, that's quite odd -- lamexec should work for *all* versions of LAM.
> Code that generated Error Messages
> *********************************************
> .......
> lamboot -v /tmp/lam_boot.$PBS_JOBID
Note that in the 7.x series, you shouldn't need the boot schema file
(indeed, it's ignored). LAM will directly obtain the list of hosts to
use from PBS/Torque.
> lamexec -w -v schema1
What are the contents of the schema1 file?
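For reference, a LAM application schema lists one program per line, prefixed by the node(s) it should run on. A minimal MPMD sketch might look like the following; ./prog_a and ./prog_b are hypothetical placeholders (not programs from the original job), and the scratch file name is made up as well:

```shell
#!/bin/sh
# Sketch only: ./prog_a and ./prog_b are hypothetical placeholders.
schema=$(mktemp "${TMPDIR:-/tmp}/schema.XXXXXX")

# One line per program: node specifier first, then the command.
cat > "$schema" <<'EOF'
n0 ./prog_a
n1 ./prog_b
EOF

cat "$schema"
# It would then be launched the same way as in the job script:
#   lamexec -w -v "$schema"
```

If schema1 looks materially different from this, posting it would help narrow things down.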
FWIW: I just downloaded and installed LAM 7.1.4 and lamexec seemed to
work for me:
-----
[5:49] svbu-mpi:/home/jsquyres/lam-7.1.4 % lamboot
LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University
[5:49] svbu-mpi:/home/jsquyres/lam-7.1.4 % lamexec N hostname
svbu-mpi.cisco.com
[5:49] svbu-mpi:/home/jsquyres/lam-7.1.4 % cat schema
N hostname
[5:50] svbu-mpi:/home/jsquyres/lam-7.1.4 % lamexec -w -v schema
6442 hostname running on n0 (o)
svbu-mpi.cisco.com
[5:50] svbu-mpi:/home/jsquyres/lam-7.1.4 %
-----
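The same sanity check can be scripted so it is easy to repeat on the upgraded install. This is only a sketch: it assumes lamboot, lamexec, and lamhalt are on PATH, and the scratch file name is hypothetical.

```shell
#!/bin/sh
# Sketch: repeat the lamexec sanity check from the session above.
# Assumption: lamboot, lamexec, and lamhalt are on PATH.
schema=$(mktemp "${TMPDIR:-/tmp}/schema.XXXXXX")   # hypothetical scratch file

# Same one-line application schema as in the session: "hostname" on every node.
echo "N hostname" > "$schema"

if command -v lamexec >/dev/null 2>&1; then
    lamboot                     # no boot schema given, as in the session
    lamexec N hostname          # direct form
    lamexec -w -v "$schema"     # schema form, same flags as the failing job
    lamhalt                     # shut the LAM universe down again
else
    echo "lamexec not found on PATH; skipping" >&2
fi
```

If this trivial schema works but schema1 does not, the problem is more likely in the schema file than in lamexec itself.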
> Error Messages
> *********************
> n-1<24782> ssi:boot:base:linear: booting n0 (xxxxxxxxxx)
> n-1<24782> ssi:boot:base:linear: booting n1 (yyyyyyyyy)
> n-1<24782> ssi:boot:base:linear: finished
> -----------------------------------------------------------------------------
> It seems that [at least] one of the processes that was started with
> mpirun did not invoke MPI_INIT before quitting (it is possible that
> more than one process did not invoke MPI_INIT -- mpirun was only
> notified of the first one, which was on node n0).
>
> mpirun can *only* be used with MPI programs (i.e., programs that
> invoke MPI_INIT and MPI_FINALIZE). You can use the "lamexec" program
> to run non-MPI programs over the lambooted nodes.
> -----------------------------------------------------------------------------
>
>
> J.W. (Pat) O'Bryant,Jr.
> Business Line Infrastructure
> Technical Systems, HPC
> Office: 713-431-7022
>
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/
--
Jeff Squyres
Cisco Systems