Thread: LAM: LAM/MPI under PBS




LAM: LAM/MPI under PBS
Spain
2007-07-09 11:41:55
Hi all,

I'm trying to set up a PBS cluster with 1 headnode and 1 client. Since
it's a small cluster, the headnode is also a computing node. I've
managed to get it working with normal jobs (not MPI), but with MPI the
jobs are all sent to just the client node.

the "lamnodes" directly on the shell correctly reports both nodes but
when run from within PBS just the client is reported.

I've tried to run it with -v $PBS_NODEFILE but nothing. Does this file
have to include all the computing nodes or just the client nodes? I
mean, do I include both nodes or just the client?

I've also checked "laminfo" for the "ssi tm boot" (pasted below) but it
looks like it is not supported. How can I install it?

thanks in adv
FG


              LAM/MPI: 7.1.2
               Prefix: /usr
         Architecture: x86_64-redhat-linux-gnu
        Configured by: brewbuilder
        Configured on: Mon Jun 12 18:27:10 EDT 2006
       Configure host: ls20-bc2-14.build.redhat.com
       Memory manager: ptmalloc2
           C bindings: yes
         C++ bindings: yes
     Fortran bindings: yes
           C compiler: gcc
         C++ compiler: g++
     Fortran compiler: f95
      Fortran symbols: underscore
          C profiling: yes
        C++ profiling: yes
    Fortran profiling: yes
       C++ exceptions: no
       Thread support: yes
        ROMIO support: yes
         IMPI support: no
        Debug support: no
         Purify clean: no
             SSI boot: globus (API v1.1, Module v0.6)
             SSI boot: rsh (API v1.1, Module v1.1)
             SSI boot: slurm (API v1.1, Module v1.0)
             SSI coll: lam_basic (API v1.1, Module v7.1)
             SSI coll: shmem (API v1.1, Module v1.0)
             SSI coll: smp (API v1.1, Module v1.2)
              SSI rpi: crtcp (API v1.1, Module v1.1)
              SSI rpi: lamd (API v1.0, Module v7.1)
              SSI rpi: sysv (API v1.0, Module v7.1)
              SSI rpi: tcp (API v1.0, Module v7.1)
              SSI rpi: usysv (API v1.0, Module v7.1)
               SSI cr: self (API v1.0, Module v1.0)

_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/

Re: LAM: LAM/MPI under PBS
Germany
2007-07-09 11:57:17
On Mon, 9 Jul 2007, Filipe Garrett wrote:

> the "lamnodes" directly on the shell correctly reports both nodes
> but when run from within PBS just the client is reported.

This means that you have started the LAM/MPI daemons on the nodes.

But just to make sure: have you started lamd on the nodes outside of
PBS and expect to only run 'mpirun' from inside the PBS batch script?

> I've tried to run it with -v $PBS_NODEFILE but nothing. Does this
> file have to include all the computing nodes or just the client
> nodes? I mean, do I include both nodes or just the client?

The LAM/MPI daemons have to be started on each node that you plan to
use for the MPI job. So, if you want to use both nodes, then both
their names should be in the file.
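
To illustrate the point about the node file: PBS/Torque setups commonly
write one line per allocated processor slot to $PBS_NODEFILE, so a host
may appear several times. The following is only a minimal sketch under
that assumption. It summarises such a file as a LAM boot schema with
one "hostname cpu=N" line per host, which also makes it easy to see
whether both nodes are really in the file. The function name
pbs_to_lam_schema is hypothetical and not part of LAM or PBS.

```shell
# Hypothetical helper (not part of LAM or PBS): summarise a PBS-style
# node file as a LAM boot schema.
# Assumption: one hostname per line, repeated once per allocated slot.
pbs_to_lam_schema() {
    nodefile="$1"
    # Count how often each host appears, keeping first-seen order,
    # then print "hostname cpu=N" for each distinct host.
    awk 'NF {
             if (!($1 in count)) order[++n] = $1
             count[$1]++
         }
         END {
             for (i = 1; i <= n; i++)
                 printf "%s cpu=%d\n", order[i], count[order[i]]
         }' "$nodefile"
}

# Possible use inside a job script:
#   pbs_to_lam_schema "$PBS_NODEFILE"
```

If the output shows only the client, the problem lies in what PBS
allocated to the job rather than in LAM.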

>        Configured on: Mon Jun 12 18:27:10 EDT 2006
>       Configure host: ls20-bc2-14.build.redhat.com

I guess that the installation of LAM/MPI that you are using is the
pre-packaged one from Red Hat (on some version of RHEL). This is not
configured with TM support, so the only chance for you to have native
PBS support in LAM/MPI is to get the LAM/MPI source and
compile/install it yourself. But before doing that, please remove the
existing LAM/MPI installation to avoid any bad effects from having 2
versions installed side-by-side.
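
A quick way to confirm which boot modules an installation provides is
to search the laminfo output for the "SSI boot:" lines, as in the
listing quoted earlier in this thread. This is a rough sketch only; the
function name has_boot_module is hypothetical, and it assumes the
output format shown in that listing.

```shell
# Hypothetical helper: succeed if laminfo-style text on stdin lists
# the named SSI boot module.
# Assumption: lines look like "SSI boot: rsh (API v1.1, Module v1.1)".
has_boot_module() {
    module="$1"
    grep -Eq "SSI boot: ${module} \("
}

# Possible use:
#   if laminfo | has_boot_module tm; then
#       echo "tm boot module available"
#   else
#       echo "tm boot module missing"
#   fi
```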

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.CostescuIWR.Uni-Heidelberg.De

Re: LAM: LAM/MPI under PBS
Spain
2007-07-09 12:36:07

Bogdan Costescu wrote:
> On Mon, 9 Jul 2007, Filipe Garrett wrote:
> 
>> the "lamnodes" directly on the shell correctly reports both nodes
>> but when run from within PBS just the client is reported.
>
> This means that you have started the LAM/MPI daemons on the nodes.
>
> But just to make sure: have you started lamd on the nodes outside of
> PBS and expect to only run 'mpirun' from inside the PBS batch script?

No, I started "lamd" from inside the PBS script.

> 
>> I've tried to run it with -v $PBS_NODEFILE but nothing. Does this
>> file have to include all the computing nodes or just the client
>> nodes? I mean, do I include both nodes or just the client?
>
> The LAM/MPI daemons have to be started on each node that you plan to
> use for the MPI job. So, if you want to use both nodes, then both
> their names should be in the file.
> 

Ok

>>        Configured on: Mon Jun 12 18:27:10 EDT 2006
>>       Configure host: ls20-bc2-14.build.redhat.com
> 
> I guess that the installation of LAM/MPI that you are using is the
> pre-packaged one from Red Hat (on some version of RHEL). This is not
> configured with TM support, so the only chance for you to have native
> PBS support in LAM/MPI is to get the LAM/MPI source and
> compile/install it yourself. But before doing that, please remove the
> existing LAM/MPI installation to avoid any bad effects from having 2
> versions installed side-by-side.
> 

Thanks a lot for your help. Since it didn't support "tm boot" I simply
changed it to "rsh" and added the "-v nodefile" option. And it works
perfectly!!!

thanks a lot!!!!
FG

PS - I'll leave it like this for now, but later I'll give the "tm boot"
a try.
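
For anyone reading the archive later, the rsh-based approach described
above might look roughly like the sketch below. It is not the exact
script used in this thread: the function name run_lam_job, the
"#PBS -l nodes=2" request and the program path are assumptions made for
illustration. It boots LAM with the rsh boot module on the hosts PBS
allocated, runs the program on every available CPU, and shuts the LAM
daemons down again.

```shell
#!/bin/sh
#PBS -l nodes=2
# Sketch only: run_lam_job and the program path are hypothetical names.

run_lam_job() {
    program="$1"

    # Start the LAM daemons on the hosts listed by PBS, forcing the
    # rsh boot module because this build has no tm module.
    lamboot -ssi boot rsh -v "$PBS_NODEFILE" || return 1

    # "C" asks mpirun for one process per CPU in the LAM universe.
    mpirun C "$program"
    status=$?

    # Always stop the daemons so none are left behind after the job.
    lamhalt
    return "$status"
}

# Possible use at the end of the job script:
#   run_lam_job ./my_mpi_program
```

Keeping the mpirun exit status and returning it after lamhalt lets PBS
see whether the MPI program itself failed.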
