|
Thread: LAM: LAM-MPI task placement on an SMP system
|
|
| LAM: LAM-MPI task placement on an SMP system |

|
2007-02-02 04:07:39 |
Hello,
I tried the following to place LAM-MPI tasks on an SMP system.
lamboot -v -ssi rpi usysv ./hostfile
with cat ./hostfile
toto cpu=16
Then I tried to launch an application
mpirun c8-15 ./a.out
But it did not work: the LAM-MPI tasks were not running on CPUs 8-15.
Did I do something wrong?
Does LAM-MPI placement work on an SMP system?
I am using lam-7.1.2
Charles
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/
|
|
| Re: LAM: LAM-MPI task placement on an SMP system |

|
2007-02-02 05:02:59 |
Hi Charles,
Charles ROGE wrote:
> Hello,
>
> I tried the following to place LAM-MPI tasks on an SMP system.
>
> lamboot -v -ssi rpi usysv ./hostfile
>
> with cat ./hostfile
> toto cpu=16
>
> Then I tried to launch an application
> mpirun c8-15 ./a.out
>
> But it did not work, LAM-MPI tasks were not running on cpus 8-15.
What do you mean exactly by "tasks were not running on cpus 8-15":
is nothing running at all, or are the tasks running, but on CPUs other
than the requested ones (0-7, I guess)?
If the first is true, then you probably have to check your LAM-MPI
installation or your application.
If the second holds, then your request is pointless. To my knowledge,
LAM does nothing in particular to attach processes to CPUs in an SMP
system; it just starts as many processes as requested, and then it is up
to the operating system to balance them among the available processors.
This is the essence of Symmetric Multi Processing; AFAIK, there is no
concept of (and no need for) starting a process on a particular CPU in a
plain SMP system.
If you are using the Linux kernel, then recent versions should have a
tunable scheduler which tries to keep each process on the same CPU as
much as possible (the so-called CPU affinity) to improve performance on
SMP, but even then it is not guaranteed that a given process will always
run on the same CPU.
If you have a NUMA (Non Uniform Memory Access) system, then things are
more complex, but I have no direct experience of that.
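For anyone who wants to check this on their own machine, here is a minimal sketch (not part of LAM; it assumes Python 3.3 or later, and the helper name allowed_cpus is only illustrative) that reports which CPUs the kernel currently allows the calling process to run on. For a process that nobody has bound, this is typically every CPU in the machine, which is why processes started by mpirun may be scheduled anywhere.

```python
import os


def allowed_cpus(pid=0):
    """Return the sorted CPU ids that process `pid` may run on (0 = this process)."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: ask the kernel for the process's affinity mask.
        return sorted(os.sched_getaffinity(pid))
    # Assumption for other platforms: no affinity API, so report every CPU.
    return list(range(os.cpu_count() or 1))


if __name__ == "__main__":
    print("pid", os.getpid(), "may run on CPUs:", allowed_cpus())
```

Running this under mpirun in place of ./a.out would show, per process, whether any binding is in effect.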
Hope this helps, Davide
--
__________________________________________________________
Davide Cesari ARPA-Servizio Idro Meteorologico __
tel (39) 051/525926 ||
fax (39) 051/6497501 |||
e-mail dcesari arpa.emr.it |||/
www http://www.arpa.emr.it/sim
---
Address: ARPA-SIM, Viale Silvani 6, 40122 Bologna, Italy
__________________________________________________________
|
|
| Re: LAM: LAM-MPI task placement on an SMP system |

|
2007-02-02 06:45:15 |
On Feb 2, 2007, at 6:02 AM, Davide Cesari wrote:
> If the second holds, then your request is pointless, in my knowledge,
> LAM does not make anything particular to attach processes to CPUs in a
> SMP system, it just starts as many processes as requested, then it is up
> to the operating system to balance them among the available processors,
This is correct. LAM simply starts up the Right number of processes
and does not bind them to any particular CPUs.
> this is the essence of Symmetric Multi Processing; AFAIK, there is no
> such a concept (and no need too) of starting a process on a particular
> CPU in a plain SMP system.
> If you are using the Linux kernel, then recent versions should have a
> tunable scheduler which tries to attach processes to CPUs as much as
> possible (the so-called CPU affinity) to improve performance on SMP, but
> it is not guaranteed either that a given process will always run on the
> same CPU.
> If you have a NUMA (Non Uniform Memory Access) system, then things are
> more complex, but I have no direct experience of that.
FWIW, Open MPI has some basic processor and memory affinity mechanisms.
Right now, the only mechanisms available are described here:
http://www.open-mpi.org/faq/?category=tuning#paffinity-defs
http://www.open-mpi.org/faq/?category=tuning#maffinity-defs
http://www.open-mpi.org/faq/?category=tuning#using-paffinity
We don't offer more fine-grained mechanisms [yet] mainly because we've
had a heck of a time trying to define a decent syntax for allowing users
to specify the exact placement that they want. It sounds like a silly
problem, but it turns into really nasty details that are quite complex. :-(
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
|
|
| Re: LAM: LAM-MPI task placement on an SMP system |

|
2007-02-02 10:20:12 |
On Friday 02 February 2007 14:36, Tim Prince wrote:
> Most recent Linux versions include a useful taskset command:
> mpirun -np 8 taskset -c 8-15 ./a.out
> which should be fairly effective at placing your processes on that group
> of processors within each node. The purpose of using taskset usually is
> to improve efficiency through cache or NUMA memory affinity, but it
> could be used to do what OP appears to be requesting.
Sorry, I must be missing something, but shouldn't this be something the OS
does? I think I recall that the last time I recompiled a Linux kernel (a
2.6 one, for an AMD Opteron machine, about 6 months ago?) there was stuff
related to NUMA. I'd feel better if someone doing kernel development took
care of this rather than having this responsibility myself.
Best,
R.
> The latest schedulers, included in RHEL4_U4 and CentOS 4.4, generally
> accomplish efficient scheduling without requiring taskset, but that
> doesn't appear to be what OP is asking.
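The taskset approach quoted above can also be applied from inside a process. What follows is only a rough sketch, not a LAM or taskset feature: parse_cpu_list and pin_current_process are hypothetical helper names, the parser handles only comma-separated ids and simple ranges such as "8-15", and the pinning call (os.sched_setaffinity) is available on Linux only.

```python
import os


def parse_cpu_list(spec):
    """Parse a CPU list such as "8-15" or "0,2,4-7" into a sorted list of ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError("descending range: " + part)
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def pin_current_process(spec):
    """Restrict the calling process to the requested CPUs (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("CPU affinity is not available on this platform")
    wanted = set(parse_cpu_list(spec))
    # Only keep CPUs the kernel currently lets this process use.
    usable = wanted & os.sched_getaffinity(0)
    if not usable:
        raise ValueError("none of the requested CPUs are available: " + spec)
    os.sched_setaffinity(0, usable)
    return sorted(usable)


if __name__ == "__main__":
    print(parse_cpu_list("8-15"))
```

Calling pin_current_process("8-15") at the start of each MPI process would restrict it to that group of CPUs, much as the taskset wrapper does; all processes would still share the group rather than each getting a CPU of its own.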
--
Ramón Díaz-Uriarte
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
Fax: +-34-91-224-6972
Phone: +-34-91-224-6900
http://ligarto.org/rdiaz
PGP KeyID: 0xE89B3462
(http://ligarto.org/rdiaz/0xE89B3462.asc)
|
|
| Re: LAM: LAM-MPI task placement on an SMP system |
  United States |
2007-02-05 11:37:29 |
On Fri, 2 Feb 2007, Ramon Diaz-Uriarte wrote:
> Sorry, I must be missing something, but shouldn't this be something the OS
> does? I think I recall that last time I recompiled a Linux kernel (a 2.6 one,
> for AMD Opteron machine, about 6 months ago?) there was stuff related to
> NUMA. I'd feel better if someone doing kernel development takes care of this
> rather than having this responsibility myself.
You're only missing that Computers Suck (IMHO). There is an awful lot
of code in the Linux kernel to try to make NUMA machines more tolerable.
But it has its limitations -- it's designed to provide the best overall
machine "responsiveness", not the lowest latency for 2 of the 80 processes
running on the machine. MPI apps tend to want the second one -- a very
few processes should be privileged over all others.
Brian
|
|
| Re: LAM: LAM-MPI task placement on an SMP system |

|
2007-02-05 14:10:25 |
On 2/5/07, Brian W. Barrett <brbarret lam-mpi.org> wrote:
> You're only missing that Computers Suck (IMHO). There is an awful lot
> of code in the Linux kernel to try to make NUMA machines more tolerable.
> But it has its limitations -- it's designed to provide the best overall
> machine "responsiveness", not the lowest latency for 2 of the 80 processes
> running on the machine. MPI apps tend to want the second one -- a very
> few processes should be privileged over all others.
>
Brian, thanks for your comments. I was obviously missing something
(the Computers Suck or a related concept, I guess); maybe I should
play around with taskset.
Best,
R.
--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
|
|
| Re: LAM: LAM-MPI task placement on an SMP system |
  United States |
2007-02-05 14:13:23 |
On Mon, 5 Feb 2007, Ramon Diaz-Uriarte wrote:
> Brian, thanks for your comments. I was obviously missing something
> (the Computers Suck or a related concept, I guess); maybe I should
> play around with taskset.
99.9% of the time, you won't even notice a problem with recent kernels and
small Opteron machines (4 cores or so). But certain worksets will cause
the kernel to do "dumb things" and as the number of cores grows, the
kernel does a less brilliant job of keeping things under control (at
least, that's what we've found on our quad socket, dual core machines).
Brian
|
|
| Re: LAM: LAM-MPI task placement on an SMP system |

|
2007-02-05 15:07:22 |
On 2/5/07, Brian W. Barrett <brbarret lam-mpi.org> wrote:
> 99.9% of the time, you won't even notice a problem with recent kernels and
> small Opteron machines (4 cores or so). But certain worksets will cause
> the kernel to do "dumb things" and as the number of cores grows, the
> kernel does a less brilliant job of keeping things under control (at
> least, that's what we've found on our quad socket, dual core machines).
>
Thanks for the pointers. That is good to know. I understand I would
also want to manually tweak settings if I am forced to run many
more simultaneous MPI jobs than cores (e.g., if my lam-bhost.def says
"localhost cpu=20" on a machine with two dual-core CPUs)?
R.
--
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz
|
|
|
|