Thread: LAM: Maximum number of processes per node???

LAM: Maximum number of processes per node??? | United States | 2007-10-08 12:57:39
I need to run around 100 processes per node on a small Linux cluster
running LAM 7.1. I can only seem to run about 40; after that it just
fails. Although I have not seen an error about too many file
descriptors, I did increase that limit by a factor of 8, with no luck.
Does LAM have a built-in limit? Can it be overridden?
Ben Held
Simulation Technology & Applied Research, Inc.
11520 N. Port Washington Rd., Suite 201
Mequon, WI 53092
P: 1.262.240.0291 x101
F: 1.262.240.0294
E: ben.held staarinc.com
http://www.staarinc.com

Re: LAM: Maximum number of processes per node??? | United States | 2007-10-08 13:06:31
On Oct 8, 2007, at 11:57 AM, Ben Held wrote:
> I have a need to run around 100 processes per node on a small Linux
> cluster running LAM 7.1 I can only seem to run about 40 – after
> that it just fails. Although I have not seen an error about too
> many file descriptors, I did increase this limit by a factor of 8
> and no luck.
>
> Does LAM have a built in limit? Can it be overridden?
>
Yes, there is a built-in limit. We have not experimented with making
it larger, and the only way to override it is to change some magic
defines I can't recall at this time. So overriding the limit is both
unsupported and not recommended.
Good luck,
Brian
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/

Re: LAM: Maximum number of processes per node??? | United States | 2007-10-08 13:15:37
Brian Barrett wrote:
> Yes, there is a built-in limit. We have not experimented with making
> it larger, and the only way to override it is to change some magic
> defines I can't recall at this time. So overriding the limit is both
> unsupported and not recommended.
IIRC, the limit is in the lamd's management of its children. If so,
might it be possible to "trick" lamboot into starting multiple lamd's
per node to circumvent this?
-Paul
--
Paul H. Hargrove PHHargrove lbl.gov
Future Technologies Group
HPC Research Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900

Re: LAM: Maximum number of processes per node??? | United States | 2007-10-08 13:23:51
Brian,
What is the limit? I have a problem that only fails when run on 1024
processes, but I need to reproduce it on a system with only 12 nodes...
Thanks,
Ben
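As a rough check on the numbers in this question: 1024 processes on 12 nodes works out to 85 or 86 per node, well above the roughly 40 per node reported at the start of the thread. A minimal Python sketch of that arithmetic follows. The function name `ranks_per_node` is made up for illustration, and the even split is an assumption about how the processes would be distributed, not something stated in the thread.

```python
def ranks_per_node(total_ranks: int, nodes: int) -> list:
    """Spread total_ranks over nodes as evenly as possible (assumed policy)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    base, extra = divmod(total_ranks, nodes)
    # The first `extra` nodes take one additional process each.
    return [base + 1 if i < extra else base for i in range(nodes)]


counts = ranks_per_node(1024, 12)
print(counts)       # [86, 86, 86, 86, 85, 85, 85, 85, 85, 85, 85, 85]
print(max(counts))  # 86
```

So the busiest node would need to host 86 processes, more than twice the count that was observed to work.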

Re: LAM: Maximum number of processes per node??? | Poland | 2007-10-08 14:35:41
Hi Ben,
I had a similar problem on SUSE 10.1 and Fedora 4. On SUSE 10.2 the
problem disappeared.
But you can work around this.
1) Boot LAM with "lamboot -v lamhost", where the lamhost file lists
all the nodes required. For example, if you need to simulate 100 nodes
on one physical CPU, lamhost would contain:
    localhost cpu=100
2) Run your code with the command:
    mpirun -ssi rpi crtcp C ./executable
In my case, with LAM/MPI 7.1.2, it was OK. Let me know if it also
works in your case.
Artur.
University of Czestochowa
Poland
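The two steps above can be sketched as follows. This is only an illustration, assuming the boot schema takes one `hostname cpu=N` line per host, as in the `localhost cpu=100` example. The helper `boot_schema` is hypothetical, and `./executable` is a placeholder for your own program. The script prints the file contents and the two commands quoted in the message; it does not start LAM itself.

```python
def boot_schema(cpus_per_host: dict) -> str:
    """Return boot-schema text with one 'hostname cpu=N' line per host."""
    lines = []
    for host, cpus in cpus_per_host.items():
        if cpus < 1:
            raise ValueError(f"cpu count for {host} must be at least 1")
        lines.append(f"{host} cpu={cpus}")
    return "\n".join(lines) + "\n"


# Single machine declared with 100 CPUs, as in the message.
schema = boot_schema({"localhost": 100})
print(schema, end="")  # localhost cpu=100

# Commands from the message, to be run by hand in a shell afterwards:
print("lamboot -v lamhost")
print("mpirun -ssi rpi crtcp C ./executable")
```

Save the printed schema line as the file lamhost before running the two commands. Whether this gets past the per-node limit on a given distribution is not settled in the thread; it is reported here only for LAM/MPI 7.1.2.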

Re: LAM: Maximum number of processes per node??? | United States | 2007-10-08 14:36:29
Artur,
Thanks - that worked. However, I need to run across several nodes
(each node with 100 processes). Any thoughts? Is this an OS issue?
I am running Fedora Core 5 64-bit.
Ben