Thread: LAM: caused collective abort of all ranks
Re: LAM: caused collective abort of all ranks | Germany | 2008-02-14 14:07:49
On Thu, 14 Feb 2008, fahad saeed wrote:
> node1 may run --------> ./binary -in file1 -out file1-output
> node2 may run --------> ./binary -in file2 -out file2-output

This is very much not MPI; it has nothing to do with message passing.

You have to look for a batch/queueing system like Torque, SGE, SLURM,
etc. Some of them (SGE for some time, Torque in development) have
support for running job arrays, a concept which fits very well with
your description above (same job with different inputs and outputs).
But even if you don't use job arrays, they offer a lot more options to
let you decide where and what to run.
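
To make the job-array idea concrete, here is a minimal sketch of what each array task could do: read its own index from the environment and turn it into the matching command line from the question. The function names (`task_id_from_env`, `build_command`, `run_task`) are made up for illustration, and the `fileN` / `fileN-output` naming is assumed from the two examples quoted above. `SGE_TASK_ID` and `SLURM_ARRAY_TASK_ID` are the variables SGE and SLURM set for array tasks; other schedulers use different names, so check the documentation of the one you install.

```python
import os
import subprocess

# Environment variables that SGE and SLURM set to the index of the
# current array task. Other schedulers use other names (assumption:
# only these two are handled in this sketch).
TASK_ID_VARS = ("SGE_TASK_ID", "SLURM_ARRAY_TASK_ID")


def task_id_from_env(env):
    """Return the array task index found in env, or raise if none is set."""
    for name in TASK_ID_VARS:
        value = env.get(name)
        # SGE sets SGE_TASK_ID to "undefined" outside array jobs, so only
        # purely numeric values are accepted.
        if value is not None and value.isdigit():
            return int(value)
    raise RuntimeError("no array task index found in the environment")


def build_command(task_id):
    """Map task index N to: ./binary -in fileN -out fileN-output."""
    return ["./binary", "-in", f"file{task_id}", "-out", f"file{task_id}-output"]


def run_task(env=os.environ):
    """Run the single command that belongs to this array task."""
    subprocess.run(build_command(task_id_from_env(env)), check=True)
```

A job script would call `run_task()` once. Submitting it as a two-task array (for example with `qsub -t 1-2` on SGE or `sbatch --array=1-2` on SLURM) would then let the scheduler, not MPI, decide which node runs the `file1` job and which runs the `file2` job.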
--
Bogdan Costescu
IWR, University of Heidelberg, INF 368, D-69120 Heidelberg,
Germany
Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850
E-mail: bogdan.costescu iwr.uni-heidelberg.de
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/

Re: LAM: caused collective abort of all ranks | United States | 2008-02-14 14:12:43
Thanks a lot.
> Date: Thu, 14 Feb 2008 21:07:49 +0100
> From: Bogdan.Costescu iwr.uni-heidelberg.de
> To: lam lam-mpi.org
> Subject: Re: LAM: caused collective abort of all ranks
>
> On Thu, 14 Feb 2008, fahad saeed wrote:
>
> > node1 may run --------> ./binary -in file1 -out file1-output
> > node2 may run --------> ./binary -in file2 -out file2-output
>
> This is very much not MPI, has nothing to do with message passing.
>
> You have to look for a batch/queueing system like Torque, SGE, SLURM,
> etc. Some of them (SGE for some time, Torque in development) have
> support for running job arrays, concept which fits very well with your
> description above (same job with different inputs and outputs). But
> even if you don't use job arrays, they offer a lot more options to let
> you decide where and what to run.
>
> --
> Bogdan Costescu
>
> IWR, University of Heidelberg, INF 368, D-69120 Heidelberg, Germany
> Phone: +49 6221 54 8869/8240, Fax: +49 6221 54 8868/8850
> E-mail: bogdan.costescu iwr.uni-heidelberg.de
> _______________________________________________
> This list is archived at http://www.lam-mpi.org/MailArchives/lam/