
Thread: LAM: query regarding lam/mpi cluster




LAM: query regarding lam/mpi cluster
2008-03-19 08:16:24

 
Dear Sir

I am a postgraduate student in Computer Science, and my project topic is grid computing. I have built a LAM/MPI cluster in our college lab that allows parallel execution of jobs. Please answer these queries:

1. Are there any standard ready-made applications that can be run on this cluster for demonstration purposes?
2. Is there any mechanism for storing the status (details) of a process executing on one node of the cluster on another node? This is needed in case of node failure.
3. How can I take a snapshot of a process's state?

With Regards,

G. M. Dhopavkar



>Maruti
Re: LAM: query regarding lam/mpi cluster
United States
2008-03-28 22:05:18
On Mar 19, 2008, at 7:16 AM, gauri dhopavkar wrote:
> I am doing Post Graduation in Computer Science. My project topic is
> Grid computing. I have built up a Lam/Mpi cluster in our college lab
> which allows parallel execution of job. Please answer these queries:
>
> 1. Are there any standard ready-made applications which can be run
> on this cluster for demonstration purpose?
>

There are many. Have a look in the examples/ directory of the LAM/MPI
tarball for simple ones, or a quick Google search should find a good
set of MPI applications.

> 2. Is there any mechanism which allows to store process
> status(details) executing on one node of cluster to other node? this
> is needed in case of node failure.
>
This is not part of the MPI standard. Generally, applications use
custom checkpointing mechanisms or system-level checkpointing for
handling node failures. LAM/MPI supports integration with the BLCR
system-level checkpointer on Linux systems. Have a look at our paper
on the subject for more details:

   http://www.lam-mpi.org/papers/lacsi2003/lacsi-2003.pdf
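
To illustrate the "custom checkpointing" option mentioned above, here is a
minimal application-level sketch in Python. It is not LAM/MPI's or BLCR's
mechanism; the file name state.ckpt, the helper names, and the toy
computation are all assumptions made up for the example. The idea is only
that the application periodically writes its own state to a file and, on
restart, resumes from the last saved state.

```python
import os
import pickle
import tempfile

CHECKPOINT = "state.ckpt"  # hypothetical file name, chosen for this sketch


def save_checkpoint(state, path=CHECKPOINT):
    """Write state to a temp file, then atomically rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before renaming
    os.replace(tmp, path)     # a crash never leaves a half-written file


def load_checkpoint(path=CHECKPOINT):
    """Return the saved state, or None if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return None


def run(total_steps, path=CHECKPOINT, every=10, fail_at=None):
    """Toy job: sum 0..total_steps-1, checkpointing every `every` steps.

    fail_at simulates a node failure at the given step (assumption for
    demonstration only).
    """
    state = load_checkpoint(path) or {"step": 0, "total": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(state, path)
    return state["total"]
```

If the job dies partway through, calling run() again with the same path
picks up from the last checkpoint instead of step 0. For this to survive
a node failure, the checkpoint file would have to live on storage that
another node can reach (for example a shared file system), and in a real
MPI job each process would need to save its own state at a point where
all processes agree, which is where most of the difficulty lies.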

> 3. How to take snapshot of a process state?
>

This is a difficult task. I'd recommend the above paper for more
details on system-level checkpointing. If you search the ACM or IEEE
databases, I'm sure you'll find a number of papers on system- and
application-level checkpointing for MPI. It's a complex topic with
lots of tradeoffs.


Hope this helps,

Brian

-- 
   Brian Barrett
   LAM/MPI Developer
   Make today a LAM/MPI day!


_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/
