
Thread: LAM: Data structure of bookmark exchanged among processes before a checkpoint in LAM/MPI




LAM: Data structure of bookmark exchanged among processes before a checkpoint in LAM/MPI
France
2008-03-14 04:12:02
Dear members,

I'd like to measure the synchronisation time for the checkpoint of an
MPI job. To do so, I'd like to know the data structure of the bookmark
exchanged among the job's processes before they are individually
checkpointed. I'd also like to know whether the bookmark's size is fixed
regardless of the size of the job and its number of processes.

Thanks.


-------------------------------------------------
sent via Webmail/IMAG!

_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/

Re: LAM: Data structure of bookmark exchanged among processes before a checkpoint in LAM/MPI
United States
2008-03-17 12:04:59
On Fri, 14 Mar 2008, Blaise-Omer.Yenkeimag.fr wrote:

> I'd like to measure the synchronisation time for the checkpoint of an
> MPI job. To do so, I'd like to know the data structure of the bookmark
> exchanged among the job's processes before they are individually
> checkpointed. I'd also like to know whether the bookmark's size is
> fixed regardless of the size of the job and its number of processes.

The data structure sent between peers for bookmarking is a simple
structure containing a couple of fields (mainly the bytes and messages
in flight).  It's less than 32 bytes, so it's pretty small.  I don't
remember the exact details, but the code is available on our web page.
The bookmark structure's size is fixed -- it does not scale with the
number of processes in the job.  However, the total amount of data sent
to set up a checkpoint does scale with the number of nodes (as there are
more instances of the bookmark structure being sent around).
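
To make the size argument concrete, here is a minimal sketch in C.  It
is not the actual LAM/MPI definition: the type name bookmark_t, its two
fields, and the helper total_bookmark_bytes are hypothetical, and the
all-to-all exchange pattern (every process sends one bookmark to every
other process) is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL layout: the field names and types are assumptions for
 * illustration only, not LAM/MPI's actual definition. */
typedef struct {
    uint64_t bytes_in_flight; /* bytes sent to a peer so far (assumed) */
    uint64_t msgs_in_flight;  /* messages sent to a peer so far (assumed) */
} bookmark_t;

/* The size is a compile-time constant, independent of the job size. */
static_assert(sizeof(bookmark_t) < 32, "bookmark should stay under 32 bytes");

/* Total bookmark payload for one checkpoint, ASSUMING each process sends
 * one bookmark to every other process (all-to-all exchange). */
static size_t total_bookmark_bytes(size_t nprocs)
{
    if (nprocs < 2) {
        return 0; /* a single process has no peer to exchange with */
    }
    return nprocs * (nprocs - 1) * sizeof(bookmark_t);
}
```

With this assumed layout, each bookmark is 16 bytes on typical
platforms, so total_bookmark_bytes(4) gives 4 * 3 * 16 = 192 bytes: the
per-bookmark size stays constant while the total grows with the job.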

Hope this helps,

Brian

-- 
   Brian Barrett
   LAM/MPI Developer
   Make today a LAM/MPI day!

Re: LAM: Data structure of bookmark exchanged among processes before a checkpoint in LAM/MPI
France
2008-03-18 03:33:38
Quoting "Brian W. Barrett" <brbarretlam-mpi.org>:

Thanks, Brian. It is exactly what I wanted to know.

Regards.




