Thread: LAM: lam Digest, Vol 813, Issue 1

user name
2006-11-22 03:38:46
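Hi Hector,
I think your program is not written correctly.
First, you should allocate writable memory for message (for
example: char message[12]; or a buffer obtained from malloc).
As written, message points to a string literal, which must not
be modified, so MPI_Recv cannot store the received text there.
Second, the value of message should be assigned in the code
private to node 0 (the sending node):
if (rank == 0)
    {
      /* "Hello world" is 11 characters plus '\0': exactly 12 bytes */
      strcpy(message, "Hello world");
      for (i = 1; i < size; i++)
        {
          MPI_Send (message, 12, MPI_CHAR, i, tag, MPI_COMM_WORLD);
        }
    }
(strcpy needs #include <string.h>.)

Hope this helps.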
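
For the malloc option mentioned above, here is a minimal sketch. It assumes a hypothetical helper named alloc_message, which is not part of MPI or of the original program. It returns a writable copy of the text together with the element count (including the terminating '\0') that would be passed to MPI_Send and MPI_Recv.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of MPI): return a writable, heap-allocated
   copy of text and store the number of chars, including the terminating
   '\0', in *count. Returns NULL if the allocation fails. */
char *alloc_message(const char *text, int *count)
{
    size_t n = strlen(text) + 1;   /* room for the '\0' as well */
    char *buf = malloc(n);

    if (buf == NULL)
        return NULL;
    memcpy(buf, text, n);          /* copies the '\0' too */
    *count = (int)n;
    return buf;
}
```

For "Hello world" the count comes out as 12, which matches the count already used in the MPI_Send and MPI_Recv calls. The buffer should be released with free() once it is no longer needed.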
--- lam-request@lam-mpi.org wrote:

> Today's Topics:
> 
>    1. Re: Unable to boot Lam in a remote machine (460853@unizar.es)
> 
> From: 460853@unizar.es
> To: lam@lam-mpi.org
> Date: Mon, 20 Nov 2006 18:24:24 +0100
> Subject: Re: LAM: Unable to boot Lam in a remote machine
> 
> Hello everyone,
> 
> First of all, thank you for answering. I'd also like to apologize for not
> having been able to write earlier, but some family duties kept me away
> from all this for a while.
> 
> Next, I'd like to say that the problem I asked about in my previous mail
> was solved by disabling the firewall, so that was certainly the cause.
> The thing is that now I'm having another problem.
> 
> After disabling the firewall and managing to set the environment up, I
> looked on the Internet for a very simple program (actually, a "Hello
> World") written with MPI:
> 
> 
> ---------------------prueba.c ------------------
> /* C Example */
> #include <stdio.h>
> #include <mpi.h>
> #include <math.h>
> 
> 
> void
> main (argc, argv)
>      int argc;
>      char *argv[];
> {
>   char *message = "Hello world";
>   int rank, size, i, tag, node;
>   MPI_Status status;
> 
>   MPI_Init (&argc, &argv);      /* starts MPI */
>   MPI_Comm_rank (MPI_COMM_WORLD, &rank);        /* get current process id */
>   MPI_Comm_size (MPI_COMM_WORLD, &size);        /* get number of processes */
>   tag = 100;
> 
>   if (rank == 0)
>     {
>       for (i = 1; i < size; i++)
>         {
>           MPI_Send (message, 12, MPI_CHAR, i, tag, MPI_COMM_WORLD);
>         }
>     }
>   else
>     {
>       MPI_Recv (message, 12, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
>     }
> 
>   printf ("node:%d  %s\n", rank, message);
>   MPI_Finalize ();
> }
> --------------------------------------------
> 
> I compile it with: mpicc -o prueba.exe prueba.c
> (It's a Linux system, so I know the .exe suffix is unnecessary, but I
> added it anyway so that I can tell which file is the executable.)
> Then I place a copy of that executable in a folder that is in the PATH
> on both computers (specifically, $HOME/bin/).
> 
> Next, I start the environment properly (well... properly, I guess):
> ---------------------------------------------
> hector@rdp13:~/Pa aprendé/Pruebas MPI> lamboot -v lamhosts
> 
> LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
> 
> n-1<26498> ssi:boot:base:linear: booting n0 (155.210.155.67)
> n-1<26498> ssi:boot:base:linear: booting n1 (155.210.155.70)
> n-1<26498> ssi:boot:base:linear: finished
> ----------------------------------------------
> 
> But when I try to execute it with mpirun, I get the following output:
> ---------------------------------------------
> hector@rdp13:~/bin> mpirun -v -np 2 prueba.exe
> 26535 prueba.exe running on n0 (o)
> 4861 prueba.exe running on n1
> node:0  Hello world
> MPI_Recv: process in local group is dead (rank 1, MPI_COMM_WORLD)
> Rank (1, MPI_COMM_WORLD): Call stack within LAM:
> Rank (1, MPI_COMM_WORLD):  - MPI_Recv()
> Rank (1, MPI_COMM_WORLD):  - main()
> ---------------------------------------------
> 
> It seems that node 1 (the remote node) is not working. It says it's
> "dead". I looked for this error message on Google, and I understood that
> what is happening is that the process is not running on the remote
> machine. It was also said that this can happen because MPI_Finalize ()
> was executed too soon. I think that can't be the case here, because it
> is an absolutely simple program downloaded from an example web page, so
> I guess it should work.
> 
> I would also like to say that on the remote machine, after setting up
> the environment with the lamboot command, "ps aux" shows (among many
> other things) a lamd daemon running:
> 
> -----------------------------------
> hector@venus2:~/bin> ps aux
> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
> root         1  0.0  0.0    776   304 ?        S    17:24   0:00 init [5]
> root         2  0.0  0.0      0     0 ?        SN   17:24   0:00 [ksoftirqd/0]
> [. . .]
> hector    3743  0.0  0.0   6484  1148 ?        S    17:26   0:00 /usr/bin/lamd -
> -----------------------------------
> 
> So the environment seems to be up properly... The thing is that it
> doesn't execute the program properly.
> 
> I imagine that the solution will be quite simple, but I can't see it :(
> 
> Thank you very much in advance!!
> //Hector
> 
> >> 460853@unizar.es wrote:
> >>> I know there's a firewall on each machine that only opens the SSH
> >>> (22) port, so I guess the problem comes from that. So, what ports
> >>> do I have to open in order to boot LAM?
> >>>
> >>> Executing lamboot with the -d option, I've read (among many other
> >>> things) this:
> >>>
> >>>    lamd -H 155.210.155.67 -P 6459 -n 1 -o 0 -d
> >>>
> >>> So, I guess this means that the .155.70 machine should be able to
> >>> reach port 6459 on the .155.67 machine. Am I right? So is the
> >>> solution to open port 6459 on the .155.67 machine? Should I also
> >>> open this port on the .155.70 machine? Otherwise, which ports
> >>> should I open? I don't know whether opening only these ports will
> >>> be enough.
> >>
> >> All non-system (> 1024) TCP ports are needed to boot and run LAM.  In
> >> more detail - LAM does not use any specific port numbers, but instead
> >> requests any random open port from the OS.  Check out FAQs 17 and 18
> >> here for some more info:
> >>
> >> http://www.lam-mpi.org/faq/category4.php3
> >>
> >> Hope this helps!
> >>
> >> Andrew
> 
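
The remark that LAM "requests any random open port from the OS" can be illustrated with plain sockets. What follows is only a sketch of the general mechanism (binding a TCP socket to port 0 and asking which port was assigned); it is not LAM's actual code, and bind_any_port is a hypothetical name chosen for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind a TCP socket to port 0 so that the OS picks a free port.
   Returns the chosen port in host byte order, or -1 on error.
   On success *fd_out holds the bound socket; the caller closes it. */
int bind_any_port(int *fd_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);      /* 0 means "let the OS choose" */

    /* After bind, getsockname reports the port that was assigned. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *fd_out = fd;
    return ntohs(addr.sin_port);
}
```

Because the port number is only known after the call returns, and typically differs from run to run, a firewall rule for one fixed port is not enough for software that obtains its ports this way.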
_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/
