Thread: LAM: LamBoot Error




LAM: LamBoot Error
United States
2008-02-21 03:04:33
Sunny Bhatheja wrote:
> Sir,
>
> I am using a Beowulf cluster. I configured it and installed LAM, but when I
> try to run the lamboot command it gives me the following error.
>
> ******************************************************************************
> bash: hboot: command not found
> -----------------------------------------------------------------------------
> LAM failed to execute a LAM binary on the remote node "beowulf01".
> Since LAM was already able to determine your remote shell as "hboot",
> it is probable that this is not an authentication problem.
>
> *** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
> *** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
> *** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
> *** MAILING LIST.
>
> LAM tried to use the remote agent command "ssh"
> to invoke the following command:
>
>         ssh -x beowulf01 -n hboot -t -c lam-conf.lamd -s -I '"-H 192.192.192.56 -P 36836 -n 1 -o 0"'
>
> This can indicate several things.  You should check the following:
>
>         - The LAM binaries are in your $PATH
>         - You can run the LAM binaries
>         - The $PATH variable is set properly before your
>           .cshrc/.profile exits
>
> Try to invoke the command listed above manually at a Unix prompt.
>
> You will need to configure your local setup such that you will *not*
> be prompted for a password to invoke this command on the remote node.
> No output should be printed from the remote node before the output of
> the command is displayed.
>
> When you can get this command to execute successfully by hand, LAM
> will probably be able to function properly.
> -----------------------------------------------------------------------------
> ERROR: LAM/MPI unexpectedly received the following on stderr:
> bash: tkill: command not found
> -----------------------------------------------------------------------------
> LAM failed to execute a LAM binary on the remote node "beowulf01".
> Since LAM was already able to determine your remote shell as "tkill",
> it is probable that this is not an authentication problem.
>
> *** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
> *** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
> *** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
> *** MAILING LIST.
>
> LAM tried to use the remote agent command "ssh"
> to invoke the following command:
>
>         ssh -x beowulf01 -n tkill
>
> This can indicate several things.  You should check the following:
>
>         - The LAM binaries are in your $PATH
>         - You can run the LAM binaries
>         - The $PATH variable is set properly before your
>           .cshrc/.profile exits
>
> Try to invoke the command listed above manually at a Unix prompt.
>
> You will need to configure your local setup such that you will *not*
> be prompted for a password to invoke this command on the remote node.
> No output should be printed from the remote node before the output of
> the command is displayed.
>
> When you can get this command to execute successfully by hand, LAM
> will probably be able to function properly.
> -----------------------------------------------------------------------------
> [wolfbeowulf00 etc]$
>
> ******************************************************************************
>
> --
> ....................................
> Warm Regards
> Sunny Bhatheja
> Linux Engineer
> +91-9911849409
>
> Tetra Information Services Pvt. Ltd.
>  136, Lower Ground Floor, Sant Nagar.
>  East of Kailash.
>  New Delhi - 110065
>
>  Phone : +91-11-66604033 34 35
>  Fax   : +91-11-41620171
>
>
>  Website : www.tetrain.com, www.linux4e.com
>
>
>  "We Create and Manage Comprehensive Technology Solutions Scalable to your needs"
>
>
>


-- 
....................................
Warm Regards
Sunny Bhatheja
Linux Engineer
+91-9911849409

Tetra Information Services Pvt. Ltd.
 136, Lower Ground Floor, Sant Nagar.
 East of Kailash.
 New Delhi - 110065

 Phone : +91-11-66604033 34 35
 Fax   : +91-11-41620171


 Website : www.tetrain.com, www.linux4e.com


 "We Create and Manage Comprehensive Technology Solutions Scalable to your needs"
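
The checklist in the error message above can be tried by hand. The following is a rough sketch of one way to do that and is not part of LAM: `check_lam_binaries` and the `LAM_BINARIES` variable are hypothetical names made up for this example, and it assumes passwordless ssh to the node already works. It asks the remote shell, through the same kind of non-interactive ssh invocation that lamboot reported using, whether `hboot` and `tkill` can be found in `$PATH`.

```shell
# Hypothetical helper, not shipped with LAM.
# All arguments form the command prefix used to reach a shell on the node,
# for example:  ssh -x beowulf01 -n
# LAM_BINARIES (an assumed variable name) lists the programs to look for.
check_lam_binaries() {
    for name in ${LAM_BINARIES:-hboot tkill}; do
        # "command -v" succeeds only if the shell can resolve the name.
        if "$@" "command -v $name" >/dev/null 2>&1; then
            echo "$name: found"
        else
            echo "$name: NOT found"
        fi
    done
}

# Against the node named in the error message:
#   check_lam_binaries ssh -x beowulf01 -n
# Against the local machine, for comparison:
#   check_lam_binaries sh -c
```

If the binaries are found in an interactive login on the node but not through this check, the `$PATH` setting may live in a startup file that a non-interactive shell does not read, which is what the third item of the checklist hints at.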


_______________________________________________
This list is archived at http://www.lam-mpi.org/MailArchives/lam/

Re: LAM: LamBoot Error
2008-02-21 07:31:17
On Feb 21, 2008, at 1:04 AM, Sunny Bhatheja wrote:

>> *** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
>> *** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
>> *** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
>> *** MAILING LIST.

This is probably the most important part of the help message.

I think the information you need is in the rest of the help message,
the FAQ, and the User's Guide.

(Sorry to be snarky; it's a little frustrating when one takes the time
to meticulously document software and users then clearly do not read
any of it.)

-- 
Jeff Squyres
Cisco Systems


LAM: Lamboot Error
Russian Federation
2008-03-06 09:13:30
Dear members,
I have a problem with lamboot. I found this topic on the FAQ page and tried the suggested solutions, but I still get the error. When booting LAM/MPI on openSUSE 10.3, I got the following error messages:

zayarHPC-3:~> lamboot -v bhost
LAM 7.1.4/MPI 2 C++/ROMIO - Indiana University

n-1<25538> ssi:boot:base:linear: booting n0 (HPC-3)
n-1<25538> ssi:boot:base:linear: booting n1 (HPC-2)
-----------------------------------------------------------------------------
The lamboot agent timed out while waiting for the newly-booted process
to call back and indicated that it had successfully booted.

*** PLEASE READ THIS ENTIRE MESSAGE, FOLLOW ITS SUGGESTIONS, AND
*** CONSULT THE "BOOTING LAM" SECTION OF THE LAM/MPI FAQ
*** (http://www.lam-mpi.org/faq/) BEFORE POSTING TO THE LAM/MPI USER'S
*** MAILING LIST.

As far as LAM could tell, the remote process started properly, but
then never called back.  Possible reasons that this may happen:

        - There are network filters between the lamboot agent host and
          the remote host such that communication on random TCP ports
          is blocked
        - Network routing from the remote host to the local host isn't
          properly configured (this is uncommon)

You can check these things by watching the output from "lamboot -d".

1. On the command line for hboot, there are two important parameters:
   one is the IP address of where the lamboot agent was invoked, the
   other is the port number that the lamboot agent is expecting the
   newly-booted process to call back on (this will be a random
   integer).

2. Manually login to the remote machine and try to telnet to the port
   indicated on the hboot command line.  For example,
        telnet <ipnumber> <portnumber>
   If all goes well, you should get a "Connection refused" error.  If
   you get any other kind of error, it could indicate either of the
   two conditions above.  Consult with your system/network
   administrator.
-----------------------------------------------------------------------------
n-1<25538> ssi:boot:base:linear: aborted!
n-1<25544> ssi:boot:base:linear: booting n0 (HPC-3)
n-1<25544> ssi:boot:base:linear: booting n1 (HPC-2)
n-1<25544> ssi:boot:base:linear: finished
lamboot did NOT complete successfully
zayarHPC-3:~> telnet (my-remote-ip) 23451
Trying (my-remote-ip)...
telnet: connect to address (my-remote-ip): Connection refused
zayarHPC-3:~> telnet 127.0.0.1 32154
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
zayarHPC-3:~> ssh -x hpc-2 hostname
HPC-2
zayarHPC-3:~>
Please advise me.
Thanks.
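
Step 2 of the help text above can also be scripted. This is only a sketch under stated assumptions: `probe_port` is a hypothetical name, and it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command being available. It attempts one TCP connection and reports whether it connected, was refused, timed out, or failed some other way. The help text asks for this test to be run on the remote machine (HPC-2 here), against the IP address and port shown on the hboot command line in the "lamboot -d" output; the telnet commands in the session above appear to have been run from HPC-3 instead.

```shell
# Hypothetical helper: try a single TCP connection to host:port.
# Assumes bash (for /dev/tcp) and coreutils "timeout" are installed.
probe_port() {
    host=$1
    port=$2
    # LC_ALL=C keeps the error text in English so it can be matched below.
    err=$(LC_ALL=C timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>&1)
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "connected"
    elif [ "$status" -eq 124 ]; then
        # timeout(1) exits with 124 when the time limit is reached.
        echo "timed out"
    else
        case $err in
            *"Connection refused"*) echo "refused" ;;
            *) echo "other error" ;;
        esac
    fi
}

# On the remote node, with the values taken from the hboot command line:
#   probe_port <ipnumber> <portnumber>
```

Going by the help text, "refused" is the outcome to hope for, while "timed out" or "other error" would point toward the filtering or routing problems it lists.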

