Hello LAM/MPIers:
I have compiled the Gromacs MD program for both PPC and Intel. Each binary runs fine on its own architecture, but when I try an mpirun across the two machines, the job fails after a few seconds. Each machine can see and execute its architecture-specific binary, and LAM/MPI is the universal build of LAM/MPI 7.1.2 from the website. Any ideas on what I might be doing wrong? Below is the only message that I'm getting:
calculon$ mpirun -np 4 mdrun
NNODES=4, MYRANK=1, HOSTNAME=Warner-Computer.local
NNODES=4, MYRANK=3, HOSTNAME=Warner-Computer.local
NNODES=4, MYRANK=0, HOSTNAME=portal.private
NNODES=4, MYRANK=2, HOSTNAME=portal.private
NODEID=2 argc=1
NODEID=0 argc=1
NODEID=1 argc=16777216
NODEID=3 argc=16777216
MPI_Recv: message truncated (rank 2, MPI_COMM_WORLD)
Rank (2, MPI_COMM_WORLD): Call stack within LAM:
Rank (2, MPI_COMM_WORLD): - MPI_Recv()
Rank (2, MPI_COMM_WORLD): - main()
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code. This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 341 failed on node n0 (10.0.1.1) with exit status 1.
-----------------------------------------------------------------------------
calculon$
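
One detail in the output above may be worth noting: nodes 0 and 2 report argc=1, while nodes 1 and 3 report argc=16777216. That value is 0x01000000, which is what the 32-bit integer 1 looks like when its four bytes are read in the opposite order. PPC is big-endian and Intel is little-endian, so this could hint at a byte-order mismatch between the ranks, though that is only a guess from the log. The sketch below just checks the arithmetic; the helper name reinterpret_u32 is hypothetical, and it assumes argc travels between ranks as a raw 32-bit integer, which the post does not confirm.

```python
import struct


def reinterpret_u32(value: int) -> int:
    """Pack value as a big-endian 32-bit unsigned integer, then read the
    same four bytes back as little-endian (i.e. with no byte-order
    conversion applied between the two machines)."""
    raw = struct.pack(">I", value)       # bytes as a big-endian host would lay them out
    return struct.unpack("<I", raw)[0]   # same bytes as a little-endian host would read them


if __name__ == "__main__":
    swapped = reinterpret_u32(1)
    # Print the decimal and hex forms for comparison with the argc values in the log.
    print(swapped, hex(swapped))
```

If the printed decimal value matches the argc reported by nodes 1 and 3, that would be consistent with the data being exchanged without any conversion between the two architectures.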
Warner Yuen
Research Computing Consultant
Apple Computer
email: wyuen apple.com
Tel: 408.718.2859
Fax: 408.715.0133