Re: [ofa-general] troubleshooting with InfiniBand
Hal Rosenstock
hal.rosenstock at gmail.com
Sat Feb 14 04:07:57 PST 2009
On Fri, Feb 13, 2009 at 10:34 PM, Vittorio <vitto.giova at yahoo.it> wrote:
> Hello!
> This is my first message on the list, so I hope I'm not asking a silly or
> already answered question.
>
> I'm a student and I'm porting an electromagnetic field simulator to a
> parallel and distributed Linux cluster for my final thesis; I'm using both
> OpenMP and MPI over InfiniBand to achieve speed improvements.
>
> The OpenMP part is done, and now I'm having problems setting up MPI over
> InfiniBand. So far I have:
> correctly set up the kernel modules
> installed the right drivers for the board (Mellanox HCA) and userspace
> programs
> installed the MVAPICH2 MPI implementation
>
> However, I can't get all of this to work together.
> For example, ibhosts correctly finds the two connected nodes:
>
> Ca : 0x0002c90300018b8e ports 2 " HCA-1"
> Ca : 0x0002c90300018b12 ports 2 "localhost HCA-1"
>
> but ibping doesn't receive any responses:
>
> ibwarn: [32052] ibping: Ping..
> ibwarn: [32052] mad_rpc_rmpp: _do_madrpc failed; dport (Lid 2)
> ibwarn: [32052] main: ibping to Lid 2 failed
This would be expected if no ibping server was running on the machine with LID 2.
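For what it's worth, ibping is not like IP ping: it is a client/server tool, so a responder has to be started by hand on the target before the client can get any replies. A rough sketch of the two sides, assuming ibping from infiniband-diags is installed on both nodes (the function names ibping_server and ibping_client are only illustrative wrappers, not part of any package):

```shell
# Run on the target node (the one that owns LID 2): start the ibping responder.
ibping_server() {
    ibping -S
}

# Run on the other node: send 5 pings to the LID given as the first argument.
ibping_client() {
    ibping -c 5 "$1"
}
```

Start ibping_server on the LID 2 machine first, then run ibping_client 2 on the other node. Since ibping depends on that responder, its failure here does not by itself explain the MPI errors.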
-- Hal
> Subsequently, any other operation with MPI fails.
> Strangely enough, however, IPoIB works very well and I can ping and connect
> with no problems.
> The two machines are identical and they use a crossover cable (point to
> point).
> lspci identifies the boards as:
> 03:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, PCIe 2.0
> 2.5GT/s] (rev a0)
>
> What could be the cause of all of this? Am I forgetting something?
> Any help is greatly appreciated.
> Thank you,
> Vittorio