[ofa-general] MPI IB Errors

John Leidel john.leidel at gmail.com
Wed Aug 22 14:45:36 PDT 2007


Err... I don't seem to have any of the typical IB diagnostic tools.  All
they had running in the first place [i.e., I didn't originally install this
machine] were the IB modules bundled with the 2.6.9-42 Red Hat kernel.

I'm debating whether I should install the Cisco IB Roll to make sure I
have all the necessary tools in place.
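
A rough sketch of how one might check what is actually present on a node
before installing anything.  It only looks for diagnostic tools on PATH and
for loaded kernel modules; it does not talk to the HCA.  The tool names
(ibv_devinfo, ibv_devices, ibstat, ibstatus) and module names (ib_core,
ib_uverbs, ib_mthca, ib_ipoib) are assumptions about a typical OFED-era
install, and the HCA driver module in particular depends on the hardware.

```python
import shutil
from pathlib import Path

# Assumed names, not taken from this thread: adjust for the actual install.
DIAG_TOOLS = ["ibv_devinfo", "ibv_devices", "ibstat", "ibstatus"]
IB_MODULES = ["ib_core", "ib_uverbs", "ib_mthca", "ib_ipoib"]


def missing_tools(tools=DIAG_TOOLS):
    """Return the tools from `tools` that are not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]


def parse_loaded_modules(proc_modules_text):
    """Return the set of module names in /proc/modules-formatted text."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def missing_modules(loaded, wanted=IB_MODULES):
    """Return the modules from `wanted` that are not in the `loaded` set."""
    return [m for m in wanted if m not in loaded]


if __name__ == "__main__":
    proc = Path("/proc/modules")
    # On a non-Linux host /proc/modules is absent; treat that as "nothing loaded".
    loaded = parse_loaded_modules(proc.read_text()) if proc.exists() else set()
    print("missing tools:  ", missing_tools() or "none")
    print("missing modules:", missing_modules(loaded) or "none")
```

Running it on one old node and one new node and comparing the two reports
would show whether the new delivery is missing tools, modules, or both.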

On 8/22/07, Jeff Squyres <jsquyres at cisco.com> wrote:
>
> On Aug 22, 2007, at 5:32 PM, John Leidel wrote:
>
> > Yeah, I'm sure the OFED release is the same.  I'm running ROCKS
> > 4.2.1, so all the node images are identical regarding the package
> > selection.  There could possibly be [well, probably] a difference
> > in the firmware releases of the HCAs and switches from the older
> > machines and the latest delivery.
>
> Try running ibv_devinfo on your new nodes, which should show you the
> HCA(s) on your host.  I suspect that it will fail with a similar
> error (but am not 100% sure -- I'm the MPI guy, not the verbs guy :-) ).
>
> If this is the case, then you've got a bigger issue: your IB
> drivers are not loading.  This will need to be fixed before you
> investigate the firmware level on your HCAs across the cluster.
>
> (other IB stack experts feel free to chime in...)
>
> --
> Jeff Squyres
> Cisco Systems
>
>