Err... I don't seem to have any of the typical IB diagnostic tools. All they had running in the first place [i.e., I didn't originally install this machine] were the IB modules that ship with the 2.6.9-42 Red Hat kernel.
<br><br>I'm debating whether I should install the Cisco IB Roll to make sure I have all the necessary tools in place. <br><br><div><span class="gmail_quote">On 8/22/07, <b class="gmail_sendername">Jeff Squyres</b>
&lt;<a href="mailto:jsquyres@cisco.com">jsquyres@cisco.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">On Aug 22, 2007, at 5:32 PM, John Leidel wrote:
<br><br>> Yeah, I'm sure the OFED release is the same. I'm running ROCKS<br>> 4.2.1, so all the node images are identical regarding the package<br>> selection. There could possibly be [well, probably] a difference
<br>> in the firmware releases of the HCAs and switches from the older<br>> machines and the latest delivery.<br><br>Try running ibv_devinfo on your new nodes, which should show you the<br>HCA(s) on your host. I suspect that it will fail with a similar
<br>error (but am not 100% sure -- I'm the MPI guy, not the verbs guy :-) ).<br><br>If this is the case, then you've got a bigger issue that your IB<br>drivers are not loading. This will need to be fixed before you
<br>investigate the firmware level on your HCAs across the cluster.<br><br>(other IB stack experts feel free to chime in...)<br><br>--<br>Jeff Squyres<br>Cisco Systems<br><br></blockquote></div><br>
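For a node that has no IB userspace tools installed, a rough first check is whether any IB kernel modules are loaded at all and whether ibv_devinfo is even on the PATH. The sketch below is only an illustration: the IB_HINTS list and the helper names are my own assumptions, not an official module list, and it only reads /proc/modules rather than talking to the HCA.<br>

```python
# Sketch only: IB_HINTS is an illustrative, assumed list of module names,
# not an authoritative inventory of the IB stack on any given kernel.
import shutil

IB_HINTS = ("ib_core", "ib_mthca", "ib_ipoib", "ib_uverbs", "mlx4_ib", "rdma_cm")


def loaded_ib_modules(proc_modules_text):
    """Return names of loaded modules that look InfiniBand-related.

    Expects text in the /proc/modules format, where the module name is the
    first whitespace-separated field on each line.
    """
    names = [line.split()[0] for line in proc_modules_text.splitlines() if line.strip()]
    return [n for n in names if n in IB_HINTS or n.startswith("ib_")]


def report(path="/proc/modules"):
    """Print which IB-looking modules are loaded and whether ibv_devinfo exists."""
    try:
        with open(path) as fh:
            mods = loaded_ib_modules(fh.read())
    except OSError:
        # No readable /proc/modules (for example, not a Linux host).
        mods = []
    print("IB-looking modules loaded:", ", ".join(mods) or "none")
    print("ibv_devinfo on PATH:", shutil.which("ibv_devinfo") is not None)


if __name__ == "__main__":
    report()
```

If no modules show up, that points at the driver-loading problem Jeff describes. If modules are loaded but ibv_devinfo is missing, it is probably just the userspace tools that need installing.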