[openib-general] opensm errors with ehca
Brad Benton
brad.benton at us.ibm.com
Tue Nov 1 14:27:32 PST 2005
Troy Benjegerdes wrote on 10/30/2005 05:55:04 PM:
> The firmware on the IBM eHCA causes opensm to spit out these kinds of
> errors all the time..
>
> Is there a way we can either not send P_KeyTable requests to any eHCA
> guids, or figure out what (if anything) is broken in their firmware?
>
> Is this a spec violation, or just ambiguities in implementation?
...
> Oct 30 17:49:46 053861 [43005960] -> SMP dump:
> base_ver................0x1
> mgmt_class..............0x81
> class_ver...............0x1
> method..................0x1 (SubnGet)
> D bit...................0x0
> status..................0x0
> hop_ptr.................0x0
> hop_count...............0x2
> trans_id................0x158c
> attr_id.................0x16 (P_KeyTable)
> resv....................0x0
> attr_mod................0x260000
Here is what is happening: the attribute modifier for the P_KeyTable
attribute is divided into two 16-bit halves. The most significant 16 bits
carry information that is only valid for switches. The problem is that
this SubnGet is addressed to an HCA. The firmware currently sees that the
upper bits are non-zero (0x0026 in the dump above) and, since it is not a
switch, throws the packet away. The proper response would be for it to
ignore the upper bits and process the MAD. However, this is in firmware
that cannot be changed quickly. So, in the meantime, as a workaround,
would it be possible to have opensm clear the upper 16 bits of the
attribute modifier when making a P_KeyTable request of an HCA?
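
As a rough illustration of the workaround being asked for, the masking
could look something like the sketch below. The function name and the
is_switch flag are hypothetical and are not taken from the opensm source;
the sketch also assumes the attribute modifier is held in host byte order
at the point where it is adjusted.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an actual opensm function.
 * Returns the attribute modifier to use in a P_KeyTable SubnGet.
 * For a switch the value is passed through unchanged; for any other
 * node type (e.g. an HCA) the upper 16 bits, which are only meaningful
 * for switches, are cleared.
 * Assumption: attr_mod is in host byte order here.
 */
static uint32_t pkey_attr_mod_for_node(uint32_t attr_mod, bool is_switch)
{
    if (is_switch)
        return attr_mod;

    /* Keep only the least significant 16 bits. */
    return attr_mod & 0x0000FFFFu;
}
```

With the value from the dump above, pkey_attr_mod_for_node(0x260000, false)
would yield 0x0, which is the modifier the eHCA firmware should accept.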
Thanks,
--Brad