[openib-general] opensm errors with ehca

Hal Rosenstock halr at voltaire.com
Tue Nov 1 14:44:43 PST 2005


Hi Brad,

On Tue, 2005-11-01 at 17:27, Brad Benton wrote:
> 
> Troy Benjegerdes wrote on 10/30/2005 05:55:04 PM:
> 
> > The firmware on the IBM eHCA causes opensm to spit out these kinds of
> > errors all the time..
> > 
> > Is there a way we can either not send P_KeyTable requests to any eHCA
> > guids, or figure out what (if anything) is broken in their firmware?
> > 
> > Is this a spec violation, or just ambiguities in implementation?
> ...
> > Oct 30 17:49:46 053861 [43005960] -> SMP dump:
> >                                 base_ver................0x1
> >                                 mgmt_class..............0x81
> >                                 class_ver...............0x1
> >                                 method..................0x1 (SubnGet)
> >                                 D bit...................0x0
> >                                 status..................0x0
> >                                 hop_ptr.................0x0
> >                                 hop_count...............0x2
> >                                 trans_id................0x158c
> >                                 attr_id.................0x16 (P_KeyTable)
> >                                 resv....................0x0
> >                                 attr_mod................0x260000
> 
> Here is what is happening: The attribute modifier for the P_KeyTable
> attribute is divided into two 16-bit halves. The most significant 16
> bits carry information that is only valid for switches. The problem
> here is that this SubnGet is for an HCA. The firmware currently sees
> that the upper bits are non-zero and, since it is not a switch, throws
> the packet away. The proper response would be for it to ignore the
> upper bits and process the MAD. However, this is in firmware that
> cannot be changed quickly. So, in the meantime, as a workaround, would
> it be possible to have opensm clear the upper 16 bits of the attribute
> modifier when making a P_KeyTable request of an HCA?

I thought the IBM eHCA identified itself as both a switch and some
number of HCAs behind it. Are you sure this is a SubnSet P_KeyTable to
an HCA port? If so, I will look at this and fix it so that, even though
this field should be ignored for HCA and router ports, it will be set
to 0.
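
To make the proposed change concrete, here is a rough sketch of the
idea, not the actual opensm code. It assumes the attribute modifier is
handled in host byte order, that the lower 16 bits select the P_Key
block, and that the upper 16 bits carry the port number for switches
only. The enum and all function names are hypothetical, made up for
this illustration.

```c
#include <stdint.h>

/* Hypothetical node-type enum for this sketch (not an opensm type). */
enum node_type { NODE_CA = 1, NODE_SWITCH = 2, NODE_ROUTER = 3 };

/* Upper 16 bits of the attribute modifier (meaningful for switches only). */
static uint16_t attr_mod_upper(uint32_t attr_mod)
{
    return (uint16_t)(attr_mod >> 16);
}

/* Lower 16 bits of the attribute modifier (assumed: P_Key block index). */
static uint16_t attr_mod_lower(uint32_t attr_mod)
{
    return (uint16_t)(attr_mod & 0xFFFFu);
}

/*
 * Build a P_KeyTable attribute modifier in host byte order.
 * The port number goes into the upper half only when the target is a
 * switch; for CA and router ports the upper half is forced to 0.
 */
static uint32_t pkey_attr_mod(enum node_type type, uint16_t port_num,
                              uint16_t block)
{
    uint32_t upper = 0;

    if (type == NODE_SWITCH)
        upper = (uint32_t)port_num << 16;

    return upper | block;
}
```

With the value from the dump above, attr_mod_upper(0x260000) is 0x26
and attr_mod_lower(0x260000) is 0. Under this sketch,
pkey_attr_mod(NODE_CA, 0x26, 0) would yield 0 instead of 0x260000,
while the same call with NODE_SWITCH keeps 0x260000.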

Troy, is there more of this log that can be sent?

-- Hal

> Thanks,
> --Brad