[ofa-general] Multicast traffic generates Bad P_Key trap in SM when working in partial member setup
hrosenstock at xsigo.com
Thu Jun 12 03:30:26 PDT 2008
On Thu, 2008-06-12 at 09:46 +0300, Olga Shern wrote:
> Hi All,
> We have found something that looks like a hole in the InfiniBand spec.
What's the spec hole ?
> This is a system-level issue that prevents partial P_Key setups from
> going into production.
> Short Setup & test description:
> * Node A: P_Key XXX (full member)
> * Node B, C, D, E, F: P_Key XXx (partial member)
> 1. Send ping from B -> A : ping is OK
> 2. Send ping from C -> A : ping is OK
> 3. Send ping from B -> C : no ping, which is also OK
> * The SM gets Bad P_Key traps from all HCAs in the fabric, both for
> tests 1 & 2 (one time) and for test 3 (all the time).
> Probably the ARP request, which is MC traffic, generates the trap in
> the HCA. For tests 1 & 2 we have only one ARP, but for test 3 we send
> ARPs all the time because we do not get any ARP reply.
> * The trap number the SM gets is 257 (HCA trap); if we enable P_Key
> switch enforcement we will probably get 259.
Is this with OpenSM or VSM ?
> * We also get the trap from the originator of the MC traffic, even
> though the receive switch relay error counter is increased (when out
> port == in port). Does the switch not drop the packet?
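For reference, here is a minimal sketch of the P_Key check as I read it in the IBA spec: the low 15 bits must be equal and at least one side must be a full member (top bit set). It only models which receivers of a multicast packet would fail that check and so could raise a Bad P_Key trap. The base value 0x1234 is made up to stand in for XXX, and the names pkeys_match and mc_receivers_failing are hypothetical, not from any library.

```python
FULL_MEMBER_BIT = 0x8000  # top bit of the 16-bit P_Key: 1 = full, 0 = partial
BASE_MASK = 0x7FFF        # low 15 bits identify the partition

def pkeys_match(pkt_pkey, port_pkey):
    """Return True if a packet's P_Key passes the check against a port's P_Key."""
    if (pkt_pkey & BASE_MASK) != (port_pkey & BASE_MASK):
        return False
    # Two partial members of the same partition do not match each other.
    return bool((pkt_pkey | port_pkey) & FULL_MEMBER_BIT)

BASE = 0x1234  # assumption: placeholder for the real partition number
nodes = {
    "A": BASE | FULL_MEMBER_BIT,  # full member
    "B": BASE, "C": BASE, "D": BASE, "E": BASE, "F": BASE,  # partial members
}

def mc_receivers_failing(sender):
    """Nodes that would reject a multicast packet (e.g. an ARP request) from sender."""
    return sorted(name for name, pkey in nodes.items()
                  if name != sender and not pkeys_match(nodes[sender], pkey))

if __name__ == "__main__":
    for sender in ("A", "B"):
        print(sender, "->", mc_receivers_failing(sender))
```

Under these assumptions, a multicast ARP request from a partial member fails the check at every other partial member, while the same packet from the full member passes everywhere.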
> Additional questions/issues:
> * Do we have a way to suppress port traps from the SMA? I.e., so that
> the port will not generate traps that can "kill the SM". As it looks,
> this is a bug in the spec: we can't send any MC traffic (even ARP)
> when we have partial members, and we do not have a way to suppress the
> * What will happen in the HCA when we get many traps (MC packets
> from many nodes) and it needs to keep all events until the SM
> acknowledges them? Is there a limit on the number of outstanding
> traps (any HCA-specific issues)?
> Best Regards