[ofa-general] Multicast traffic generates Bad P_Key trap in SM when working in partial member setup

Olga Shern olgas at voltaire.com
Wed Jun 11 23:46:18 PDT 2008


Hi All,

 

We have found something that looks like a hole in the InfiniBand spec.

It is a system-level issue that prevents partial P_Key setups from
going into production.


Short Setup & test description:
------------------------------------------
* Node A: P_Key XXX (full member)
* Node B, C, D, E, F: P_Key XXx (partial member)

1. Send ping from B -> A : ping is OK
2. Send ping from C -> A : ping is OK
3. Send ping from B -> C : no ping, which is also OK (expected)
* We get Bad P_Key traps in the SM from all HCAs in the fabric: once
for each of tests 1 & 2, and continuously for test 3.

Probably the ARP request, which is MC traffic, generates the trap in the
HCA. For tests 1 & 2 we send only one ARP request, but for test 3 we keep
sending ARP requests all the time because we never get an ARP reply.
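
The three ping results and the traps are consistent with the P_Key check as it is usually described: two P_Keys match only if their low 15 bits are equal and at least one of them has the membership bit (the top bit) set. The sketch below is a simplified illustration of that rule, not HCA code. The partition base 0x1234 and all function names are hypothetical, chosen only for the example.

```python
FULL_MEMBER_BIT = 0x8000  # top bit of the 16-bit P_Key: 1 = full, 0 = partial
BASE_MASK = 0x7FFF        # low 15 bits identify the partition


def is_full_member(pkey: int) -> bool:
    """True if the P_Key has the full-membership bit set."""
    return bool(pkey & FULL_MEMBER_BIT)


def pkeys_match(packet_pkey: int, local_pkey: int) -> bool:
    """Simplified P_Key check: same base, and at least one side is full."""
    same_base = (packet_pkey & BASE_MASK) == (local_pkey & BASE_MASK)
    return same_base and (is_full_member(packet_pkey) or is_full_member(local_pkey))


BASE = 0x1234  # hypothetical partition base, stands in for "XXX"

# Node A is a full member, nodes B..F are partial members (setup above).
node_pkeys = {
    "A": BASE | FULL_MEMBER_BIT,
    "B": BASE,
    "C": BASE,
    "D": BASE,
    "E": BASE,
    "F": BASE,
}


def receivers_rejecting(sender: str, nodes: dict) -> list:
    """Nodes (other than the sender) that would fail the P_Key check
    on a multicast packet from `sender`, e.g. an ARP request."""
    sent = nodes[sender]
    return sorted(
        name for name, local in nodes.items()
        if name != sender and not pkeys_match(sent, local)
    )


if __name__ == "__main__":
    print("ARP from B rejected by:", receivers_rejecting("B", node_pkeys))
    print("ARP from A rejected by:", receivers_rejecting("A", node_pkeys))
```

Under these assumptions, a multicast ARP request from B passes the check only on A and fails on every other partial member, and each failure is a candidate for a Bad P_Key trap. If the packet is also delivered back to its originator, B would fail the check against its own partial P_Key as well.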

* The trap number the SM gets is 257 (HCA trap). If we enable P_Key
enforcement on the switch we will probably get 259 instead.
* We also get the trap from the originator of the MC traffic, even
though the switch relay error counter on receive is increased (when out
port == in port). Does the switch not drop the packet?

Additional questions/issues:
* Do we have a way to suppress port traps from the SMA, i.e. so that
the port will not generate traps that can "kill the SM"? As it looks,
this is a bug in the spec: we can't send any MC traffic (even ARP)
when we have partial members, and we do not have a way to suppress the
traps.


* What will happen in the HCA when it generates many traps (MC packets
from many nodes) and needs to keep all the events until the SM
acknowledges them? Is there a limit on the number of outstanding
traps (any HCA-specific issues)?

Best Regards 

Olga
