[ewg] bug 418: was OFED 1.2 beta blocking bugs
Sean Hefty
mshefty at ichips.intel.com
Fri Mar 9 09:13:37 PST 2007
> IP does this fragmentation and would need to do this to UD MTU size not
> the RC (CM) size.
Agreed. I'm not familiar enough with the network stack to know how it handles
the MTU sizes to different destinations or across different paths. The IPoIB CM
code displays a warning when the MTU is set larger than the MTU of the device,
since this results in dropping large multicast packets. So at least these
messages that were reported on the send side make sense:
ib0: enabling connected mode will cause multicast packet drops
ib0: mtu > 2044 will cause multicast packet drops.
ib0: packet len 4100 (> 2064) too long to send, dropping
ib0: packet len 4100 (> 2064) too long to send, dropping
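As a rough illustration of the check those last two messages imply, here is a minimal sketch. It is not the driver code: the limits are copied straight from the log lines above, and the helper names (would_drop, format_drop_message) are made up for the example.

```python
# Limits copied from the driver messages quoted above; they are not derived here.
MTU_WARN_THRESHOLD = 2044  # from "mtu > 2044 will cause multicast packet drops"
SEND_LIMIT = 2064          # from "packet len 4100 (> 2064) too long to send"


def would_drop(packet_len: int, limit: int = SEND_LIMIT) -> bool:
    """Hypothetical helper: True if the packet is longer than the send limit."""
    return packet_len > limit


def format_drop_message(ifname: str, packet_len: int, limit: int = SEND_LIMIT) -> str:
    """Rebuild a log line in the same shape as the quoted message."""
    return f"{ifname}: packet len {packet_len} (> {limit}) too long to send, dropping"


if __name__ == "__main__":
    # A 4100-byte packet, as in the report, exceeds the limit and is dropped.
    if would_drop(4100):
        print(format_drop_message("ib0", 4100))
```

With the interface MTU raised for connected mode, any multicast packet longer than the limit takes this path, which would explain why the send-side messages look consistent.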
The original bug report simply stated that this behavior is undesirable as the
default, and that IPoIB should either use UD or the CM should be fixed to avoid
this situation. (The fix may be a change in some setup rather than a change in
the code, but I don't know.)
> I would as IPoIB-CM does not support MC and there needs to be a fallback
> to UD. I'm not sure where IPmc gets the device MTU from but it may be
> the wrong one (from CM rather than the normal UD interface). Has anyone
> tracked this down ? I haven't had a chance to look yet.
I'm still studying it, but the behavior on the send side seems to make sense
after reading the code. I'm more interested in the errors on the 'receive'
side, which, at least to me, appear to be a different issue:
ib0: failed send event (status=1, wrid=35 vend_err 69)
ib_mthca 0000:08:00.0: modify QP 3->3 returned status 10.
ib0: failed to modify QP, ret = -22
ib0: couldn't attach QP to multicast group ff12:401b:ffff:0000:0000:0000:0001:0101
ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0001:0101, status -22
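For reading the "ret = -22" and "status -22" values, a small sketch that maps a negative kernel-style return code to its errno name. It assumes those two values are ordinary negated errno numbers; the "status 10" from ib_mthca and the "vend_err 69" look like device-specific codes and are not interpreted here. The helper name describe_errno is made up for the example.

```python
import errno
import os


def describe_errno(ret: int) -> str:
    """Hypothetical helper: name and description for a (possibly negated) errno."""
    code = -ret if ret < 0 else ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name} ({os.strerror(code)})"


if __name__ == "__main__":
    # -22 corresponds to EINVAL, i.e. an invalid argument.
    print(describe_errno(-22))
```

So both the QP modify and the multicast join are being rejected as invalid arguments, rather than failing for lack of resources or timing out.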
- Sean