[ewg] bug 418: was OFED 1.2 beta blocking bugs

Sean Hefty mshefty at ichips.intel.com
Fri Mar 9 09:13:37 PST 2007


> IP does this fragmentation and would need to do this to UD MTU size not
> the RC (CM) size.

Agreed.  I'm not familiar enough with the network stack to know how it handles 
the MTU sizes to different destinations or across different paths.  The IPoIB CM 
code displays a warning when the MTU is set larger than the MTU of the device, 
since this results in dropping large multicast packets.  So at least these 
messages that were reported on the send side make sense:

	ib0: enabling connected mode will cause multicast packet drops
	ib0: mtu > 2044 will cause multicast packet drops.
	ib0: packet len 4100 (> 2064) too long to send, dropping
	ib0: packet len 4100 (> 2064) too long to send, dropping
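
The arithmetic behind these messages can be sketched as below.  This is only an
illustration built from the numbers in the log, not the driver code: the
function name and the idea that the limit is "UD MTU plus a fixed allowance"
are assumptions, inferred from the 20-byte gap between 2044 and 2064.

```python
# Values copied from the log lines above.
UD_MTU = 2044       # "mtu > 2044 will cause multicast packet drops"
SEND_LIMIT = 2064   # "(> 2064) too long to send, dropping"
PACKET_LEN = 4100   # "packet len 4100"

# Assumption: the send limit is the UD MTU plus a fixed per-packet allowance.
ALLOWANCE = SEND_LIMIT - UD_MTU


def would_drop(packet_len, ud_mtu=UD_MTU, allowance=ALLOWANCE):
    """Return True if a UD (multicast) send of this length exceeds the limit."""
    return packet_len > ud_mtu + allowance


if __name__ == "__main__":
    for length in (SEND_LIMIT, SEND_LIMIT + 1, PACKET_LEN):
        verdict = "drop" if would_drop(length) else "send"
        print(f"packet len {length}: {verdict}")
```

Under these assumptions, any multicast packet built against the larger CM MTU
exceeds the limit, which matches the drops reported for the 4100-byte packets.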

The original bug report simply stated that this behavior is undesirable as the 
default, and that IPoIB should either use UD or the CM should be fixed to avoid 
this situation.  (The fix may be a change in some setup rather than a change in 
the code, but I don't know.)

> I would as IPoIB-CM does not support MC and there needs to be a fallback
> to UD. I'm not sure where IPmc gets the device MTU from but it may be
> the wrong one (from CM rather than the normal UD interface). Has anyone
> tracked this down ? I haven't had a chance to look yet.

I'm still studying it, but the behavior on the send side seems to make sense 
after reading the code.  I'm more interested in the errors on the 'receive' 
side, which, at least to me, appear to be a different issue:

	ib0: failed send event (status=1, wrid=35 vend_err 69)
	ib_mthca 0000:08:00.0: modify QP 3->3 returned status 10.
	ib0: failed to modify QP, ret = -22
	ib0: couldn't attach QP to multicast group ff12:401b:ffff:0000:0000:0000:0001:0101
	ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:0001:0101, status -22
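
For reading the `ret = -22` and `status -22` values, a small sketch follows.
It assumes these are negated Linux errno numbers, as is conventional for kernel
return codes; the helper name is hypothetical.  It does not attempt to decode
the device-specific `status 10` or `vend_err 69` values.

```python
import errno
import os


def describe_kernel_ret(ret):
    """Map a negative kernel-style return value to (errno name, description)."""
    code = -ret
    # errno.errorcode maps numeric codes to symbolic names such as "EINVAL".
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    name, text = describe_kernel_ret(-22)
    print(f"-22 -> {name}: {text}")
```

Read this way, -22 is EINVAL, so the QP modify and the multicast join are both
being rejected as invalid requests rather than failing on a timeout or a
resource shortage.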

- Sean




