[ofa-general] Re: [ewg] Update from September OpenFabrics Interoperability Event at UNH-IOL

Or Gerlitz ogerlitz at voltaire.com
Wed Oct 22 01:43:43 PDT 2008


Bob Noseworthy wrote:
> A bug for the observed IPoIB issue was logged last Friday,  and 
> updated yesterday confirming that RC3 still demonstrates the issue. 
> This is logged as https://bugs.openfabrics.org/show_bug.cgi?id=1287
>
> Further issues/observations from the recent OFA Interoperability Logo 
> Group's September Interoperability Event are at the end of this 
> email.  Summary of reported IPoIB issue:  If IPoIB datagram mode is 
> enabled,  and IP frames of 8K or larger are sent,  and no ARP entry 
> exists for the destination,  then the first IP frame is always lost 
> (ping used),  no matter what the timeout is set to (as high as 15s)
Looking at the code, the issue you report seems to be related to the 
length of the internal queue ipoib uses to hold skbs whose neighbour 
does not yet have an IB Address Handle (the L2 info needed for xmit) 
associated with it:
> drivers/infiniband/ulp/ipoib/ipoib.h:	IPOIB_MAX_PATH_REC_QUEUE  = 3,
> drivers/infiniband/ulp/ipoib/ipoib_main.c: if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE)
> drivers/infiniband/ulp/ipoib/ipoib_main.c: skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
> drivers/infiniband/ulp/ipoib/ipoib_main.c: if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE)
The current code keeps up to three skbs and then drops all the ones 
that follow until a reply to the driver's path query is received from 
the SA. Unless I am missing something, this code has been there since 
day one (Q4/2005). Are you claiming that this issue was not observed 
with older code drops? I am cc-ing Roland, the maintainer of the 
driver, so you can check things with him.
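
To make the arithmetic behind this reading concrete, here is a rough 
user-space model in Python (not kernel code). Only the queue limit of 3 
comes from ipoib.h; the 2044-byte datagram-mode MTU, the 20-byte IP 
header, the 8-byte ICMP header, and all function names are assumptions 
made for illustration. It also assumes every fragment reaches the 
driver before the SA answers the path query.

```python
# Illustrative model only: how many IP fragments of one ping survive if
# the driver queues at most IPOIB_MAX_PATH_REC_QUEUE skbs while the path
# record is still unresolved. All names below are hypothetical.

IPOIB_MAX_PATH_REC_QUEUE = 3  # value quoted from ipoib.h above


def fragment_count(icmp_payload, mtu=2044, ip_header=20, icmp_header=8):
    """Number of IP fragments for one ICMP echo of the given payload size.

    mtu=2044 is an assumed IPoIB datagram-mode MTU; fragment payloads
    (other than the last) must be a multiple of 8 bytes.
    """
    per_fragment = (mtu - ip_header) // 8 * 8
    total = icmp_payload + icmp_header
    return -(-total // per_fragment)  # integer ceiling division


def queue_until_path_resolved(n_fragments, limit=IPOIB_MAX_PATH_REC_QUEUE):
    """Return (queued, dropped), assuming all fragments arrive before
    the SA replies to the path query."""
    queued = min(n_fragments, limit)
    return queued, n_fragments - queued


def first_datagram_survives(icmp_payload):
    """A fragmented datagram can only be reassembled if no fragment is lost."""
    _, dropped = queue_until_path_resolved(fragment_count(icmp_payload))
    return dropped == 0


if __name__ == "__main__":
    for size in (1000, 4096, 8192):
        frags = fragment_count(size)
        queued, dropped = queue_until_path_resolved(frags)
        print(size, frags, queued, dropped, first_datagram_survives(size))
```

Under these assumed numbers an 8192-byte ping splits into five 
fragments, two of which are dropped, so the receiver can never 
reassemble the first datagram regardless of the ping timeout. A 
4096-byte ping splits into three fragments and just fits, which would 
be consistent with the "8K or larger" threshold in the report.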

> The following is a short summary of various updates from the September 
> OpenFabrics Interoperability Event.  Due to confidentiality reasons, 
> many details are occluded.  Per the request of the IWG on Oct 14, this 
> information is being shared with the EWG.
>
> Testing is ongoing with RC3 and future 1.4RCs on a best effort basis 
> until the GA, at which time the Logo Event will be held for those 
> participating.  If you have additional questions about these 
> comments,  the Interoperability Events, Logo Events,  or the OFA 
> Interoperability Test Plan, please feel free to contact us here at UNH-IOL
May I ask whose decision it was to test the Linux kernel RDMA stack in 
its "ofed" flavor, and what the reasoning behind it was? The mainline 
kernel IB/iWARP code is well maintained and has an associated small 
supporting developer community. The ofed kernel bits contain code which 
has not yet been accepted into the upstream kernel, so you are actually 
testing not the product delivered by the ofa maintainers but rather a 
different creature. Are you aware of that?


Or.



