[openib-general] RMPP Message Format Errors

Eitan Zahavi eitan at mellanox.co.il
Wed Sep 14 08:08:27 PDT 2005


Hi Hal, Sean,

Today I tested what I believe is the trunk Gen2 core with trunk OpenSM, and
I still see some RMPP packet issues. From the osmtest log file it looks like
the calculation of the reassembled packet size is wrong, as there are always
some extra bytes:

Aug 21 14:46:56 [40D87BB0] -> __osmv_sa_mad_rcv_cb: Count = 1 = 200 / 112 (88)
Aug 21 14:46:56 [4017F6C0] -> osmtest_write_all_node_recs: Received 1 records.

Aug 21 14:46:56 [40D87BB0] -> __osmv_sa_mad_rcv_cb: Count = 3 = 200 / 64 (8)
Aug 21 14:46:56 [4017F6C0] -> osmtest_write_all_port_recs: Received 3 records.
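
The arithmetic behind these log lines can be reproduced with a short sketch.
It assumes, from the log format alone, that the three numbers are the payload
length, the record size, and (in parentheses) the leftover bytes. The function
name is hypothetical and is not part of OpenSM or osmtest.

```python
def split_records(payload_len, record_size):
    """Return (whole records, leftover bytes) for a reassembled SA payload.

    Hypothetical helper: the meaning of the inputs is inferred from the
    osmtest log format, not taken from the OpenSM sources.
    """
    return divmod(payload_len, record_size)


# Values taken from the two __osmv_sa_mad_rcv_cb log lines above.
node_count, node_extra = split_records(200, 112)   # node records query
port_count, port_extra = split_records(200, 64)    # port records query

print(node_count, node_extra)
print(port_count, port_extra)
```

A non-zero leftover means the reported length is not a whole multiple of the
record size, which is the "extra bytes" symptom described above.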

I also attach an analyzer file that shows the paylen is wrong.

Is it possible that the core patches that fix the RMPP paylen are not
part of the main trunk?

Thanks

Eitan




Eitan Zahavi wrote:
> Hi Sean, Hal,
> 
> We have started testing RMPP packets with osmtest and opensm (gen2
> version).
> 
> We did not get very far. The first NodeRecord GetTable of all the nodes
> in a "loopback" case has some issues.
> 
> The explanation is below:
> 
> 1.      NodeRecord MAD size is 112 bytes (note the required padding of 4
> bytes at the end of the NodeRec data). 
> 2.      OpenSM log file shows the query should return 2 records, one for
> each end-port. This really happens: 
> 
> 
> 	Aug 21 14:59:49 998104 [40D9DBB0] -> __osm_nr_rcv_create_nr: Looking for NodeRecord with LID: 0x0 GUID:0x0000000000000000
> 	Aug 21 14:59:49 998224 [40D9DBB0] -> __osm_nr_rcv_new_nr: New NodeRecord: node 0x0002c902000017a0
> 	                                port 0x0002c902000017a1, lid 0x1.
> 	Aug 21 14:59:49 998327 [40D9DBB0] -> __osm_nr_rcv_new_nr: New NodeRecord: node 0x0002c902000017a0
> 	                                port 0x0002c902000017a2, lid 0x2.
> 	Aug 21 14:59:49 998395 [40D9DBB0] -> osm_nr_rcv_process: Returning 2 records.
> 
> 3.      On the wire we see the following (see attached gif for more
> details): 
> a.      Two data segments were sent and two ACKs were returned. This is
> OK. 
> b.      The first segment reports PayLen = 440 bytes. According to the
> spec, the first segment may provide PayLen != 0, and when it does, the
> value should equal (class header * number of segments) + data length. In
> our case the data length is 2*112 = 224 bytes and the SA extra header is
> 20 bytes * 2 segments = 40 bytes. This leads to PayLen = 264, not 440.
> The spec defines this in p775-l37.
> So this is a violation of the spec. 
> c.      The last segment (segment 2) provides a PayLen field of 100.
> The expected value for the last segment is the SA extra header plus the
> data left over from the previous segments. Since the first segment
> carries 200 bytes of data, the leftover should have been 112*2 - 200
> = 24 bytes, or 44 bytes with the SA extra header.
> So this is another violation of the spec. 
> d.      The analyzer is confused by the above and reports the result as
> having 3 NodeRecords. 
> e.      <<Gen2 NodeRec GetTable RMPP Format Error.GIF>> 
> 4.      Following that, when we trace the osmtest log file we find
> more issues, probably caused by changes to the vendor layer or to the
> RMPP assembly. It is expected that after assembly the size of the RMPP
> MAD reported to the osm vendor layer will be the RMPP header + SA extra
> header + data size. In our case that is 32 + 20 + 2*112 = 276. 
> 
> 	The log file shows:
> 
> 	Aug 21 14:59:49 [40D87BB0] -> __osmv_sa_mad_rcv_cb: Count = 1 = 200 / 112 (88)
> 	Aug 21 14:59:49 [4017F6C0] -> osmtest_write_all_node_recs: Received 1 records
> 
> 	So this is another problem, probably with the way RMPP results
> are assembled or passed back to the vendor layer.
> 
> Please let me know if you will have time to dig into these problems or
> if I should try and resolve them myself and provide patches. 
> 
> Thanks
> 
> Eitan
> 
> Eitan Zahavi
> 
> Design Technology Director
> 
> Mellanox Technologies LTD
> 
> Tel:+972-4-9097208
> Fax:+972-4-9593245
> 
> P.O. Box 586 Yokneam 20692 ISRAEL
> 
> 
> 
> 

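As a worked check of the arithmetic in points 3b, 3c and 4 of the quoted
message, here is a minimal sketch. The formulas follow the message's reading
of the spec and have not been checked against the spec text itself. The
constant and function names are hypothetical; the sizes (20, 200, 32) are the
figures used in the message.

```python
SA_HEADER = 20     # SA extra header per segment, in bytes (from the message)
SEG_DATA = 200     # record data carried by one full segment (from the message)
RMPP_HEADER = 32   # header size used in point 4 of the message


def expected_sizes(num_records, record_size):
    """Return (segments, first PayLen, last PayLen, reassembled size).

    Hypothetical helper that encodes the expectations stated in the
    quoted message; it is not OpenSM or kernel code.
    """
    data_len = num_records * record_size
    # Ceiling division; an empty response still occupies one segment.
    num_segments = max(1, -(-data_len // SEG_DATA))
    # Point 3b: class header per segment plus the total data length.
    first_paylen = SA_HEADER * num_segments + data_len
    # Point 3c: SA header plus whatever the earlier segments did not carry.
    leftover = data_len - SEG_DATA * (num_segments - 1)
    last_paylen = SA_HEADER + leftover
    # Point 4: size handed to the vendor layer after reassembly.
    reassembled = RMPP_HEADER + SA_HEADER + data_len
    return num_segments, first_paylen, last_paylen, reassembled


# Two NodeRecords of 112 bytes each, as in the quoted message.
print(expected_sizes(2, 112))
```

For the two-record case this gives 2 segments, a first-segment PayLen of 264,
a last-segment PayLen of 44 and a reassembled size of 276, against the 440 and
100 seen on the wire.
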
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gen2 rmpp 14 sep 2005.iba
Type: application/octet-stream
Size: 15449 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20050914/f1064c15/attachment.obj>

