[openib-general] RMPP Message Format Errors
Eitan Zahavi
eitan at mellanox.co.il
Thu Sep 15 01:01:47 PDT 2005
Hal Rosenstock wrote:
>
>
> No. The patches are part of this. It would depend on what OpenIB svn
> version you are running with but if it is a recent pull then they are
> all there.
OK, I got the kernel restarted now. From the analyzer dump I can see that the paylen of the
intermediate segments is 0, so I guess I'm up to date. But osmtest still produces an inventory
file that is missing some of the records being sent.
Now let's go back to the test:
I use a machine connected through a single switch (IS3) to itself.
I use osmtest -f c to get the Nodes, Ports and PathRecords from the SM.
From the OpenSM log file I see:
Sep 15 09:47:37 531029 [8003] -> osm_nr_rcv_process: Returning 3 records.
Sep 15 09:47:37 538586 [C004] -> osm_pir_rcv_process: Returning 27 records.
So we can conclude the following RMPP transactions should be sent:
1. NodeRec:
attrOffset is 14 (in 8-byte units), so each record with padding is 112 bytes.
The RMPP transfer of 336 = 3 * 112 bytes of data should require 2 segments = ceiling(336/200).
First segment paylen should be 336 + 2 * 20 = 376.
Last segment paylen should be 336 - 200 + 20 = 156.
2. PortInfoRecords:
attrOffset is 8 (in 8-byte units), so each record with padding is 64 bytes.
The RMPP transfer of 1728 = 27 * 64 bytes of data should require 9 segments = ceiling(1728/200).
First segment paylen should be 1728 + 9 * 20 = 1908.
Last segment paylen should be 1728 - 8 * 200 + 20 = 148.
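
The arithmetic above can be written down as a small C sketch. It only encodes the numbers used in this mail: 200 bytes of data per segment and a 20-byte SA header counted once per segment in paylen. The macro and function names are made up for illustration and are not taken from the OpenIB or OpenSM sources.

```c
#include <assert.h>

/* Assumed constants, taken from the numbers used in this mail:
 * each segment carries 200 bytes of data, and paylen counts a
 * 20-byte SA header for every segment. */
#define SEG_DATA_LEN 200u
#define SA_HDR_LEN    20u

/* Number of segments for data_len bytes: ceiling(data_len / 200). */
static unsigned rmpp_num_segments(unsigned data_len)
{
    assert(data_len > 0);
    return (data_len + SEG_DATA_LEN - 1) / SEG_DATA_LEN;
}

/* Expected paylen of the first segment:
 * total data plus one SA header per segment. */
static unsigned rmpp_first_paylen(unsigned data_len)
{
    return data_len + rmpp_num_segments(data_len) * SA_HDR_LEN;
}

/* Expected paylen of the last segment:
 * data left after all full segments, plus one SA header. */
static unsigned rmpp_last_paylen(unsigned data_len)
{
    unsigned full_segments = rmpp_num_segments(data_len) - 1;
    return data_len - full_segments * SEG_DATA_LEN + SA_HDR_LEN;
}
```

Calling these with 336 (3 NodeRecords of 112 bytes) and with 1728 (27 PortInfoRecords of 64 bytes) reproduces the expected values listed above.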
What we see in the attached analyzer capture:
NodeInfoRec
Attr            Expected   Measured
Num Segments    2          2
First Paylen    376        376
Last Paylen     156        156

PortInfoRec
Attr            Expected   Measured
Num Segments    9          9
First Paylen    1908       1908
Last Paylen     148        148
So the response on the wire is 100% OK. Thanks Sean.
Now I go to the SA client section:
From osmtest log I see:
NodeInfoRec:
Aug 21 14:46:56 [4017F6C0] -> __osmv_send_sa_req: Waiting for async event.
Aug 21 14:46:56 [40D87BB0] -> osm_mad_pool_get: [
Aug 21 14:46:56 [40D87BB0] -> osm_vendor_get: [
Aug 21 14:46:56 [40D87BB0] -> osm_vendor_get: Acquiring UMAD for p_madw = 0x807b8a4, size = 256.
Aug 21 14:46:56 [40D87BB0] -> osm_vendor_get: Acquired UMAD 0x807c198, size = 256.
Aug 21 14:46:56 [40D87BB0] -> osm_vendor_get: ]
Aug 21 14:46:56 [40D87BB0] -> osm_mad_pool_get: Acquired p_madw = 0x807b898, p_mad = 0x807c1d0, size = 256.
Aug 21 14:46:56 [40D87BB0] -> osm_mad_pool_get: ]
Aug 21 14:46:56 [40D87BB0] -> __osmv_sa_mad_rcv_cb: [
Aug 21 14:46:56 [40D87BB0] -> __osmv_sa_mad_rcv_cb: Count = 1 = 200 / 112 (88)
Aug 21 14:46:56 [40D87BB0] -> osmtest_query_res_cb: [
Aug 21 14:46:56 [40D87BB0] -> osmtest_query_res_cb: ]
Aug 21 14:46:56 [40D87BB0] -> __osmv_sa_mad_rcv_cb: ]
I wonder why the received MAD is only 256 bytes. I expected headers + data = 56 + 336 = 392 bytes.
So my conclusion is that, for some reason, either the response MAD is not re-assembled correctly or the
communication between the reassembly code and the umad layer is broken.
Or maybe I am missing some patches.
I see that in osm_vendor_ibumad.c the receive flow allocates a MAD using:
    p_osm_madw = osm_mad_pool_get_wrapper(p_mad_bind_info->p_mad_pool,
                                          p_mad_bind_info,
                                          MAD_BLOCK_SIZE,
                                          (ib_mad_t*)&pRecvMad->IBMad,
                                          &osm_mad_addr);
I suspect the allocation should use the size of the received MAD instead.
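
To make the size mismatch concrete, here is a minimal C sketch of the two sizes involved. It assumes only the figures quoted in this mail (56 bytes of headers, 112-byte NodeRecords, a 256-byte buffer). The helper names are hypothetical and this is not the actual vendor layer code.

```c
#include <assert.h>

/* Assumed constant from this mail: 56 bytes of headers precede the
 * record data in the MAD handed to the SA client. */
#define SA_MAD_HDR_LEN 56u

/* Size a fully reassembled response should have: headers + all data. */
static unsigned reassembled_mad_size(unsigned data_len)
{
    return SA_MAD_HDR_LEN + data_len;
}

/* Number of whole records that fit in a buffer of buf_size bytes,
 * once the headers are subtracted. */
static unsigned records_in_buffer(unsigned buf_size, unsigned rec_size)
{
    assert(rec_size > 0);
    assert(buf_size >= SA_MAD_HDR_LEN);
    return (buf_size - SA_MAD_HDR_LEN) / rec_size;
}
```

With a 256-byte buffer only (256 - 56) / 112 = 1 record is visible, which matches the "Count = 1 = 200 / 112 (88)" line in the osmtest log, while the full 392-byte response would hold all 3 records.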
Thanks
Eitan