[openib-general] Re: RMPP Message Format Errors

Hal Rosenstock halr at voltaire.com
Tue Sep 20 02:31:40 PDT 2005


On Tue, 2005-09-20 at 04:48, Eitan Zahavi wrote:
> Hi Hal,
> 
> Seems like RMPP works !

Yippee :-)

> This is an important milestone for OpenSM as we are now able to test the SM/SA with osmtest.

and also for Solaris.

> There is still some constant 8 bytes remainder in the RMPP number of received records calculation
> (see osmtest -V log file) but this is minor (as no SA record is that small).

It sounds like there is still a calculation slightly off.

I don't see a constant off-by-8 remainder issue. In my configuration
most seem fine, and the only one which is not off by 20 (the SA class
header size) is the following:

Sep 20 05:17:36 292850 [40FFF960] -> osm_vendor_get: Acquired UMAD 0x53cd40, size = 856.
Sep 20 05:17:36 292861 [40FFF960] -> osm_vendor_get: ]
Sep 20 05:17:36 292870 [40FFF960] -> osm_mad_pool_get: Acquired p_madw = 0x536190, p_mad = 0x53cd78, size = 856. 
Sep 20 05:17:36 292880 [40FFF960] -> osm_mad_pool_get: ]
Sep 20 05:17:36 292889 [40FFF960] -> __osmv_sa_mad_rcv_cb: [
Sep 20 05:17:36 292899 [40FFF960] -> __osmv_sa_mad_rcv_cb: Count = 7 = 800 / 112 (16)
Sep 20 05:17:36 292909 [40FFF960] -> osmtest_query_res_cb: [
Sep 20 05:17:36 292918 [40FFF960] -> osmtest_query_res_cb: ]
Sep 20 05:17:36 292932 [40FFF960] -> __osmv_sa_mad_rcv_cb: ]
Sep 20 05:17:36 292938 [AB001140] -> __osmv_send_sa_req: ]
Sep 20 05:17:36 292971 [AB001140] -> osmv_query_sa: ]
Sep 20 05:17:36 292980 [AB001140] -> osmtest_get_all_recs: ]
Sep 20 05:17:36 292989 [AB001140] -> osmtest_validate_all_node_recs: Received 7 records.

Is this what you are referring to?
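
For reference, the arithmetic behind the "Count = 7 = 800 / 112 (16)"
line can be sketched as below. This is only an illustration, not the
osm_vendor_ibumad code: the helper names are hypothetical, and the
24-byte MAD header and 12-byte RMPP header are my assumption about how
the 856-byte buffer reduces to an 800-byte payload (only the 20-byte SA
class header is stated above).

```c
#include <assert.h>
#include <stddef.h>

/* Sizes in bytes. SA_HDR_SIZE is the 20 quoted in this mail; the other
 * two are assumed from the IBA MAD layout and are not taken from the log. */
enum {
    MAD_HDR_SIZE  = 24,  /* common MAD header (assumed) */
    RMPP_HDR_SIZE = 12,  /* RMPP header (assumed) */
    SA_HDR_SIZE   = 20   /* SA class header */
};

struct rec_count {
    size_t count;      /* whole records that fit in the payload */
    size_t remainder;  /* bytes left over after the last whole record */
};

/* Hypothetical helper: payload left after stripping the three headers.
 * Returns 0 if the buffer is too small to hold them. */
static size_t sa_payload_len(size_t mad_size)
{
    const size_t hdrs = MAD_HDR_SIZE + RMPP_HDR_SIZE + SA_HDR_SIZE;
    return mad_size > hdrs ? mad_size - hdrs : 0;
}

/* Hypothetical helper: split a payload into whole records plus leftover
 * bytes. A record size of 0 yields no records and leaves the whole
 * payload as remainder. */
static struct rec_count count_sa_records(size_t payload_len, size_t record_size)
{
    struct rec_count rc = { 0, payload_len };

    if (record_size == 0)
        return rc;

    rc.count = payload_len / record_size;
    rc.remainder = payload_len % record_size;

    /* Invariant: records plus leftover account for every payload byte. */
    assert(rc.count * record_size + rc.remainder == payload_len);
    return rc;
}
```

With the values from the log, sa_payload_len(856) gives 800 and
count_sa_records(800, 112) gives 7 records with 16 bytes left over,
which matches the "(16)" printed in the Count line.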

I do also see:
Sep 20 05:16:40 995667 [AB001140] -> osmt_get_service_by_name: ERR 0370: ib_query failed (IB_REMOTE_ERROR).
Sep 20 05:16:40 995673 [AB001140] -> osmt_get_service_by_name: Remote error = IB_SA_MAD_STATUS_NO_RECORDS.
Sep 20 05:16:40 995678 [AB001140] -> osmt_get_service_by_name: Expected num of records is : 1, Found number of records : 0

and some timeouts:
Sep 20 05:17:40 644730 [40FFF960] -> umad_receiver: ERR 5409: send completed with error (method=1 attr=12) -- dropping.
Sep 20 05:17:40 644740 [40FFF960] -> umad_receiver: ERR 5410: class 0x3 LID 0x0
Sep 20 05:17:40 644750 [40FFF960] -> __osmv_sa_mad_err_cb: [
Sep 20 05:17:40 644760 [40FFF960] -> osmtest_query_res_cb: [
Sep 20 05:17:40 644769 [40FFF960] -> osmtest_query_res_cb: ERR 0003: Error on query (IB_TIMEOUT).
Sep 20 05:17:40 644787 [40FFF960] -> osmtest_query_res_cb: ]
Sep 20 05:17:40 644801 [40FFF960] -> __osmv_sa_mad_err_cb: ]
which then resulted in:
Sep 20 05:17:40 644955 [AB001140] -> osmtest_wrong_sm_key_ignored: ERR 0011: Did not get a timeout but got (IB_SUCCESS).

> Thanks for your continuous support.
> 
> Eitan
> 
> Hal Rosenstock wrote:
> > Hi Eitan,
> > 
> > The send side RMPP changes for the truncation of the last SA
> > record have now stabilized. With the latest user_mad.c and
> > osm_vendor_ibumad.c changes which are in the OpenIB svn tree (svn
> > revision 3485), this is ready to be verified again. It's safe to come out
> > now :-)
> > 
> > -- Hal
> > 
> 



