[ofw] OpenSM 3.3.11 and osmtest interaction.

Smith, Stan stan.smith at intel.com
Wed Sep 28 16:26:49 PDT 2011


Hello,
  While porting OpenSM 3.3.11 to Windows, the following multicast (MC) osmtest failure occurred.

'osmtest -f m -M1' kept failing with [ERR 0210] because opensm 3.3.11 rejected the MC group create request when PKey == 0.

Specifically, the ib_pkey_is_invalid() call @ line 1026 in osm_sa_mcmember.c returned TRUE.
It turns out that in opensm p_recvd_mcmember_rec->pkey == 0, because that is the value osmtest set
 [osmt_multicast.c in the call to osmt_init_mc_memory() @ line 1427].

The Windows fix was to set 'mc_req_rec.pkey = IB_DEFAULT_PKEY' prior to calling osmt_send_mcast_request().
The fix needed to be applied in a few places; now all osmtests are passing.
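
To make the failure mode concrete, here is a minimal stand-alone sketch, not the OpenSM source. All names prefixed with sketch_ or SKETCH_ are hypothetical stand-ins. It assumes the check rejects any P_Key whose low 15 bits (the base, with the membership bit masked off) are zero, and that the osmtest initializer leaves pkey at zero. Values are kept in host byte order for simplicity.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; the real definitions live in the OpenSM headers. */
#define SKETCH_PKEY_BASE_MASK 0x7FFFu	/* low 15 bits: P_Key base */
#define SKETCH_DEFAULT_PKEY   0xFFFFu	/* default partition, full membership */

/* Reduced stand-in for the MC member record; only the fields used here. */
struct sketch_mc_rec {
	uint8_t mgid[16];
	uint16_t pkey;
};

/* Assumed rule: a P_Key whose base (low 15 bits) is zero is invalid. */
static int sketch_pkey_is_invalid(uint16_t pkey)
{
	return (pkey & SKETCH_PKEY_BASE_MASK) == 0;
}

/* Zero the whole record (assumed initializer behavior); pkey ends up 0. */
static void sketch_init_rec(struct sketch_mc_rec *rec)
{
	memset(rec, 0, sizeof(*rec));
}

/*
 * Returns 1 if a create request built from a freshly initialized record
 * would pass the P_Key check, 0 otherwise.
 */
static int sketch_create_would_pass(int apply_fix)
{
	struct sketch_mc_rec rec;

	sketch_init_rec(&rec);
	if (apply_fix)
		rec.pkey = SKETCH_DEFAULT_PKEY;	/* set before sending the request */
	return !sketch_pkey_is_invalid(rec.pkey);
}
```

Under these assumptions, sketch_create_would_pass(0) models the failing request and sketch_create_would_pass(1) models the request after the pkey assignment is added.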

Thoughts on the failures?

Sean Hefty ran an OFED for Linux test using the head of the opensm src tree and got the same ERR 0210 failure:

osmtest -f m -M1

Sep 28 15:43:19 736239 [6E1F3700] 0x02 -> osmt_run_mcast_flow: Checking Create given MGID=0 valid Set several options :
                First above min RATE, Second less than max RATE
                Third above min MTU, Fourth less than max MTU
                Fifth exact MTU & RATE feasible, Sixth exact RATE feasible
                Seventh exact MTU feasible (o15.0.1.4)...
Sep 28 15:43:19 737661 [6E1F3700] 0x02 -> osmt_run_mcast_flow: Validating resulting MGID (o15.0.1.5)...
Sep 28 15:43:19 737720 [6E1F3700] 0x02 -> osmt_run_mcast_flow: Checking Create given MGID=0 (o15.0.1.4)...
Sep 28 15:43:19 738032 [6D9F0710] 0x01 -> __osmv_sa_mad_rcv_cb: ERR 5501: Remote error:0x0200
Sep 28 15:43:19 738054 [6D9F0710] 0x01 -> osmtest_query_res_cb: ERR 0003: Error on query (IB_REMOTE_ERROR)
Sep 28 15:43:19 738082 [6E1F3700] 0x01 -> osmt_send_mcast_request: ERR 0224: ib_query failed (IB_REMOTE_ERROR)
Sep 28 15:43:19 738110 [6E1F3700] 0x01 -> osmt_send_mcast_request: Remote error = IB_SA_MAD_STATUS_REQ_INVALID
Sep 28 15:43:19 738134 [6E1F3700] 0x01 -> osmt_run_mcast_flow: ERR 0210: Failed to create MCG for MGID=0 - got IB_REMOTE_ERROR/IB_SA_MAD_STATUS_REQ_INVALID
Sep 28 15:43:19 738162 [6E1F3700] 0x01 -> osmtest_run: ERR 0152: Multicast Flow failed: (IB_REMOTE_ERROR)
OSMTEST: TEST "Multicast" FAIL
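
For reading the log above: the 0x0200 in "Remote error:0x0200" and the IB_SA_MAD_STATUS_REQ_INVALID name printed two lines later refer to the same SA status. The sketch below is a hypothetical decoder, not an osmtest function. It assumes the SA class-specific code sits in the upper byte of the 16-bit status as printed. Only the 0x0200 mapping is confirmed by this log; the other names are recalled from the InfiniBand SA status definitions and should be checked against ib_types.h.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map a 16-bit SA MAD status, as printed in the log,
 * to a symbolic name. Assumes the class-specific code is the upper byte.
 */
static const char *sketch_sa_status_name(uint16_t status)
{
	switch ((status >> 8) & 0xFF) {
	case 0x00:
		return "SUCCESS";
	case 0x01:
		return "IB_SA_MAD_STATUS_NO_RESOURCES";
	case 0x02:
		return "IB_SA_MAD_STATUS_REQ_INVALID";	/* seen in this log */
	case 0x03:
		return "IB_SA_MAD_STATUS_NO_RECORDS";
	case 0x04:
		return "IB_SA_MAD_STATUS_TOO_MANY_RECORDS";
	case 0x05:
		return "IB_SA_MAD_STATUS_INVALID_GID";
	case 0x06:
		return "IB_SA_MAD_STATUS_INSUF_COMPS";
	case 0x07:
		return "IB_SA_MAD_STATUS_DENIED";
	default:
		return "UNKNOWN";
	}
}
```

With this mapping, sketch_sa_status_name(0x0200) yields the REQ_INVALID name, matching the SA rejecting the create request over its contents rather than over a transport problem.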

This is not a patch, only reference points showing what I did to fix the issue on Windows.

--- F:/OSM/opensm-3.3.11/osmtest/osmt_multicast.c	Wed Sep 28 16:16:25 2011
+++ F:/openIB-windows-svn/latest/gen1/trunk/ulp/opensm/userX/osmtest/osmt_multicast.c	Wed Sep 28 14:23:57 2011
@@ -768,8 +768,8 @@
 	    IB_MCR_COMPMASK_RATE_SEL | IB_MCR_COMPMASK_RATE;
 
 	OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, EXPECTING_ERRORS_START "\n");
-	status = osmt_send_mcast_request(p_osmt, 1, &mc_req_rec, comp_mask,
-					 sa_mad);
+
+	status = osmt_send_mcast_request(p_osmt, 1, &mc_req_rec, comp_mask, sa_mad);
 	OSM_LOG(&p_osmt->log, OSM_LOG_ERROR, EXPECTING_ERRORS_END "\n");
 
 	if (((ib_net16_t) (sa_mad->status & IB_SMP_STATUS_MASK)) !=
@@ -1429,6 +1429,7 @@
 	/* no MGID */
 	memset(&mc_req_rec.mgid, 0, sizeof(ib_gid_t));
 	/* Request Join */
+	mc_req_rec.pkey = IB_DEFAULT_PKEY;
 	ib_member_set_join_state(&mc_req_rec, IB_MC_REC_STATE_FULL_MEMBER);
 
 	mc_req_rec.pkt_life = 0 | IB_PATH_SELECTOR_GREATER_THAN << 6;
@@ -1455,6 +1456,7 @@
 	/* o15.0.1.6: */
 	/* - Create a new MCG with valid requested MGID. */
 	osmt_init_mc_query_rec(p_osmt, &mc_req_rec);
+	mc_req_rec.pkey = IB_DEFAULT_PKEY;
 	mc_req_rec.mgid = good_mgid;
 
 	OSM_LOG(&p_osmt->log, OSM_LOG_INFO,
@@ -2221,6 +2223,7 @@
 		"\t\twith unrealistic MTU greater than 4096 (o15.0.1.8)...\n");
 
 	/* First create new mgrp */
+	mc_req_rec.pkey = IB_DEFAULT_PKEY;
 	ib_member_set_join_state(&mc_req_rec, IB_MC_REC_STATE_FULL_MEMBER);
 	mc_req_rec.mtu = IB_MTU_LEN_1024 | IB_PATH_SELECTOR_EXACTLY << 6;
 	memset(&mc_req_rec.mgid, 0, sizeof(ib_gid_t));
@@ -2308,6 +2311,7 @@
 	}
 
 	if (remote_port_guid != 0x0) {
+		mc_req_rec.pkey = IB_DEFAULT_PKEY;
 		ib_member_set_join_state(&mc_req_rec,
 					 IB_MC_REC_STATE_FULL_MEMBER);
 		memset(&mc_req_rec.mgid, 0, sizeof(ib_gid_t));


Thanks,

Stan.