Hal,<br>
<br>
I added a hack for now to work around the problem. It needs a proper fix later.<br>
<br>
[root@ibstg1 opensm]# svn diff osm_port.h<br>
Index: osm_port.h<br>
===================================================================<br>
--- osm_port.h (revision 3549)<br>
+++ osm_port.h (working copy)<br>
@@ -1049,6 +1049,8 @@<br>
{<br>
CL_ASSERT( p_physp );<br>
CL_ASSERT( osm_physp_is_valid( p_physp ) );<br>
+ if (p_physp->port_info.base_lid == 0xFFFF)<br>
+ return (0);<br>
return( p_physp->port_info.base_lid );<br>
}<br>
/*<br>
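<br>
The check in the diff above only special-cases 0xFFFF. As a rough sketch of what a fuller validation might look like, the helpers below treat anything outside the unicast range 0x0001 to 0xBFFF as unassigned. The names is_valid_unicast_lid and sanitize_base_lid are hypothetical and not part of OpenSM, and the sketch assumes the LID is already in host byte order; if the real field is kept in network byte order it would need converting first.<br>

```c
#include <stdbool.h>
#include <stdint.h>

/* LID ranges as defined by the InfiniBand architecture (host byte order). */
#define LID_UNASSIGNED  0x0000u  /* reserved: no LID assigned */
#define LID_UCAST_START 0x0001u  /* first unicast LID */
#define LID_UCAST_END   0xBFFFu  /* last unicast LID */
#define LID_PERMISSIVE  0xFFFFu  /* permissive LID, never a valid base LID */

/* Hypothetical helper: true if lid is usable as a unicast base LID. */
static bool is_valid_unicast_lid(uint16_t lid)
{
    return lid >= LID_UCAST_START && lid <= LID_UCAST_END;
}

/* Hypothetical helper: map any invalid reported LID to "unassigned" (0),
 * which is what the hack above does for 0xFFFF only. */
static uint16_t sanitize_base_lid(uint16_t reported_lid)
{
    return is_valid_unicast_lid(reported_lid) ? reported_lid : LID_UNASSIGNED;
}
```

With these, sanitize_base_lid(0xFFFF) yields 0, and so does a multicast value such as 0xC000, while a normal LID passes through unchanged.<br>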
<br><br><div><span class="gmail_quote">On 27 Sep 2005 15:11:05 -0400, <b class="gmail_sendername">Hal Rosenstock</b> <<a href="mailto:halr@voltaire.com">halr@voltaire.com</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Tue, 2005-09-27 at 14:13, Viswanath Krishnamurthy wrote:<br>> I tracked down the issue to a bug in osm_lid_mgr.c<br>><br>> function: __osm_lid_mgr_init_sweep(...)<br>><br>> The bad hardware was returning an assigned LID of 0xFFFF. In this
<br>> function there is a loop<br>> as follows where opensm is getting stuck (with line numbers):<br>><br>> 392 p_port_guid_tbl = &p_mgr->p_subn->port_guid_tbl;<br>> 393<br>> 394 for( p_port = (osm_port_t*)cl_qmap_head( p_port_guid_tbl );<br>> 395 p_port != (osm_port_t*)cl_qmap_end( p_port_guid_tbl );<br>> 396 p_port = (osm_port_t*)cl_qmap_next( &p_port->map_item ) )<br>> 397 {<br>> 398 osm_port_get_lid_range_ho(p_port, &disc_min_lid, &disc_max_lid);<br>> 399 for (lid = disc_min_lid; lid <= disc_max_lid; lid++) <===== Bug here<br>> 400 cl_ptr_vector_set(p_discovered_vec, lid, p_port );<br>> 401 }<br>><br>> Since the disc_max_lid and disc_min_lid are 0xFFFF, and these are<br>> unsigned 16 bit numbers, the condition
<br>> in the for loop never becomes false, and opensm is stuck in the loop.<br>> There are a couple of other places in that<br>> function that need fixing too.<br><br>Sep 26 15:26:03 424135 [B66CFBB0] -> SMP dump:
<br> base_ver................0x1<br> mgmt_class..............0x81<br> class_ver...............0x1<br> method..................0x1
(SubnGet)<br> D
bit...................0x0<br> status..................0x0<br> hop_ptr.................0x0<br> hop_count...............0x2<br> trans_id................0x1274
<br> attr_id.................0x15
(PortInfo)<br> resv....................0x0<br> attr_mod................0x1<br> m_key...................0x0000000000000000<br> dr_slid.................0xFFFF
<br> dr_dlid.................0xFFFF<br><br><br>Sep 26 15:26:03 424407 [B6ED0BB0] -> __osm_nd_rcv_process_nd: Node 0x30d300002c7234<br> Description
= Agilent E2954A 4x Generator for InfiniBand.<br>Sep 26 15:26:03 424426 [B6ED0BB0] -> __osm_nd_rcv_process_nd: ]<br><br>Sep 26 15:26:03 679882 [B56CDBB0] -> SMP dump:<br> base_ver................0x1
<br> mgmt_class..............0x81<br> class_ver...............0x1<br> method..................0x81
(SubnGetResp)<br> D
bit...................0x1<br> status..................0x0<br> hop_ptr.................0x0<br> hop_count...............0x2<br> trans_id................0x1274
<br> attr_id.................0x15
(PortInfo)<br> resv....................0x0<br> attr_mod................0x1<br> m_key...................0x0000000000000000<br> dr_slid.................0xFFFF
<br> dr_dlid.................0xFFFF<br><br> Initial
path: [0][1][12]<br> Return
path: [0][E][0]<br><br><br>Sep 26 15:26:03 680291 [B76D1BB0] -> osm_pi_rcv_process: [<br>Sep 26 15:26:03 680323 [B56CDBB0] -> __osm_sm_mad_ctrl_rcv_callback: ]<br>Sep 26 15:26:03 680343 [B76D1BB0] -> PortInfo dump:
<br> port
number.............0x1<br> node_guid...............0x0030d300002c7234<br> port_guid...............0x0030d300002c7234<br> m_key...................0x0000000000000000
<br> subnet_prefix...........0xfe80000000000000<br> base_lid................0xFFFF<br><br>Yes, it appears the Agilent exerciser returned good status to a SM Get
<br>PortInfo with a base_lid of 0xffff. The base_lid should be validated by<br>OpenSM.<br><br>-- Hal<br><br></blockquote></div><br>
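<br>
To illustrate the loop problem described in the quoted message: with a 16-bit unsigned counter, lid <= 0xFFFF is always true, because incrementing 0xFFFF wraps back to 0. One possible way around it, sketched below, is to use a wider counter so the comparison can become false. The function visit_lid_range and the lid_visit_fn callback are hypothetical stand-ins for the loop body and cl_ptr_vector_set(); this is not the actual OpenSM fix.<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type standing in for the work done per LID. */
typedef void (*lid_visit_fn)(uint16_t lid, void *ctx);

/*
 * Visit every LID in the inclusive range [min_lid, max_lid] and return
 * how many were visited. The counter is 32 bits wide, so it can reach
 * 0x10000 and the condition lid <= max_lid becomes false even when
 * max_lid is 0xFFFF. A uint16_t counter would wrap to 0 instead and
 * the loop would never end.
 */
static uint32_t visit_lid_range(uint16_t min_lid, uint16_t max_lid,
                                lid_visit_fn visit, void *ctx)
{
    uint32_t visited = 0;
    uint32_t lid;

    assert(min_lid <= max_lid);

    for (lid = min_lid; lid <= max_lid; lid++) {
        if (visit != NULL)
            visit((uint16_t)lid, ctx);
        visited++;
    }
    return visited;
}
```

Called with both bounds at 0xFFFF, this visits exactly one LID and returns, where the 16-bit version spins forever. Validating the LID before it gets this far is still the better fix; the wider counter only keeps the loop from hanging.<br>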