<br>The reason is:<br>Jan 01 01:46:17 321555 [58F3E280] -> osm_vendor_set_sm: ERR 5431: setting IS_SM capability mask failed; errno 2<br><br>From the code, it looks like /dev/infiniband/issm&lt;umad_port&gt; needs to be created, and I did that. But the SM with the higher GUID still seems to become the master whenever it does a sweep. The logs are too detailed to post in full, so I am sending snippets.
<br><br><span style="font-weight: bold;">Local port (with a high GUID)</span><br>Jan 01 02:49:56 332142 [5873E280] -> osm_pi_rcv_process: Discovered port num 0x1 with GUID = 0x2c901097682d1 for parent node GUID = 0x2c901097682d0, TID = 0x1236
<br>Jan 01 02:49:56 332197 [5873E280] -> PortInfo dump:<br> port number.............0x1<br> node_guid...............0x0002c901097682d0<br> port_guid...............0x0002c901097682d1
<br> m_key...................0x0000000000000000<br> subnet_prefix...........0xfe80000000000000<br> <span style="font-weight: bold;">
base_lid................0x1</span><br style="font-weight: bold;"><span style="font-weight: bold;"> master_sm_base_lid......0x2</span><br> capability_mask.........0x2510A68
<br> diag_code...............0x0<br> m_key_lease_period......0x0<br> local_port_num..........0x1<br> link_width_enabled......0x3
<br> link_width_supported....0x3<br> link_width_active.......0x2<br> link_speed_supported....0x1<br> port_state..............ACTIVE
<br> state_info2.............0x52<br> m_key_protect_bits......0x0<br> lmc.....................0x0<br> link_speed..............0x11
<br> mtu_smsl................0x40<br> vl_cap_init_type........0x40<br> vl_high_limit...........0x0<br> vl_arb_high_cap.........0x8
<br> vl_arb_low_cap..........0x8<br> init_rep_mtu_cap........0x4<br> vl_stall_life...........0xFF<br> vl_enforce..............0x40
<br> m_key_violations........0x0<br> p_key_violations........0x0<br> q_key_violations........0x0<br> guid_cap................0x20
<br> client_reregister.......0x0<br> subnet_timeout..........0x12<br> resp_time_value.........0x10<br> error_threshold.........0x88
<br>Jan 01 02:49:56 332337 [5873E280] -> Capabilities Mask:<br> IB_PORT_CAP_HAS_TRAP<br> IB_PORT_CAP_HAS_AUTO_MIG<br> IB_PORT_CAP_HAS_SL_MAP
<br> IB_PORT_CAP_HAS_LED_INFO<br> IB_PORT_CAP_HAS_SYS_IMG_GUID<br> IB_PORT_CAP_HAS_COM_MGT<br> IB_PORT_CAP_HAS_VEND_CLS
<br> IB_PORT_CAP_HAS_CAP_NTC<br> IB_PORT_CAP_HAS_CLIENT_REREG<br><br>Remote Port which hosts the SM:<br>Jan 01 02:49:56 500638 [5AF3E280] -> osm_pi_rcv_process: Discovered port num 0x1 with GUID = 0x2c90109765da1 for parent node GUID = 0x2c90109765da0, TID = 0x123b
<br>Jan 01 02:49:56 500690 [5AF3E280] -> PortInfo dump:<br> port number.............0x1<br> node_guid...............0x0002c90109765da0<br> port_guid...............0x0002c90109765da1
<br> m_key...................0x0000000000000000<br> subnet_prefix...........0xfe80000000000000<br> <span style="font-weight: bold;">
base_lid................0x2</span><br style="font-weight: bold;"><span style="font-weight: bold;"> master_sm_base_lid......0x2</span><br> capability_mask.........0x2510A68
<br> diag_code...............0x0<br> m_key_lease_period......0x0<br> local_port_num..........0x1<br> link_width_enabled......0x3
<br> link_width_supported....0x3<br> link_width_active.......0x2<br> link_speed_supported....0x1<br> port_state..............ACTIVE
<br> state_info2.............0x52<br> m_key_protect_bits......0x0<br> lmc.....................0x0<br> link_speed..............0x11
<br> mtu_smsl................0x40<br> vl_cap_init_type........0x40<br> vl_high_limit...........0x0<br> vl_arb_high_cap.........0x8
<br> vl_arb_low_cap..........0x8<br> init_rep_mtu_cap........0x4<br> vl_stall_life...........0xFF<br> vl_enforce..............0x40
<br> m_key_violations........0x0<br> p_key_violations........0x0<br> q_key_violations........0x0<br> guid_cap................0x20
<br> client_reregister.......0x0<br> subnet_timeout..........0x12<br> resp_time_value.........0x10<br> error_threshold.........0x88
<br>Jan 01 02:49:56 500831 [5AF3E280] -> Capabilities Mask:<br> IB_PORT_CAP_HAS_TRAP<br> IB_PORT_CAP_HAS_AUTO_MIG<br> IB_PORT_CAP_HAS_SL_MAP
<br> IB_PORT_CAP_HAS_LED_INFO<br> IB_PORT_CAP_HAS_SYS_IMG_GUID<br> IB_PORT_CAP_HAS_COM_MGT<br> IB_PORT_CAP_HAS_VEND_CLS
<br> IB_PORT_CAP_HAS_CAP_NTC<br> IB_PORT_CAP_HAS_CLIENT_REREG<br><br>Please let me know if I should look at some specific portion.<br><br>Thanks<br>Ganesh<br><br><br>
<br><div><span class="gmail_quote">On 16 May 2007 21:57:27 -0400, <b class="gmail_sendername">Hal Rosenstock</b> <<a href="mailto:halr@voltaire.com">halr@voltaire.com</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hi again Ganesh,<br><br>On Wed, 2007-05-16 at 21:42, Ganesh Sadasivan wrote:<br>> Hi Hal,<br>><br>> Please see inline.<br>><br>> On 16 May 2007 19:22:00 -0400, Hal Rosenstock <<a href="mailto:halr@voltaire.com">
halr@voltaire.com</a>><br>> wrote:<br>> Hi Ganesh,<br>><br>> On Wed, 2007-05-16 at 19:00, Ganesh Sadasivan wrote:<br>> > Hi,<br>> ><br>> > I have a setup with 2 HCAs connected back to back and am
<br>> running<br>> > opensm (ofed1.1, running at the same priority) on both of<br>> them. Is<br>> > there any utility to see who is the master?<br>><br>> Even with priority differences I am seeing the same
behavior. Am I<br>> missing any option? I am setting "opensm -s 30" and "opensm -s 60" on<br>> the respective sides.<br><br>Why not use the default (10 secs) or at least the same on both sides?<br>
<br>> sminfo will show the SM state for a LID/GUID.<br>><br>><br>> Thanks.<br>><br>> > The smlid in ibv_devinfo, seems to be changing whenever an<br>> SM does a<br>> > sweep. Is this expected?
<br>><br>> Nope. If they are both at the same priority, the lower GUID<br>> should win<br>> the SM election.<br>><br>> Not sure what is going wrong in your (back to back HCA)
<br>> subnet. Do your<br>> ports stay active?<br>><br>><br>> Yes, both ports are active.<br><br>And they stay active (no LED color changes)?<br><br>If not, can you run both OpenSMs in verbose mode (-V) and see if there
<br>is anything interesting/relevant in the logs?<br><br>-- Hal<br><br>> Thanks<br>> Ganesh<br>><br>> -- Hal<br>><br>> > Thanks<br>> > Ganesh<br>> ><br>> >
<br></blockquote></div>
<br>
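<br>For reference, the election rule Hal describes in the quoted thread (at equal priority, the lower GUID should win) can be sketched roughly as below. This is only an illustration, not OpenSM code: the SubnetManager type, the expected_master helper, and the priority value 0 are hypothetical names and placeholders introduced for this sketch. The two port GUIDs are the ones from the PortInfo dumps above.<br>
<pre>
```python
from typing import NamedTuple


class SubnetManager(NamedTuple):
    """Hypothetical record for one SM candidate (not an OpenSM type)."""
    name: str       # label used only in this sketch
    priority: int   # SM priority; the higher value wins
    port_guid: int  # port GUID; the lower value wins when priorities tie


def expected_master(candidates):
    """Return the candidate expected to win the election.

    The key sorts by highest priority first, then by lowest port GUID.
    """
    return min(candidates, key=lambda sm: (-sm.priority, sm.port_guid))


# Port GUIDs copied from the two PortInfo dumps.
# Priority 0 is a placeholder meaning "same priority on both sides".
local = SubnetManager("local", 0, 0x0002c901097682d1)
remote = SubnetManager("remote", 0, 0x0002c90109765da1)

winner = expected_master([local, remote])
print(winner.name, hex(winner.port_guid))
```
</pre>
With equal priorities, this picks the port with the numerically lower GUID, which in the dumps above is the port reporting base_lid 0x2.<br>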