[Users] Troubles with ibsim

Hal Rosenstock hal.rosenstock at gmail.com
Thu Feb 15 12:36:59 PST 2018


Hi Tim,

Attribute ID 0xff17 is in the vendor-specific range for SM attributes and
is not supported by (at least) the upstream ibsim.
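
To illustrate the point above, here is a minimal Python sketch (not taken from the ibsim or OpenSM sources) that scans log lines for attribute IDs and flags those in the vendor-specific range. It assumes the vendor-specific SM attribute range is 0xFF00-0xFFFF, as I understand the IBA specification; the function names and the regular expression are hypothetical and only meant for this example. Note that it works line by line, so an attribute ID that has been wrapped onto the next line by a mail client will not be matched.

```python
import re

# Assumption: the IBA specification reserves SM attribute IDs
# 0xFF00-0xFFFF for vendor-specific use.
VENDOR_SM_ATTR_MIN = 0xFF00
VENDOR_SM_ATTR_MAX = 0xFFFF

# Matches "attr 0xff17", "Attr 0xFF17" and "attribute 0xFF17",
# but not "attr_mod 0x4" (no whitespace directly after "attr").
ATTR_RE = re.compile(r"\battr(?:ibute)?\s+(0x[0-9a-fA-F]+)", re.IGNORECASE)


def is_vendor_specific_sm_attr(attr_id):
    """Return True if attr_id lies in the assumed vendor-specific range."""
    return VENDOR_SM_ATTR_MIN <= attr_id <= VENDOR_SM_ATTR_MAX


def vendor_attrs_in_log(lines):
    """Return the sorted vendor-range attribute IDs mentioned in lines."""
    found = set()
    for line in lines:
        for match in ATTR_RE.finditer(line):
            attr_id = int(match.group(1), 16)
            if is_vendor_specific_sm_attr(attr_id):
                found.add(attr_id)
    return sorted(found)


if __name__ == "__main__":
    # Sample lines copied from the messages quoted in this thread.
    sample = [
        "ibwarn: [32331] process_packet: no one to handle pkt: "
        "class 0x81, attr 0xff17",
        "Method 0x1, Attr 0xFF17, TID 0x123b",
    ]
    print([hex(a) for a in vendor_attrs_in_log(sample)])
```

Running something like this over an opensm log or the ibsim console output should show quickly whether the SM is sending attributes that the simulator is unlikely to handle.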

I think you are using MLNX OpenSM rather than upstream or OFED OpenSM with
the upstream ibsim. I'm not sure if MLNX ibsim supports the additional
vendor specific SM attributes or not.

Can you work with upstream or OFED OpenSM, or only with MLNX OpenSM? If you
can only use MLNX OpenSM, I'll try to find out whether the MLNX OFED ibsim
supports the additional attributes needed to run MLNX OpenSM.

-- Hal


On Thu, Feb 15, 2018 at 11:50 AM, Tim Miller <btmiller at helix.nih.gov> wrote:

> I am attempting to use ibsim to test some possible configuration changes
> in our routing, but I am running into some difficulties. I can get the
> simulator started, but opensm fails to discover the fabric in the simulated
> environment. It discovers the switch to which the host running opensm is
> connected, but it can't discover any further than that. In the opensm log,
> I see:
>
> Feb 14 16:31:50 047307 [AD332700] 0x04 -> ni_rcv_process_new: Discovered new Switch node,
>                                 GUID 0x7cfe900300b49890, TID 0x1239
> Feb 14 16:31:50 047821 [AD533700] 0x04 -> nd_rcv_process_nd: Node 0x7cfe900300b49890
>                                 Description = SwitchIB Mellanox Technologies
> Feb 14 16:31:50 047847 [B5974700] 0x01 -> log_send_error: ERR 5411: DR SMP Send completed with error (IB_TIMEOUT) -- dropping
>                         Method 0x1, Attr 0xFF17, TID 0x123b
> Feb 14 16:31:50 047866 [B5974700] 0x01 -> Received SMP on a 1 hop path: Initial path = 0,1, Return path  = 0,0
> Feb 14 16:31:50 047893 [B5974700] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3113: MAD completed in error (IB_TIMEOUT): SubnGet(GeneralInfo), attr_mod 0x4, TID 0x123b
> Feb 14 16:31:50 047913 [B5974700] 0x04 -> osm_hm_set_by_physp: Remote port of  0x7cfe900300b49890[0] couldn't be found
> Feb 14 16:31:50 047921 [B5974700] 0x01 -> sm_mad_ctrl_send_err_cb: ERR 3120: Timeout while getting attribute 0xFF17 (GeneralInfo); Possible mis-set mkey?
> Feb 14 16:31:50 047927 [B5974700] 0x01 -> sm_mad_ctrl_send_err_cb: Error during initialization: got General Info time out from node 0x7cfe900300b49890
>
> And in the simulator console, I see messages of the form:
>
> ibwarn: [32331] process_packet: no one to handle pkt: class 0x81, attr 0xff17
>
> Looking at the output of the "dump" command from within the console, it
> shows that all ports are in Init/LinkUp, except for the SMA port, which is
> in state Active/LinkUp.
>
> Does anyone have any idea what I might be doing wrong here?
>
> Thanks,
> Tim
>
> --
> Tim Miller
> NIH HPC systems staff
> 301-827-5261
> https://hpc.nih.gov