[Users] ibsim updates routing tables?
Albert Chu
chu11 at llnl.gov
Fri Oct 25 11:35:51 PDT 2013
Hi Robert,
> I'm trying to test routing in ibsim, but it doesn't seem to update the
> routing tables in the simulated switches. If I take a link down using
> the clear command in ibsim, I see opensm saying that it is updating
> the routing tables and that it completes, but I can't ibtracert to the
> LID whose path was taken down.
I have a feeling you might be confusing ibtracert's behavior with the
typical behavior of traceroute.
When you disable the link below, you are effectively taking node(s) out
of your fabric. OpenSM will see that the node(s) disappeared and will
re-route the fabric. Those nodes are now eliminated from all of the
routing tables. So when you ibtracert that node, ibtracert effectively
states it can't do a traceroute because the node/route doesn't exist.
This is different from traceroute, which outputs the network hops as far
as it can go, even if the end destination is down.
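A toy sketch of the difference, in Python. This is not ibtracert's or
OpenSM's actual code. The node names, the BFS routing, and the "stale
table" model of traceroute are all assumptions made up for illustration.
The only point is that a re-sweep drops the unreachable node from every
forwarding table, so a table-driven trace has nothing to follow, while a
traceroute-style probe still reports the hops it managed to reach.

```python
from collections import deque

def build_lfts(links, sm_node):
    """Toy stand-in for an SM sweep: route only to nodes reachable from sm_node."""
    adj = {sm_node: set()}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def bfs_parents(start):
        # Breadth-first search; parent[n] is n's next hop toward start.
        parent = {start: None}
        queue = deque([start])
        while queue:
            cur = queue.popleft()
            for nxt in sorted(adj[cur]):
                if nxt not in parent:
                    parent[nxt] = cur
                    queue.append(nxt)
        return parent

    reachable = set(bfs_parents(sm_node))
    lfts = {node: {} for node in reachable}
    for dst in reachable:
        for node, hop in bfs_parents(dst).items():
            if hop is not None:
                lfts[node][dst] = hop
    return lfts

def fabric_trace(lfts, src, dst):
    """ibtracert-like (assumed model): fail outright if either end is unknown."""
    if src not in lfts or dst not in lfts:
        raise LookupError(f"can't resolve route {src} -> {dst}")
    path = [src]
    while path[-1] != dst:
        path.append(lfts[path[-1]][dst])
    return path

def ip_style_trace(stale_lfts, live_links, src, dst):
    """traceroute-like (assumed model): follow old routes until a dead link."""
    live = {frozenset(link) for link in live_links}
    path = [src]
    while path[-1] != dst:
        nxt = stale_lfts[path[-1]].get(dst)
        if nxt is None or frozenset((path[-1], nxt)) not in live:
            break  # report the hops reached so far
        path.append(nxt)
    return path

# Hypothetical five-node fabric: host1 - leaf1 - spine - leaf2 - host2
LINKS = [("host1", "leaf1"), ("leaf1", "spine"),
         ("spine", "leaf2"), ("leaf2", "host2")]
before = build_lfts(LINKS, "host1")
after = build_lfts(LINKS[:-1], "host1")  # leaf2-host2 link taken down, then re-sweep
```

With these tables, fabric_trace(before, "host1", "host2") walks all five
nodes, fabric_trace(after, "host1", "host2") raises LookupError, and
ip_style_trace(before, LINKS[:-1], "host1", "host2") stops at leaf2.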
Al
On Fri, 2013-10-25 at 12:22 -0600, Robert LeBlanc wrote:
> I just realized that in this example I'm shutting down the entire
> switch that the host is connected to instead of the uplink port. If I
> issue 'clear "S-0002c90300684e30" 2', I get the same result. Ports 1
> and 2 are both uplink ports to different leaf IB switches in a fat
> tree scheme.
>
>
>
> Robert LeBlanc
> OIT Infrastructure & Virtualization Engineer
> Brigham Young University
>
>
> On Fri, Oct 25, 2013 at 11:19 AM, Robert LeBlanc
> <robert_leblanc at byu.edu> wrote:
> Here are the details of what I'm doing:
>
>
> In one terminal, I run ibsim:
> root at rleblanc-pc:/home/leblanc/Downloads# ibsim -s ibtopo
> parsing: ibtopo
> ibtopo: parsed 928 lines
> ########################
> Network simulator ready.
> MaxNetNodes = 2048
> MaxNetSwitches = 256
> MaxNetPorts = 13312
> MaxLinearCap = 30720
> MaxMcastCap = 1024
> sim> ibwarn: [2278] process_packet: no one to handle pkt:
> class 0x81, attr 0xff90
> ibwarn: [2278] process_packet: no one to handle pkt: class
> 0x81, attr 0xff90
> ...snip out tons of these messages...
> ibwarn: [2278] process_packet: no one to handle pkt: class
> 0x81, attr 0xff90
> clear "S-0002c90300684e30"
> sim> ibwarn: [2278] process_packet: got trap repress - drop
> ibwarn: [2278] process_packet: got trap repress - drop
> ibwarn: [2278] process_packet: no one to handle pkt: class
> 0x81, attr 0xff90
> ...snip out tons of these messages...
> ibwarn: [2278] process_packet: no one to handle pkt: class
> 0x81, attr 0xff90
> relink "0002c90300684e30"
> # nodeid "0002c90300684e30" (0002c90300684e30) not found
> sim> relink "S-0002c90300684e30"
> sim> ibwarn: [2278] process_packet: got trap repress - drop
> ibwarn: [2278] process_packet: got trap repress - drop
> ibwarn: [2278] process_packet: no one to handle pkt: class
> 0x81, attr 0xff90
> ...snip out tons of these messages...
> ibwarn: [2278] process_packet: no one to handle pkt: class
> 0x81, attr 0xff90
> quit
> Exiting network simulator.
> root at rleblanc-pc:/home/leblanc/Downloads#
>
>
> Then in another terminal I run opensm:
> root at rleblanc-pc:/home/leblanc/Documents/Work/Scripts/ib#
> SIM_HOST="H-0013970201000978" OSM_TMP_DIR=./ OSM_CACHE_DIR=./
> LD_PRELOAD=/usr/lib/umad2sim/libumad2sim.so opensm -e -v
> -f ./osm.log
> -------------------------------------------------
> OpenSM 3.3.15
> Command Line Arguments:
> Creating new log file
> Verbose option -v (log flags = 0x7)
> Log File: ./osm.log
> -------------------------------------------------
> OpenSM 3.3.15
>
>
> Entering DISCOVERING state
>
>
> Using default GUID 0x13970201000979
> Entering MASTER state
>
>
>
>
> =======================================================================================================
> Vendor : Ty : # : Sta : LID : LMC : MTU : LWA : LSA :
> Port GUID : Neighbor Port (Port #)
> Unknown : CA : 01 : ACT : 0003 : 0 : 2048 : 4x : 2.5 :
> f04da29097793001 : 0002c9020042ea60 (12)
> Unknown : CA : 02 : ACT : 0007 : 0 : 2048 : 4x : 2.5 :
> f04da29097793002 : 0002c902004294e0 (12)
> ------------------------------------------------------------------------------------------------------
> Mellanox : SW : 00 : : 0002 : 0 : : : :
> 0002c90300879a00 :
> Mellanox : SW : 01 : ACT : : : 2048 : 4x : 2.5 :
> 0002c90300879a00 : 0002c90200431f90 (08)
> Mellanox : SW : 02 : ACT : : : 2048 : 4x : 2.5 :
> 0002c90300879a00 : 0002c90200431f58 (09)
> Mellanox : SW : 03 : DWN : : : ??? : ??? : Ext :
> 0002c90300879a00 :
> ...snip...
>
>
> Then in a third console I run ibtracert:
> leblanc at rleblanc-pc:~/Documents/Work/Scripts/ib$
> LD_PRELOAD=/usr/lib/umad2sim/libumad2sim.so /usr/sbin/ibtracert -G -n 0x0002c90300ebbb62 0x0013970301001f4c 2> /dev/null
> From {0x0002c90300ebbb60}[2]
> [2] -> {0x0002c90300684e30}[19]
> [2] -> {0x0002c90200431eb8}[10]
> [33] -> {0x001397010a000044}[10]
> [35] -> {0x0013970301001f4c}[1]
> To {0x0013970301001f4b}[1]
> leblanc at rleblanc-pc:~/Documents/Work/Scripts/ib$
> LD_PRELOAD=/usr/lib/umad2sim/libumad2sim.so /usr/sbin/ibtracert -G -n 0x0002c90300ebbb62 0x0013970301001f4c 2> /dev/null
> /usr/sbin/ibtracert: iberror: failed: can't resolve source
> port 0x0002c90300ebbb62
> leblanc at rleblanc-pc:~/Documents/Work/Scripts/ib$
> LD_PRELOAD=/usr/lib/umad2sim/libumad2sim.so /usr/sbin/ibtracert -G -n 0x0002c90300ebbb62 0x0013970301001f4c 2> /dev/null
> From {0x0002c90300ebbb60}[2]
> [2] -> {0x0002c90300684e30}[19]
> [2] -> {0x0002c90200431eb8}[10]
> [33] -> {0x001397010a000044}[10]
> [35] -> {0x0013970301001f4c}[1]
> To {0x0013970301001f4b}[1]
> leblanc at rleblanc-pc:~/Documents/Work/Scripts/ib$
>
>
> I'm attaching our topo file that we are using and the opensm
> logs (you should be able to replicate the problem given this
> information or tell me what I'm doing wrong).
>
>
> Thanks,
>
>
>
> Robert LeBlanc
> OIT Infrastructure & Virtualization Engineer
> Brigham Young University
>
>
>
> On Tue, Oct 22, 2013 at 10:55 PM, Hal Rosenstock
> <hal.rosenstock at gmail.com> wrote:
> ibsim just simulates the network (topology, SMAs, and
> PMAs). OpenSM configured the subnet including the
> routing (LFTs and MFTs) based on the routing
> algorithm. It is possible in a topology that multiple
> routing algorithms yield the same routes. More
> specifics would be needed to comment "deeper"...
>
> -- Hal
>
>
> On Tue, Oct 22, 2013 at 6:38 PM, Robert LeBlanc
> <robert_leblanc at byu.edu> wrote:
>
> I'm trying to test routing in ibsim, but it
> doesn't seem to update the routing tables in
> the simulated switches. If I take a link down
> using the clear command in ibsim, I see opensm
> saying that it is updating the routing tables
> and that it completes, but I can't ibtracert
> to the LID whose path was taken down.
>
>
> Should ibsim and opensm be reconfiguring
> routing in the simulated environment? No
> matter which routing protocol I select in
> opensm, the routes are always the same, even
> having opensm re-LID the entire fabric doesn't
> help. Any help would be appreciated.
>
>
> Output from opensm:
>
>
> ******************************************************************
> ***** LID ASSIGNMENT COMPLETE - STARTING
> SWITCH TABLE CONFIG *****
> ******************************************************************
>
>
>
>
> Oct 22 16:27:20 330198 [8437A700] 0x04 ->
> osm_ucast_mgr_build_lid_matrices: Starting
> switches' Min Hop Table Assignment
> Oct 22 16:27:20 330954 [8437A700] 0x02 ->
> osm_ucast_mgr_process: minhop tables
> configured on all switches
> Oct 22 16:27:20 331191 [8437A700] 0x04 ->
> do_sweep:
>
>
>
>
> ******************************************************************
> **************** SWITCHES CONFIGURED FOR
> UNICAST *****************
> ******************************************************************
>
>
>
>
> Thanks,
>
>
> Robert LeBlanc
> OIT Infrastructure & Virtualization Engineer
> Brigham Young University
>
>
> _______________________________________________
> Users mailing list
> Users at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/users
>
>
>
>
>
>
>
--
Albert Chu
chu11 at llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory