[Users] Subnet question

Robert LeBlanc robert_leblanc at byu.edu
Tue Oct 8 10:38:29 PDT 2013


Kevin,

Thanks for the input, I'll look into it. Does scatter_ports work with
MinHop, or do I need to use UpDown? The only place I've seen Scatter Ports
mentioned is
http://www2.cisl.ucar.edu/sites/default/files/Mizero,%20F_SIParCS2013.pdf

Thanks,


Robert LeBlanc
OIT Infrastructure & Virtualization Engineer
Brigham Young University


On Tue, Oct 8, 2013 at 11:35 AM, Kevin Harms <harms at alcf.anl.gov> wrote:

>
>   This is not a full solution to your problem, but you can add or set
>   scatter_ports 37
> in your opensm.conf. This generally improves the distribution of paths
> over the available ports. It does not guarantee that all routes to N
> avoid a single switch, but it makes that much less likely. The value 37
> is the initial seed for the algorithm. In our case, we did this to
> improve performance.
>
> kevin
>
> On Oct 8, 2013, at 12:04 PM, Robert LeBlanc <robert_leblanc at byu.edu>
> wrote:
>
> > We have been running Oracle OVN (previously Xsigo) in our data center
> > environment for two years now. This last week we upgraded the firmware on
> > our Mellanox IS5030 switches to see if that would help resolve some
> > communication issues for Oracle PVI and IPoIB. Oracle OVN creates
> > virtual NICs and virtual HBAs, encapsulates the traffic, and sends it over
> > the InfiniBand fabric to the directors, where the data is decapsulated
> > and sent out on the traditional Ethernet and Fibre Channel networks.
> >
> > During the upgrade, all four of our vHBAs happened to be routed through
> > the same IS5030 switch, so all of the storage for some of our ESX hosts
> > disappeared when that switch was rebooted. This caused an APD (All Paths
> > Down) state and some of the VMs suffered corruption. We are now looking
> > for ways to make sure the routing spreads the routes evenly across the
> > available paths to help reduce or prevent this in the future. We want an
> > algorithm that focuses on availability and is part of the standard OFED
> > OpenSM. We are looking for stability over cutting edge. We are currently
> > using MinHop for the routing algorithm. I'm attaching a diagram
> > (https://docs.google.com/drawings/d/18pMOpiM7Bz2kaiyI0NOzB1q5o-0Zcy-9E1hJYNm6uNg/edit?usp=sharing)
> > of our environment, as I know different algorithms are tailored for
> > different environments.
> >
> > We are also looking to extract our topology, load it into IBMgtSim,
> > and run simulations on MinHop and other algorithms to see what the
> > probability of having all paths run through one switch is. If you have
> > any pointers, we would be glad to accept them. One difficulty is that
> > when I run ibnetdiscover, it shows me the port GUIDs of the HCAs, but not
> > the node GUID of the card. I suppose that if I see CA, I can subtract the
> > port number from the port GUID to get the node GUID. Would that be a safe
> > assumption?
> >
> >> ibnetdiscover
> > CA    56  2 0xf04da2909778e716 4x QDR - SW    52 18 0x0002c90200448e28 ( 'MT25408 ConnectX Mellanox Technologies' - 'Infiniscale-IV Mellanox Technologies' )
> > CA    55  1 0xf04da2909778e715 4x QDR - SW    51 18 0x0002c90200448ec8 ( 'MT25408 ConnectX Mellanox Technologies' - 'Infiniscale-IV Mellanox Technologies' )
> > ...snip...
> >
> >> ibhosts
> > Ca      : 0xf04da2909778e714 ports 2 "MT25408 ConnectX Mellanox Technologies"
> > ...snip...
> >
> >
> > Thank you in advance for reading and helping us.
> >
> > Robert LeBlanc
> > OIT Infrastructure & Virtualization Engineer
> > Brigham Young University
> > <Fabric Design Public.pdf>
>
>
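
On the GUID question in the quoted message: the sample output there is at
least consistent with the guess. Both port GUIDs, minus their port numbers,
give the node GUID that ibhosts reports. Below is a minimal Python sketch of
that arithmetic, using only the two CA lines quoted above (with the mail
client's line wrapping removed). The function names are made up for
illustration, and the "port GUID minus port number" rule is an assumption
that happens to hold for this ConnectX card. It is not something the
InfiniBand spec guarantees, so it should be checked against ibhosts output
before being relied on.

```python
# Port lines copied from the ibnetdiscover output quoted in this thread.
# Field layout assumed from that sample: type, LID, port number, port GUID, ...
PORT_LINES = [
    "CA    56  2 0xf04da2909778e716 4x QDR - SW    52 18 0x0002c90200448e28",
    "CA    55  1 0xf04da2909778e715 4x QDR - SW    51 18 0x0002c90200448ec8",
]


def guess_node_guid(line):
    """Guess a CA's node GUID as (port GUID - port number).

    This is a vendor numbering convention observed in the sample, not a rule.
    """
    fields = line.split()
    node_type = fields[0]
    if node_type != "CA":
        raise ValueError("only CA lines are handled in this sketch")
    port_num = int(fields[2])
    port_guid = int(fields[3], 16)
    return port_guid - port_num


def group_ports_by_node(lines):
    """Map each guessed node GUID to the sorted list of its port numbers."""
    nodes = {}
    for line in lines:
        port_num = int(line.split()[2])
        nodes.setdefault(guess_node_guid(line), []).append(port_num)
    return {guid: sorted(ports) for guid, ports in nodes.items()}


if __name__ == "__main__":
    for guid, ports in group_ports_by_node(PORT_LINES).items():
        # Prints: 0xf04da2909778e714 ports [1, 2]
        print(f"0x{guid:016x} ports {ports}")
```

Both ports collapse onto 0xf04da2909778e714, which matches the ibhosts line
in the quoted message. Comparing the guessed GUIDs against the full ibhosts
list would show whether the convention holds for every card in the fabric.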