[ofa-general] opensm routing

Yevgeny Kliteynik kliteyn at dev.mellanox.co.il
Tue Jun 17 02:10:03 PDT 2008


Hal Rosenstock wrote:
> On Mon, 2008-06-16 at 09:58 -0700, Ira Weiny wrote:
>> On Mon, 16 Jun 2008 09:47:51 -0700
>> Hal Rosenstock <hrosenstock at xsigo.com> wrote:
>>
>>> On Mon, 2008-06-16 at 09:46 -0700, Ira Weiny wrote:
>>>> On Mon, 16 Jun 2008 09:38:58 -0700
>>>> Hal Rosenstock <hrosenstock at xsigo.com> wrote:
>>>>
>>>>> On Mon, 2008-06-16 at 09:35 -0700, Ira Weiny wrote:
>>>>>> On Mon, 16 Jun 2008 09:25:48 -0700
>>>>>> Hal Rosenstock <hrosenstock at xsigo.com> wrote:
>>>>>>
>>>>>>> On Mon, 2008-06-16 at 09:16 -0700, Al Chu wrote:
>>>>>>>> I asked the Lustre people in my hallway, and it isn't
>>>>>>>> currently configurable for Lustre. 
>>>>>>> Wouldn't Lustre SL be inherited from partition based on underlying IPoIB
>>>>>>> interface ?
>>>>>> I am not quite sure what you mean here?  Our Lustre sets up their own QP's via
>>>>>> the RDMACM.  So I believe we could set our SL and/or partition for those QP's
>>>>>> separately from IPoIB via a modify_qp call; right?
>>>>> RDMA CM does address resolution based on IP addresses and an SL can be
>>>>> associated with the outgoing IPoIB interface.
>>>>>
>>>> Right, but does it _have_ to be the associated one?  I thought not.
>>> Do you want a different SL from that ?
>>>
>> Maybe, some MPI's may use the RDMACM as well (I think some already do).
>> Therefore if you want Lustre and MPI to be on different SL's at least one of
>> them will have to change from the "inherited" IPoIB SL.
> 
> Just use multiple (per ULP ?) IPoIB interfaces on different partitions
> with different SLs.

AFAIK the latest MVAPICH and Open MPI have a command-line option that
specifies which SL to use, so you can run MPI on a non-default SL and
leave Lustre untouched on the default SL.

Besides, if Lustre sends path queries to OpenSM (via RDMACM) when
opening its own QPs, then Lustre can use other SLs too.
The SM can even assign different SLs to traffic going to Lustre metadata
servers and to object storage servers, provided they run on separate
hosts, all based on the path query.
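
To make the idea concrete, here is a rough sketch of SL selection driven by
the path query. It is only a toy model, not OpenSM code: the PathQuery and
Rule types, the assign_sl() helper, the host names and the SL values are all
hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable

DEFAULT_SL = 0  # assumption: unmatched traffic stays on SL 0


@dataclass(frozen=True)
class PathQuery:
    """Hypothetical, heavily simplified stand-in for a path query."""
    src: str
    dst: str


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: paths to any host in dst_hosts get this SL."""
    dst_hosts: FrozenSet[str]
    sl: int

    def __post_init__(self) -> None:
        # InfiniBand defines 16 service levels, numbered 0 through 15.
        if not 0 <= self.sl <= 15:
            raise ValueError("SL must be in the range 0..15")


def assign_sl(query: PathQuery, rules: Iterable[Rule],
              default_sl: int = DEFAULT_SL) -> int:
    """Return the SL of the first rule matching the destination host."""
    for rule in rules:
        if query.dst in rule.dst_hosts:
            return rule.sl
    return default_sl


# Hypothetical layout: metadata and object storage servers on separate hosts.
MDS_HOSTS = frozenset({"mds1"})
OSS_HOSTS = frozenset({"oss1", "oss2"})
RULES = [Rule(MDS_HOSTS, 1), Rule(OSS_HOSTS, 2)]

if __name__ == "__main__":
    for dst in ("mds1", "oss2", "client2"):
        sl = assign_sl(PathQuery("client1", dst), RULES)
        print(f"client1 -> {dst}: SL {sl}")
```

The point of the sketch is that the requester never picks the SL itself; it
uses whatever comes back in the path query response, so the policy stays in
one place on the SM side.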

-- Yevgeny


> -- Hal
> 
>>> There's a QoS syntax but I'm not
>>> sure how Lustre plays into that.
>>  
>> Just to be clear this is only a "thought experiment" at this point.  We have
>> not tried to do any of this for real, yet.  ;-)  We realized there might be
>> many changes to various configurations and codes which may need to be done.
>> But knowing that I/O is less dependent on latency than MPI it seems to follow
>> that overall system performance could benefit from having MPI run at a higher
>> priority than Lustre/NFS etc.
>>
>> Ira
>>
> 
