[ofiwg] [libfabric-users] feature requests

Oucharek, Doug S doug.s.oucharek at intel.com
Tue Jun 6 17:04:44 PDT 2017


I should make another clarification:

LNet only uses addresses to establish routing from the session manager.  The address lists I keep referring to are lists of established endpoints (we use RC - reliable connections), not the actual addresses.  So, when the application (Lustre) refers to an address, we map that address to a list of established endpoints (each could be to a different provider) and send messages round robin over that set of endpoints.  We keep “credits” on each endpoint, so we can do weighted round robin.
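
To make the mapping concrete, here is a minimal sketch in Python, not the actual LNet code. It assumes that a "credit" is a per-endpoint count of sends still allowed, that a send consumes one credit, and that a completion returns it. The names Endpoint, Peer, pick, complete, and the sample address and endpoint labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Endpoint:
    name: str      # hypothetical label for an established connection
    credits: int   # assumed meaning: sends still allowed on this endpoint


@dataclass
class Peer:
    endpoints: List[Endpoint]
    next_index: int = 0  # where the next round-robin scan starts

    def pick(self) -> Optional[Endpoint]:
        """Return the next endpoint with a free credit, or None if none has one."""
        n = len(self.endpoints)
        for offset in range(n):
            i = (self.next_index + offset) % n
            ep = self.endpoints[i]
            if ep.credits > 0:
                ep.credits -= 1                  # the send consumes a credit
                self.next_index = (i + 1) % n    # resume after this endpoint
                return ep
        return None  # every endpoint is out of credits; the caller must queue

    def complete(self, ep: Endpoint) -> None:
        """A send finished, so give the credit back to its endpoint."""
        ep.credits += 1


# The application only ever names the primary address (hypothetical value).
peers: Dict[str, Peer] = {
    "primary-addr-1": Peer([Endpoint("ep-a", 2), Endpoint("ep-b", 1)]),
}
```

With these sample credits, four calls to peers["primary-addr-1"].pick() yield ep-a, ep-b, ep-a, and then None, so the endpoint with more credits carries more of the traffic. Skipping endpoints that have run out of credits is one simple way to get the weighting; a real implementation could weight the choice differently.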

Doug

> On Jun 6, 2017, at 4:57 PM, Hefty, Sean <sean.hefty at intel.com> wrote:
> 
>> We just recently added Multi-Rail support to LNet (Lustre networking
>> layer).  We designate a “primary address” which is mapped to a list of
>> addresses for Multi-Rail.  The application only uses the primary
>> address to send/receive from the peer.
> 
> How did you handle the full address exchange?  Does a failure to the primary address result in any issues?


