[ofw] [RFC] Remove path query from IPoIB

Fab Tillier ftillier at windows.microsoft.com
Wed Aug 6 08:39:48 PDT 2008


Hi Hal,

Thanks for taking a look and responding.

> Hi Fab,
>
> On Tue, Aug 5, 2008 at 6:25 PM, Fab Tillier
> <ftillier at windows.microsoft.com> wrote:
>> I wanted to get this out for comments before I complete testing.
>> This change removes path queries from unicast traffic in IPoIB.  It
>> makes use of information from receive work completions (SLID, SGID) and
>> the broadcast group (SL, flow label, hop limit, traffic class, static
>> rate) to form the address vectors.
>
> I don't think it's required that the flow label, hop limit, and
> traffic class are the same for unicast and broadcast traffic. (I
> forget about whether the same is true for SL). Also, static rate might
> be pessimistic for unicast.

Can the flow label, hop limit, traffic class, static rate, and service level from the MC group be wrong for unicast traffic, versus just suboptimal?  I'm OK with things not being optimal as long as they're not broken, because in the common case today these settings are the same for unicast and multicast (at least from my investigation into OpenSM).

The theory here is that if the MC group can reach everyone in the broadcast group, these values must be at least good enough to reach everyone via unicast packets.  Is this incorrect?
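
To make the proposal concrete, here is a minimal sketch of forming a unicast address vector without a path query. All type and function names (sketch_gid, sketch_recv_info, sketch_bcast_info, sketch_addr_vector, sketch_av_from_recv) are hypothetical and for illustration only; they are not the actual IBAL or IPoIB driver definitions. The destination LID and GID come from the SLID and SGID of a received work completion, and everything else is copied from the broadcast group.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for illustration; not the real driver structures. */
typedef struct { uint8_t raw[16]; } sketch_gid;

/* Fields taken from a receive work completion. */
typedef struct {
    uint16_t   slid;   /* source LID of the sender */
    sketch_gid sgid;   /* source GID of the sender */
} sketch_recv_info;

/* Fields taken from the broadcast (multicast) group. */
typedef struct {
    uint8_t  sl;
    uint32_t flow_label;
    uint8_t  hop_limit;
    uint8_t  traffic_class;
    uint8_t  static_rate;
} sketch_bcast_info;

/* Address vector used to send unicast traffic back to the sender. */
typedef struct {
    uint16_t   dlid;
    sketch_gid dgid;
    uint8_t    sl;
    uint32_t   flow_label;
    uint8_t    hop_limit;
    uint8_t    traffic_class;
    uint8_t    static_rate;
} sketch_addr_vector;

/* Build the address vector: the sender's source address becomes our
 * destination; the remaining fields are assumed to be usable as-is
 * from the broadcast group, which is the open question in this thread. */
static void sketch_av_from_recv(const sketch_recv_info *wc,
                                const sketch_bcast_info *bc,
                                sketch_addr_vector *av)
{
    assert(wc != NULL && bc != NULL && av != NULL);

    memset(av, 0, sizeof(*av));
    av->dlid          = wc->slid;
    av->dgid          = wc->sgid;
    av->sl            = bc->sl;
    av->flow_label    = bc->flow_label & 0xFFFFF; /* flow label is 20 bits */
    av->hop_limit     = bc->hop_limit;
    av->traffic_class = bc->traffic_class;
    av->static_rate   = bc->static_rate;
}
```

The sketch only shows where each field would come from. Whether the broadcast group values are always valid for unicast is exactly what needs confirming.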

I have issues today running MPI jobs in 64-node clusters because the SM can't keep up.  I don't think any tweaks to the SM can give me an order of magnitude better performance.  I tried exponential back-off for SA queries, but that just moved the problem up the stack: ARP requests timed out because the ARP responses were waiting on the SM for a path.  Things only get worse as the number of cores in a server increases.
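
For reference, the kind of exponential back-off meant here can be sketched as below. The function name sketch_backoff_ms and its parameters are hypothetical, and any base or cap values passed in are illustrative, not the ones used in the driver. The retry delay doubles on each attempt until it reaches a cap.

```c
#include <stdint.h>

/* Hypothetical helper: delay in milliseconds before retry number
 * `attempt` (0-based). The delay doubles per attempt and is clamped
 * to cap_ms. */
static uint32_t sketch_backoff_ms(uint32_t base_ms, uint32_t cap_ms,
                                  unsigned attempt)
{
    uint32_t delay = base_ms;

    while (attempt-- > 0 && delay < cap_ms) {
        if (delay > cap_ms / 2) {   /* doubling would exceed the cap */
            delay = cap_ms;
            break;
        }
        delay *= 2;
    }
    return delay < cap_ms ? delay : cap_ms;
}
```

Once the delay grows past the ARP timeout, the path for the ARP response arrives too late, which is why backing off relieves the SM but does not fix the failure.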

-Fab


