[ofa-general] opensm routing

Yevgeny Kliteynik kliteyn at dev.mellanox.co.il
Mon Jun 16 07:32:42 PDT 2008


Jeff,

Jeff Becker wrote:
> Hi Al
> 
> Al Chu wrote:
>> Hey Jeff,
>>
>>  
>>> That works. The compute nodes need to talk to other compute nodes for 
>>> MPI over one set of links, and they need to talk to the Lustre nodes 
>>> for I/O, but over a different (disjoint) set of links. Thanks.
>>>     
>>
>> Is there a strong belief that a different/disjoint set of links would be
>> beneficial?  Some time ago, Sasha and I iterated on a patch in which I
>> found out sometimes not all switch ports would be used.  In this
>> particular case, a chunk of leaf switches were sometimes using only 11
>> out of 12 uplinks.  After the fix, mpigraph showed about 20% improvement
>> in MPI bandwidth.
>>   
> Basically, we want to avoid situations where I/O and MPI contend for the 
> same links, and get in each other's way.

What about using different VLs (virtual lanes) for MPI and I/O?
That won't buy you more bandwidth, but it might keep MPI and I/O from
congesting each other: they will share the wire according to the
priorities that you define.
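
To make the sharing concrete, here is a rough Python sketch (not OpenSM
code). It assumes a simplified weighted round-robin model of VL
arbitration in which every backlogged VL gets a share of the link
proportional to its weight. The function name, the VL numbers and the
weights (192 for MPI, 64 for I/O) are made up for illustration only.

```python
def vl_shares(weights, backlogged=None):
    """Approximate long-run link share per VL under weighted round-robin.

    weights:    dict mapping VL number -> arbitration weight (hypothetical values)
    backlogged: optional set of VLs that currently have traffic queued;
                None means every VL in `weights` is busy.
    """
    active = {
        vl: w
        for vl, w in weights.items()
        if w > 0 and (backlogged is None or vl in backlogged)
    }
    total = sum(active.values())
    if total == 0:
        return {}
    # Each busy VL gets bandwidth in proportion to its weight.
    return {vl: w / total for vl, w in active.items()}


# Hypothetical assignment: MPI on VL0, Lustre I/O on VL1.
weights = {0: 192, 1: 64}

both_busy = vl_shares(weights)                    # MPI and I/O contend
only_io = vl_shares(weights, backlogged={1})      # MPI is idle

print(both_busy)
print(only_io)
```

In this model the total never exceeds one link's worth of bandwidth,
but when both classes are busy neither one can starve the other, and
when one is idle the other can use the whole wire.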

-- Yevgeny

> -jeff


