[ofa-general] opensm routing

Jeff Becker Jeffrey.C.Becker at nasa.gov
Wed Jun 11 09:43:56 PDT 2008


Basically, we have an Altix ICE cluster connected by a pair of hypercube
InfiniBand fabrics. External to that, we have some Lustre nodes
connected into the cluster with InfiniBand. Our goal is to keep Lustre
traffic separate from compute (MPI) traffic. Ideally, we'd have two
subnets and an IB router between the Lustre fabric and the compute
fabric to accomplish this.

Barring that, I thought we could use partitions as follows: compute
HCAs and switch ports are in both partitions, with full membership in
the compute partition and limited membership in the I/O partition. The
Lustre nodes and switches would be in the I/O partition only (full
membership). That way, inter-compute-node (MPI) traffic would be
disallowed from using routes through the I/O fabric (by partition
membership), and I/O traffic could not interfere with compute (via
separate partitions). Is this scheme feasible?
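
To make the membership rule behind this scheme concrete, here is a
minimal Python sketch of P_Key matching: two ports can exchange
traffic only if they share a partition number (low 15 bits) and at
least one of them is a full member (high bit set). The partition
numbers 0x0001 and 0x0002 and the table names are hypothetical, chosen
only for illustration. The sketch models the membership check alone;
it says nothing about which links the routing engine picks.

```python
FULL_MEMBER_BIT = 0x8000   # high bit of a P_Key marks full membership
PARTITION_MASK = 0x7FFF    # low 15 bits identify the partition

# Hypothetical partition numbers, not taken from any real fabric.
COMPUTE = 0x0001
IO = 0x0002


def pkeys_match(a, b):
    """True if two P_Keys allow communication with each other."""
    same_partition = (a & PARTITION_MASK) == (b & PARTITION_MASK)
    at_least_one_full = bool((a | b) & FULL_MEMBER_BIT)
    return same_partition and at_least_one_full


def can_talk(table_a, table_b):
    """True if any pair of P_Keys from the two port tables matches."""
    return any(pkeys_match(a, b) for a in table_a for b in table_b)


# Compute ports: full member of compute, limited member of I/O.
compute_port = [COMPUTE | FULL_MEMBER_BIT, IO]
# Lustre ports: full member of I/O only.
lustre_port = [IO | FULL_MEMBER_BIT]

print(can_talk(compute_port, compute_port))  # compute <-> compute
print(can_talk(compute_port, lustre_port))   # compute <-> Lustre
print(pkeys_match(IO, IO))                   # limited <-> limited on I/O
```

In this model, compute ports reach each other through the compute
partition and reach Lustre through the I/O partition, while two
limited I/O members cannot talk to each other on the I/O P_Key.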

If that's not possible, the next idea is to modify OpenSM to assign
large weights to the links between the compute and I/O fabrics, so that
the MinHop algorithm would never consider using these links for
inter-compute-node traffic.
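
The effect of the weighting idea can be illustrated with a small
Python sketch. This is not OpenSM code and does not reflect how OpenSM
implements MinHop; it is a plain Dijkstra search over a hypothetical
five-switch topology (names c1..c4 and io are made up), comparing unit
link costs against a large cost on the compute-to-I/O links.

```python
import heapq


def shortest_path(links, src, dst):
    """Dijkstra over undirected weighted links; returns (cost, path)."""
    graph = {}
    for (a, b), weight in links.items():
        graph.setdefault(a, []).append((b, weight))
        graph.setdefault(b, []).append((a, weight))

    best = {src: 0}
    heap = [(0, src, [src])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == dst:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt, path + [nxt]))
    return None


def build_links(cross_weight):
    """Hypothetical topology: compute chain c1..c4, I/O switch on both ends."""
    return {
        ("c1", "c2"): 1,
        ("c2", "c3"): 1,
        ("c3", "c4"): 1,
        ("c1", "io"): cross_weight,  # compute <-> I/O fabric link
        ("io", "c4"): cross_weight,  # compute <-> I/O fabric link
    }


# With unit weights the c1 -> c4 path cuts through the I/O switch.
print(shortest_path(build_links(1), "c1", "c4"))
# With heavy cross links it stays inside the compute fabric.
print(shortest_path(build_links(100), "c1", "c4"))
# The I/O switch itself remains reachable, just at a higher cost.
print(shortest_path(build_links(100), "c1", "io"))
```

In this toy model the heavy weight only needs to exceed the longest
path inside the compute fabric for compute-to-compute routes to avoid
the I/O links.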

Thoughts? Thanks.

-jeff

Al Chu wrote:
> Hey Jeff,
>
> Out of my curiosity, are you just trying to change the routing to
> improve job performance?  i.e. lustre nodes get special routing vs.
> compute nodes?
>
> Al
>
> On Tue, 2008-06-10 at 15:08 -0700, Jeff Becker wrote:
>   
>> Hi all. I was looking into doing some subnet partitioning to separate 
>> compute nodes from Lustre nodes, and I saw the following in 
>> ~sashak/management.git on the OFA server, in opensm/doc/OpenSM_PKey_Mgr.txt
>>
>> OpenSM Partition Management
>> ---------------------------
>>
>> Roadmap:
>> Phase 1 - provide partition management at the EndPort (HCA, Router and Switch
>>           Port 0) level with no routing affects.
>> Phase 2 - routing engine should take partitions into account.
>> ...
>> Phase 2 functionality:
>>
>> The partition policy should be considered during the routing such that
>> links are associated with particular partition or a set of
>> partitions. Policy should be enhanced to provide hints for how to do
>> that (correlating to QoS too). The exact algorithm is TBD.
>>
>>
>> What is the status of Pkey-aware routing? Thanks.
>>
>> -jeff