[ofa-general] opensm routing
Jeff Becker
Jeffrey.C.Becker at nasa.gov
Wed Jun 11 09:43:56 PDT 2008
Basically, we have an Altix ICE cluster connected by a pair of hypercube
InfiniBand fabrics. External to that, we have some Lustre nodes
connected into the cluster with InfiniBand. Our goal is to keep Lustre
traffic separate from compute (MPI) traffic. Ideally, we'd have two
subnets and an IB router between the Lustre fabric and the compute
fabric to accomplish this.
Barring that, I thought we could use partitions as follows: compute
HCAs and switch ports are in both partitions, with full membership in
the compute partition and limited membership in the I/O partition. The
Lustre nodes and switches would be in the I/O partition only (full
membership). That way, inter-compute-node (MPI) traffic would be
disallowed from using routes through the I/O fabric (by partition
membership), and I/O traffic could not interfere with compute traffic
(via separate partitions). Is this scheme feasible?
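To make the membership scheme above concrete, here is a small Python
sketch of the endpoint-level check only. It relies on the InfiniBand rule
that two ports can communicate on a partition only if both are members and
at least one of them is a full member. The partition names ("compute",
"io") and node classes are hypothetical labels for illustration. The sketch
says nothing about which links the routing engine picks, which is the open
question here.

```python
# Toy model of InfiniBand partition (P_Key) membership checks.
# Partition names and node classes are hypothetical labels, not OpenSM config.
FULL, LIMITED = "full", "limited"

MEMBERSHIP = {
    # Compute HCAs: full in the compute partition, limited in the I/O one.
    "compute_node": {"compute": FULL, "io": LIMITED},
    # Lustre nodes: full members of the I/O partition only.
    "lustre_node": {"io": FULL},
}


def can_communicate(a, b, partition):
    """True if both ports are members and at least one is a full member."""
    ma = MEMBERSHIP[a].get(partition)
    mb = MEMBERSHIP[b].get(partition)
    if ma is None or mb is None:
        return False
    return FULL in (ma, mb)


def shared_partitions(a, b):
    """Sorted list of partitions on which ports a and b may exchange traffic."""
    parts = set(MEMBERSHIP[a]) | set(MEMBERSHIP[b])
    return sorted(p for p in parts if can_communicate(a, b, p))


if __name__ == "__main__":
    for pair in [("compute_node", "compute_node"),
                 ("compute_node", "lustre_node"),
                 ("lustre_node", "lustre_node")]:
        print(pair, shared_partitions(*pair))
```

Under these assumptions, two compute nodes can talk only on the compute
partition (both are limited members of the I/O partition), while a compute
node and a Lustre node can talk only on the I/O partition.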
If that's not possible, the next idea is to modify OpenSM to assign
large weights to the links between the compute and I/O fabrics, so that
the MinHop algorithm would never consider using these links for
inter-compute node traffic.
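As a rough illustration of the link-weight idea, the Python sketch below
runs a plain Dijkstra shortest-path search over a made-up five-node
topology. This is not OpenSM code: the node names, the topology, and the
weight values are all assumptions, and OpenSM's MinHop counts hops rather
than using per-link weights, which is exactly the modification being
proposed.

```python
import heapq


def build_graph(links):
    """Build an undirected adjacency map {node: {neighbor: weight}}."""
    graph = {}
    for a, b, w in links:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def shortest_path(graph, src, dst):
    """Dijkstra's algorithm; returns (cost, path) or None if unreachable."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, w in graph[node].items():
            if nbr not in seen:
                heapq.heappush(queue, (cost + w, nbr, path + [nbr]))
    return None


def make_links(cross_weight):
    """Hypothetical topology: compute switches c1..c4 in a chain, plus an
    I/O fabric switch io_sw attached to c1 and c4 with weight cross_weight."""
    return [
        ("c1", "c2", 1), ("c2", "c3", 1), ("c3", "c4", 1),
        ("c1", "io_sw", cross_weight), ("c4", "io_sw", cross_weight),
    ]


if __name__ == "__main__":
    for w in (1, 100):
        graph = build_graph(make_links(w))
        print("cross weight", w, "->", shortest_path(graph, "c1", "c4"))
```

With equal weights the c1-to-c4 path shortcuts through io_sw. With a large
weight on the two cross links it stays inside the compute chain, while
io_sw itself remains reachable from c1 over the direct cross link.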
Thoughts? Thanks.
-jeff
Al Chu wrote:
> Hey Jeff,
>
> Out of my curiosity, are you just trying to change the routing to
> improve job performance? i.e. lustre nodes get special routing vs.
> compute nodes?
>
> Al
>
> On Tue, 2008-06-10 at 15:08 -0700, Jeff Becker wrote:
>
>> Hi all. I was looking into doing some subnet partitioning to separate
>> compute nodes from Lustre nodes, and I saw the following in
>> ~sashak/management.git on the OFA server, in opensm/doc/OpenSM_PKey_Mgr.txt
>>
>> OpenSM Partition Management
>> ---------------------------
>>
>> Roadmap:
>> Phase 1 - provide partition management at the EndPort (HCA, Router and Switch
>> Port 0) level with no routing affects.
>> Phase 2 - routing engine should take partitions into account.
>> ...
>> Phase 2 functionality:
>>
>> The partition policy should be considered during the routing such that
>> links are associated with particular partition or a set of
>> partitions. Policy should be enhanced to provide hints for how to do
>> that (correlating to QoS too). The exact algorithm is TBD.
>>
>>
>> What is the status of Pkey-aware routing? Thanks.
>>
>> -jeff
>>