[ewg] RE: Agenda for the OFED meeting today (Jan 5, 09)
Davis, Arlin R
arlin.r.davis at intel.com
Mon Jan 5 11:01:22 PST 2009
There are scaling issues with SA path-record queries. We attempted to be good citizens with Intel MPI by using the rdma_cm agent (via uDAPL), but we were forced to build hard-coded RC QP support in OFED 1.4 (uDAPL scm) to avoid the many scaling and configuration problems that came with IPoIB requirements, ARP storms, rdma_cm timers, and SA path record query/caching.
If someone wants to sign up to design and implement a scalable SA query caching agent, we would be happy to look at path record queries again.
Another suggestion for 1.5:
Implementation of SA queries for Path Records (using IBTA 1.2.1 ServiceId field) in all OFED ULPs, especially for MPI
The IBTA standard defines that the proper way to establish a connection is to get a PathRecord from the SM/SA and use it to define all the attributes of the communication path.
Ideally the IBTA CM should then be used to establish the connection and QPs as well.
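As a rough illustration of the flow described above, the sketch below takes connection attributes from a PathRecord instead of hard-coding them. It is not real verbs or SA code: `PathRecord`, `FAKE_SA`, `query_path_record` and `qp_attrs_from_path` are hypothetical names, the field list is only a subset, and the dictionary stands in for an actual query to the SM/SA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathRecord:
    """Illustrative subset of the attributes a PathRecord describes."""
    slid: int
    dlid: int
    pkey: int
    sl: int
    mtu: int          # path MTU in bytes
    service_id: int


# Hypothetical stand-in for the SA. A real lookup is a query sent to the SM/SA.
FAKE_SA = {
    (1, 2, 0x1000): PathRecord(slid=1, dlid=2, pkey=0x8001, sl=3,
                               mtu=2048, service_id=0x1000),
}


def query_path_record(slid, dlid, service_id):
    """Return the path record for this source, destination and ServiceId."""
    return FAKE_SA[(slid, dlid, service_id)]


def qp_attrs_from_path(path):
    """Derive per-connection attributes from the path record, with no defaults."""
    return {
        "dlid": path.dlid,
        "pkey": path.pkey,
        "sl": path.sl,
        "path_mtu": path.mtu,
    }


path = query_path_record(1, 2, 0x1000)
attrs = qp_attrs_from_path(path)
print(attrs)
```

The point of the sketch is only that PKey, SL and MTU come from the fabric's answer for that specific path, so nothing has to be configured per job.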
At present, openmpi, mvapich1 and mvapich2 do not use PathRecords, but instead hard-code attributes like the PKey, SL, etc.
In some cases these hard-coded values can be overridden by configurable values for PKey and SL, but such values must be uniform across all connections and must be provided per job (which is tedious and error-prone).
At present opensm supports PKeys and SLs, but MPI cannot easily use these features.
Other opensm features, such as lash routing, do not work properly with MPI because MPI requires the SL to be uniform across all connections, whereas with lash the SL varies per route.
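A toy example of the mismatch, assuming a made-up table of per-route SLs such as a lash-like routing engine might assign (the node names, SL values and helper functions here are all hypothetical):

```python
# Hypothetical SL assignments: the SL depends on the (source, destination) pair.
ROUTE_SL = {
    ("n0", "n1"): 0,
    ("n0", "n2"): 1,
    ("n1", "n2"): 0,
}

# A single SL configured once for the whole MPI job.
JOB_SL = 0


def mismatched_routes(route_sl, job_sl):
    """List the routes whose assigned SL differs from the job-wide SL."""
    return sorted(pair for pair, sl in route_sl.items() if sl != job_sl)


def sl_for_connection(route_sl, src, dst):
    """Per-connection lookup, which is what a PathRecord query would provide."""
    return route_sl[(src, dst)]


bad = mismatched_routes(ROUTE_SL, JOB_SL)
print(bad)
print(sl_for_connection(ROUTE_SL, "n0", "n2"))
```

No single job-wide value can match every route in such a table, which is why the SL has to be looked up per connection.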
Additionally, applications that do not use PathRecords will have difficulties with advanced features like IB routing, partitioning, etc., all of which are available or being worked on in opensm.