[ofw] WWG meeting Tuesday 6/1 minutes

Smith, Stan stan.smith at intel.com
Wed Jun 2 09:57:00 PDT 2010


Attending:

Microsoft: Fab, Eric & Ganesh
Mellanox: Tzachi & Uri
Intel: Sean, Stan
QLogic: Sophia & John

OFED for XP (32-bit) support will be transitioned to a maintenance mode.
Maintenance mode means bugs will still be fixed, but new OFED features will not be supported on XP/32.

OFED for XP (64-bit), aka Server 2003, remains supported, since RoCEE could be supported on Server 2003.

RoCEE on Windows work will resume in August. Preliminary patches (those reviewed prior to the WinOF 2.2 release) will be submitted to the OFW email list for review and committed to svn as soon as possible; these patches cure the binary incompatibility between Mellanox releases and the WinOF 2.2 release.
The remainder of RoCEE work is targeted for beta testing and review in the Nov/Dec 2010 time frame.
Small patch sets were requested to be submitted to the OFW list for review prior to a massive svn commit for RoCEE.

OpenSM is transitioning from version 3.3.3 (WinOF 2.2 release) to version 3.3.6.
Tzachi reported a memory leak in 3.3.3. A similar memory leak was fixed in the 3.3.5 code base.
Stan will publish/commit an opensm changelog reflecting the changes/bugs-fixed in opensm 3.3.6.
Stan spoke of IPoIB installation problems in which compute nodes do not receive DHCP addresses from the head node.
Fab indicated that for those HPC Edition cases where DHCP and opensm are both running on the head node, with IPoIB on the head node using a static IPv4 address and the compute nodes being DHCP assigned, DHCP must be restarted and 'ipconfig /release & ipconfig /renew' performed on the head node before DHCP will start servicing requests from compute nodes on the IPoIB network.
Another approach is to delay DHCP startup until OpenSM is running, so that the DHCP server 'sees' an operational IPoIB NIC and services DHCP requests from the IPoIB network. Otherwise, DHCP sees a non-functional IPoIB NIC and excludes the NIC from DHCP processing.
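
The delayed-startup approach could be sketched roughly as below. This is only an illustration, not something presented at the meeting: the function name start_after_ready and its parameters are hypothetical, and the actual checks (is OpenSM running, is the IPoIB NIC operational) and the command that starts the DHCP server are deliberately left to the caller, since the minutes do not specify them.

```python
import time
from typing import Callable


def start_after_ready(is_ready: Callable[[], bool],
                      start: Callable[[], None],
                      timeout_s: float = 120.0,
                      poll_s: float = 2.0,
                      sleep: Callable[[float], None] = time.sleep,
                      clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll is_ready() until it returns True, then call start() once.

    is_ready: hypothetical check, e.g. "OpenSM is running and the
              IPoIB NIC is operational" (supplied by the caller).
    start:    hypothetical action, e.g. "start the DHCP server".
    Returns True if start() was called, False if the timeout expired.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            start()
            return True
        if clock() >= deadline:
            # Dependency never came up; do not start the dependent service.
            return False
        sleep(poll_s)
```

With is_ready wired to whatever check the site uses for OpenSM and the IPoIB NIC, and start wired to the DHCP server start command, the DHCP server would only come up once the IPoIB NIC is usable.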

Fab and Tzachi both noted DHCP compute node address assignment failures with opensm 3.3.3 at larger (50+) node counts. Are compute nodes failing to get a DHCP address because opensm cannot keep up?
Stan will test this for opensm 3.3.6.

NetworkDirect (ND) version 2.0 provider will target Server 2008 using winverbs as the verbs provider; ND V1.0 will remain on IBAL & winverbs.
Fab is developing the ND V2 provider as a single .dll which can run side-by-side with a ND V1 provider (IBAL or winverbs).
The alpha ND V2 provider release is targeted for a mid-July svn commit, pending the impact of other MS work.
The question was posed: why not a single ND provider (.dll) that handles both V1 and V2?
Sean pointed out that, in his experience writing a skeletal ND V2 provider, code reuse from ND V1 to ND V2 was minimal.
Further discussion will move to the OFA list if needed.

QLogic HCA delivery is targeted at Q4 2010, pending resolution of OFED stack architectural issues. The OFED stack has been a single-HCA-vendor show up until now, and some assumptions are no longer valid in the multi-HCA-vendor arena; weeding out these assumptions has been time consuming.
The QLogic HCA driver is passing some WHQL tests, with the primary focus on IPoIB, MPI and ND usage models.
QLogic has expressed dismay at the lack of OFA list response to questions posed.
Stan will explore the use of the Microsoft-defined InfiniBand Class GUID instead of the current .inf-file-defined GUID.
QLogic experiments with changing the GUID have demonstrated a failure to load the ibbus class filter driver.
QLogic also requested source code identification of vendor-specific features.
QLogic will supply patches to remedy those vendor-specific code segments which should be vendor neutral, e.g. UD packet send creates an address vector (AV) for each send; could AVs be reused?

Stan will send a feature list for the next OFED for Windows release.
The perennial question was asked: "Do we hold up an OFED-W release until all desired features are present, or de-feature and release?"
QLogic stated they are prepared to make their own release in order to meet customer expectations if OFED-W is waiting on features.

Eric Lantz from Microsoft asked which diagnostic interfaces are supported such that HPC integration with OFED IB diag tools can seamlessly occur.
The OFED IB diagnostic tools were suggested, as that is where the most non-vendor-aligned development energy is being focused.
Eric suggested the topic of diagnostic interfaces be placed on the next WWG agenda.








