PathForward Project Teleconference 5/4/05

Attendees: Matt, Roland, Sean, Bill, Woody, Hal

Agenda Bashing

Administrivia
    OLS paper almost complete
    Next meeting 5/11

AR Review
    Kernel update
        nothing upstream yet
        2.6.12 taking time to stabilize
        memory pinning still under discussion
            converged (except for Caitlin)
            implementing to that
    SuperComputing 05
        2 activities proposed for November
            infrastructure for SCInet (IB interconnect)
                customers rather than vendor managed
                optical technology issue (4x to 4 parallel optic dongle)
                    switches provide power (unused ground now needs to be driven by switches)
                    some Topspin switches and Mellanox HCAs support this (software transparent)
            tutorial for customers
    Virtual networks
        Xen currently not supported
        not started on
        started investigating architecture
        substantial risk
        target end of next week
            strawman to openib and xen lists
        November tight
            plan to do without virtualization
    2.6.9 patch
        svn 2245
            building
            user mode verbs works, IPoIB
            one SDP issue still
            IA64 issue (firmware?)
            EM64T works
        2.6.9 backports whenever OpenIB goes upstream
        missing some
            get_sb_pseudo, so need custom kernel
                IPoIB needs this
                that could be pulled, so user verbs as binary could be made to work but not IPoIB
            add patches to source and build new kernel RPM?
        patch based on 2.6.12-mm3
            split into kernel, base drivers, fixup
            needs to be put into subversion somewhere
            waiting for 2.6.12 to be released
        latest code to 2.6.9
            user mode support
            IA64 issue?
            SDP also
        RedFlag request
        RedHat also 2.6.9 based (RHEL4)
            would they pick this up?
        TriLabs will be using RHEL4
        latest non SLES 2.6.11
        RedHat Fedora support issue for this?
            leave to distro
    MPI
        nothing new to report here
        no news from Tim
        DK released MVAPICH 0.9.5 last week
            starting to look at gen2
        Tim (LANL) 1-2 months
        OpenMPI on OpenIB RC with Steve Poole
        Matt started on LA MPI (UD)
        Intel MPI over uDAPL is running

Status of Current Tasks
    Sean:
        RMPP
            responding to emails and retesting
            already changed API to remove userspace parameter
        working with InfiniCon on CM (Bill Jordan)
            REJ on bad request
    Libor:
        user CM
            Arlin not yet started on this
            posted last week with simple example
            working on more sophisticated example
                uses IB verbs
                need to get path record
    Roland:
        user verbs
            changes to memory pinning being implemented
                kernel patch for DONT COPY flag on mprotect/mmap (fork support)
                reference counting of overlapping regions
                ultimate ALWAYS COPY for partial pages
                is this usable for 2.6.9?
                    RedHat can't change kernel ABI
                    does this do that?
                    change symbol versions
            IA64 details
            general cleanup
            remaining: large number of minor pieces
                attach/detach mc QPs, some query verbs (GID table)
            fair bit of loose ends still to do
            documentation
            2 more weeks for initial release?
                tarballs and RPMs
    Shahar/Hal:
        worked on user_mad.c, libibumad, and OpenSM to support RMPP on the send side
        next week work on receive side
        bad port handling
            algorithm finally converged
            needs to be implemented
        OpenSM initialization times
        RARP Solaris snoop
            real fix still pending
        cluster diag testing completed by Josh
        Bugzilla (overhead)
            revisit later (~1 month)
        Should Mellanox be part of this call?
            at stage 2 when have a piece of PF
            OpenSM? verification? other?

Other/Associated Project Issues
    Work split between Roland and Libor
    additional features (priority order)
        user space events (port state, LID/SMLID changes)
        user space snoop MAD
        IPoIB partial connectivity
        osmtest to be supported?
    Plugfest/IL list
        may be possible in April 05 as DTA is likely to be dropped as requirement
        vote this week at CIWG
    SM client reregister/IPoIB reregistration for multicast groups
    EUI-64 and IPoIB
    HCA capability mask