[openfabrics-ewg] OFED 1.1-rc2 is ready

Doug Ledford dledford at redhat.com
Fri Aug 25 10:29:54 PDT 2006


On Mon, 2006-08-21 at 21:49 +0300, Tziporet Koren wrote:
> Hi,
> 
> OFED 1.1-RC2 is available at https://openib.org/svn/gen2/branches/1.1/ofed/releases/
> File: OFED-1.1-rc2.tgz
> Please report any issues in bugzilla http://openib.org/bugzilla/
> 
> Tziporet & Vlad
> -------------------------------------------------------------------------------------
> 
> Release details:
> ================
> 
> Build_id:
> OFED-1.1-rc2
> 
> openib-1.1 (REV=9037)
> # User space
> https://openib.org/svn/gen2/branches/1.1/src/userspace
> Git:
> ref: refs/heads/ofed_1_1
> commit a13195d7ca0f047f479a58b2a81ff2b796eb8fa4
> 
> # MPI
> mpi_osu-0.9.7-mlx2.2.0.tgz
> openmpi-1.1-1.src.rpm
> mpitests-2.0-0.src.rpm
> 
> 
> OS support:
> ===========
> Novell:
> 	- SLES 9.0 SP3*
> 	- SLES10 (official release)*
> Redhat:
> 	- Redhat EL4 up3
> 	- Redhat EL4 up4* (not supported yet)
> kernel.org:
> 	- Kernel 2.6.17*
> * Changed from 1.0 release
> 
> Note: Redhat EL4 up2, Fedora C4 and SuSE Pro 10 were dropped from the list.
> We keep the backport patches for these OSes and make sure OFED compiles and
> loads properly, but will not do a full QA cycle.
> 
> Systems:
> ========
>     * x86_64
>     * x86
>     * ia64
>     * ppc64

Not supporting ppc is a problem to a certain extent.  I can't speak for
SuSE, but at least for Red Hat, ppc is the default and overrides ppc64.
The ppc64 arch is less efficient than the ppc arch on ppc64 processors
except when large memory footprints are involved.  So, for things like
opensm, ibv_*, etc. the ppc arch should actually be preferred, and the
ppc64 arch libs should be present for those end user apps that need
large memory access.  The fact that dapl doesn't compile on ppc at all
is problematic as well.  In addition, what are you guys doing about the
lack of asm/atomic.h (breaking udapl compiles on ppc64 and ia64) going
forward?  I'd look in the packages and see for myself but the svn update
is taking forever due to those binary rpms packed into svn...ahh, it's
finally done....ok, still broken.

Without getting into an argument over the usage of that include, suffice
it to say that the include file is gone and builds fail on
fc6/rhel5beta.  Since the code really only uses low level intrinsics as
opposed to high level atomic ops, I made a ppc and ia64 intrinsics
header for linux and added it to the dapl package itself to work around
the issue.

> 
> Main changes from OFED-1.1-rc1:
> ===============================
> 1. ipath driver: 
> 	- Compilation pass on all systems, except SLES9 SP3.
> 	- See list of changes in the ipath driver at the end
> 2. SDP: 
> 	- Fixed an issue where 32-bit systems ran out of low memory when opening hundreds of sockets.
> 	- Added out of band and message peek support; telnet and ftp are now working
> 3. SRP - a new srp_daemon was added - see explanation at the end
> 4. IPoIB: High availability support using a daemon in user level. 
>    Daemon is located under /userspace/ipoibtools/. See explanation at the end.
> 5. Added Madeye utility
> 6. Added verbs fork support. Should work from kernel 2.6.16
> 7. Fatal error support in mthca
> 8. iSER support in install script for SLES 10 was fixed
> 9. Diagnostic tools do not require opensm installation.
>    For this, the following changes were made to the opensm RPM:
>       opensm-devel was removed
>    New packages were added:
>       libosmcomp
>       libosmcomp-devel
>       libosmvendor
>       libosmvendor-devel
>       libopensm
>       libopensm-devel

Ugh.  Each library does not need its own package.  Imagine what X would
do to your RPM count otherwise.  For grouped libraries like this, it is
perfectly acceptable to do opensm, opensm-libs, opensm-devel (and that's
in fact what I did for RHEL4 U4).  Regardless though, make a decision
and stick to it.  Changing package names with each release == not good.

> 10. bug fixes:
>    - SRP: Add local_ib_device/local_ib_port attributes to srp scsi_host
>    - mthca: fence bit supported; fixed deadlock in destroy qp
>    - ipoib: connectivity lost on sm lid change
>    - OSM: fix to work with Cisco stack
> 
> 
> Limitations and known issues:
> =============================
> 1. SDP: For Mellanox Sinai HCAs one must use latest FW version (1.1.000).
> 2. SDP: Get peer name is not working properly
> 3. SDP: Scalability issue when many connections are opened
> 4. ipath driver does not compile on SLES9 SP3
> 5. RHEL4 up4 is not supported due to problems in the backport patches.

You should be able to start by pulling the patches that are already
applied out of the RHEL4 U4 kernel rpm, looking at which ones fix up the
core kernel to provide what's needed instead of doing a thousand little
backports all over the kernel tree, and axing any backport patches you
had planned that would undo that.  IOW, make use of the infrastructure
provided in U4 instead of working around it.

> 
> Missing features that should be completed for RC3:
> ==================================================
> 1. Core: Huge pages fix
> 2. IPoIB high availability does not support multicast groups
> 3. Support RHEL4 up4
> 
> Changes in the ipath driver:
> ============================
>       * lock resource limit counters correctly
>       * fix for crash on module unload, if cfgports < portcnt
>       * fix handling of kpiobufs
>       * drop requirement that PIO buffers be mmaped write-only
>       * merge ipath_core and ib_ipath drivers
>       * simplify layering code
>       * simplify debugging code after ipath_core and ib_ipath merger
>       * remove stale references to userspace SMA
>       * More changes to support InfiniPath on PowerPC 970 systems.
>       * add new minor device to allow sending of diag packets
>       * do not allow use of CQ entries with invalid counts
>       * account for attached QPs correctly
>       * support new QLogic product naming scheme
>       * add serial number to hardware freeze error message
>       * be more strict about testing the modify QP verb
>       * validate path_mig_state properly
>       * put a limit on the number of QPs that can be created
>       * handle sq_sig_all field correctly
>       * allow SMA to be disabled
>       * fix return value from ipath_poll
>       * print warning if LID not acquired within one minute
>       * allow direct control of Rx polarity inversion
> 
> srp_daemon explanation:
> =======================
> srp_daemon is a tool that identifies SRP targets in the fabric. 
> 
> Each srp_daemon instance operates on one port.
> On boot it performs a full rescan of the fabric and then waits for srp_daemon events:
> - a join of a new target to the fabric
> - a change in the capabilities of a machine that becomes a target
> - an SA change
> - an expiration of a predefined timeout
> 
> When there is an SA change or a timeout expiration, srp_daemon performs a full rescan of the fabric.
> 
> For each target srp_daemon finds, it checks whether it is already connected to that port; if it is not connected, srp_daemon can either print the target details or connect to it.
> 
> Run srp_daemon -h for usage.
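
[Editor's sketch, not part of the original release notes.] The behaviour
described above can be modelled roughly as follows. This is only an
illustration of the stated rules; the names Event, needs_full_rescan and
handle_targets are hypothetical and do not come from the srp_daemon source.

```python
from enum import Enum, auto


class Event(Enum):
    """The four event kinds listed in the release notes (names are hypothetical)."""
    TARGET_JOINED = auto()
    CAPABILITY_CHANGE = auto()
    SA_CHANGE = auto()
    TIMEOUT = auto()


def needs_full_rescan(event):
    # Per the notes, only an SA change or a timeout expiration
    # triggers a full rescan of the fabric.
    return event in (Event.SA_CHANGE, Event.TIMEOUT)


def handle_targets(found, connected, connect=False):
    """Decide what to do with each discovered target.

    found     -- iterable of target identifiers seen in the fabric
    connected -- set of targets already connected on this port
    connect   -- if True, connect to new targets; otherwise only report them
    """
    actions = []
    for target in found:
        if target in connected:
            continue  # already connected on this port, nothing to do
        if connect:
            connected.add(target)
            actions.append(("connect", target))
        else:
            actions.append(("print", target))
    return actions
```

With connected = {"t1"}, calling handle_targets(["t1", "t2"], connected)
reports only "t2"; passing connect=True would connect to it instead.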
> 
> 
> IPoIB HA daemon:
> ================
> The IPoIB HA daemon can be configured in /etc/infiniband/openib.conf file:
>  
> # Enable IPoIB High Availability daemon
> IPOIBHA_ENABLE=yes
> # PRIMARY_IPOIB_DEV=ib0
> # BACKUP_IPOIB_DEV=ib1
>  
> The default for PRIMARY_IPOIB_DEV is ib0 and for BACKUP_IPOIB_DEV is ib1.
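
[Editor's sketch, not part of the original release notes.] To make the
defaults above concrete, here is a minimal parser for KEY=VALUE lines of
the kind shown. The function name parse_openib_conf is hypothetical, and
the "no" default for IPOIBHA_ENABLE is an assumption; the notes only state
the defaults for the two device settings.

```python
DEFAULTS = {
    "IPOIBHA_ENABLE": "no",      # assumption: this default is not stated in the notes
    "PRIMARY_IPOIB_DEV": "ib0",  # default stated in the release notes
    "BACKUP_IPOIB_DEV": "ib1",   # default stated in the release notes
}


def parse_openib_conf(text):
    """Return the IPoIB HA settings found in text, falling back to DEFAULTS."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out settings
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key in settings:
            settings[key] = value.strip()
    return settings


SAMPLE = """\
# Enable IPoIB High Availability daemon
IPOIBHA_ENABLE=yes
# PRIMARY_IPOIB_DEV=ib0
# BACKUP_IPOIB_DEV=ib1
"""

if __name__ == "__main__":
    print(parse_openib_conf(SAMPLE))
```

Because the two device lines in the sample are commented out, the parser
falls back to ib0 and ib1, matching the defaults described above.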

Now that my svn update is complete, I'll review the 1.1rc2 spec and
install files and then send a separate email to the list about the
various things I had to change in the 1.0 release to meet packaging
guidelines relevant to the Fedora/Red Hat package review process that
still apply to 1.1.

-- 
Doug Ledford <dledford at redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband

