[openib-general] RE: [Openib-promoters] Need for ONE OpenIB Release process that all members can agree to and that follows OpenIB Bylaws

Weber, Bret Bret.Weber at engenio.com
Tue Feb 28 06:19:32 PST 2006


Bill, this is the first I have heard of the desire by some to drop
SRP and iSER.
 
These are our storage protocols and the defined standard methods to run
SCSI over IB to native IB storage (or any other RDMA-based wire).
 
Active testing of these drivers is under way, with feedback going back
to the developers.
 
Storage is a major part of the promise of IB, RDMA, and Unified wire.
 
We can't be looking at dropping it now.  It was late getting into the
gen 2 stack, but now that it is in, storage vendors are ramping up on
it.
 
Bret
 
 


________________________________

From: openib-promoters-bounces at openib.org
[mailto:openib-promoters-bounces at openib.org] On Behalf Of Bill Boas
Sent: Tuesday, February 28, 2006 12:49 AM
To: openib-promoters at openib.org; openib-general at openib.org
Cc: chetm at us.ibm.com
Subject: [Openib-promoters] Need for ONE OpenIB Release process that
all members can agree to and that follows OpenIB Bylaws



There appear to be two groups within OpenIB thinking about different
approaches to preparing the code for Release 1.0. One group is thinking
about downstreaming it to Red Hat and Novell; another group seems to be
thinking about separate releases from some IB suppliers but not others.

 

Let's remind ourselves of the purposes for which OpenIB was created and
of what all of the member companies have just re-affirmed in the Board
meeting last Friday (by approving the re-worked Bylaws). The principles
are, I believe, the following (if there are misstatements below, let's
discuss them openly):

 

1)       OpenIB develops open source code creating a software stack.
OpenIB (now the OpenFabrics Alliance) is a corporation with Bylaws that
all members should obey if they want the corporation to continue to
function. It will only survive if, in general, the interests of the
membership as a whole are served alongside each member's own
self-interest.

2)       OpenIB members must approve the content of that stack by a
two-thirds vote, through the Proposal process described in Section 12 of
the Bylaws. It is not up to a single member or group of members to
decide on their own what is or is not in the OpenIB stack. This is
deliberate, to prevent one or more members from gaining a competitive
advantage over other members through the OpenIB stack.

3)       OpenIB contributes its kernel code to kernel.org.

4)       OpenIB code is distributed to end customers (Wall St., labs,
etc.) and to mid-tier customers of OpenIB (Oracle, IBM, Sun, Dell,
LNXI, etc.) via Linux distributions such as Red Hat and Novell.

5)       End customers told the IB companies in February 2004 and in
December 2005 at Credit Suisse (HSIR meeting) that they wanted ONE
OpenIB stack that runs on every IB vendor's hardware, interoperates
with all other IB vendors' hardware and software, is used by all
mid-tier suppliers, and comes with their Linux distribution.

 

I realize that so far in OpenIB's evolution we have not worked out how
to support end customers while following these principles for the
release process. But that, I suggest, is not a valid reason for breaking
these principles. We should be able to deal with "Release" as one
process and "Support" as another; of course there will be linkage
between them, but they are not the same process. The way to do one is
not necessarily the way to do the other.

 

This email is an appeal to the two groups to work together, not
separately, and to work on solving these issues for the membership as a
whole, not just for their own companies or a select group. Please bring
to the Board a proposal that serves all the membership.

 

Here's what one group seems to be thinking (edited to remove "I"):

 

"Here is a first cut at the set of components (protocols, drivers,
userspace bits) that we think we should be supporting in 1.0.  Please
look over it and let us know if we are missing anything.

 

HCA support (both kernel driver and userspace verbs components):

 

      * ehca

      * ipath

      * mthca

 

IB protocols:

 

      * IPoIB

      * RC

      * SDP

      * SRQ

      * UC

      * UD

 

Userland software:

 

      * libibverbs

      * libsdp

      * opensm

 

As far as we can tell, most of the rest of OpenIB userland (libibcm,
libibat, libibmad, etc) is logically part of OpenSM, can be treated as
such (I think Doug is already doing this with his Red Hat spec files)
and is unlikely to be used by other applications.  Am I way off?
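(For context: libibverbs is the one library in that userland list that
applications program against directly, while the OpenSM-related pieces
are management plumbing. The fragment below is only an illustrative
sketch, assuming a libibverbs 1.0-style API, of what that
application-facing interface looks like - it enumerates the local HCAs
and prints a few device attributes; error handling is abbreviated.)

/* Illustrative sketch only: list local HCAs through libibverbs. */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num, i;
    struct ibv_device **list = ibv_get_device_list(&num);

    if (!list) {
        perror("ibv_get_device_list");
        return 1;
    }

    for (i = 0; i < num; ++i) {
        struct ibv_context *ctx = ibv_open_device(list[i]);
        struct ibv_device_attr attr;

        if (!ctx)
            continue;
        if (!ibv_query_device(ctx, &attr))   /* returns 0 on success */
            printf("%s: %d port(s), max_qp %d\n",
                   ibv_get_device_name(list[i]),
                   attr.phys_port_cnt, attr.max_qp);
        ibv_close_device(ctx);
    }

    ibv_free_device_list(list);
    return 0;
}

(Built with something like "gcc -o list_hcas list_hcas.c -libverbs";
the point is only that this library, unlike the OpenSM-related ones, is
the surface of the release that applications see.)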

 

Components that we don't know what to do about, and will likely want to
drop unless someone can vouch for them:

 

      * iSER

      * SRP

      * uDAPL"

 

Here's what the other group suggested:

 

"Openib Commercial Grade Release 1.0 release criteria


1)       CPU Architectures: 

a)       x86_32 (Xeon) 

b)       x86_64 (Nocona, Opteron)

c)       ia64

d)       PPC64 (Power5, Power6) - Mellanox does not support these
systems 

2)       Linux distributors and kernels 

a)       RH: EL4 AS Update 3; Fedora Core 4 latest update, and maybe FC5

b)       SuSE: SuSE 10 latest update (open item: SLES10 beta)

c)       kernel.org: the latest kernel available when rc1 is generated;
for 1.0 it will probably be 2.6.16 (possibly 2.6.17).

3)       Packaging and installation

a)       The openib release will be packaged in one tarball covering
both kernel and user-level components.

b)       One install script will support full installation. The install
will support typical and custom component sets.

I will send a separate document with the install definition to be
reviewed and agreed on by all.

4)       HCA and Switch Support:

a)       HCAs: InfiniHost, InfiniHost III Ex (both modes: with memory
and MemFree), InfiniHost III Lx 

b)       Switches: Need to support all vendors' production switches -
each vendor should send the list. 

5)       Switch Management Interoperability testing 

a)       Follow the CIWG-OpenIB HCA-OEM Switch Interop Test Plan

6)       Feature set per ULP: 

a)       Will be defined later with each ULP maintainer. 

7)       Minimum cluster size to be tested 

a)       Need a cluster of at least 128 nodes; bigger is better.

8)       Scalability requirements 

a)       SM: 

i)         Bring up a subnet with 1,000 nodes in 2 minutes

ii)       The SM should not be a bottleneck for any running application
(IPoIB)

b)       MPI:

i)         MPI runner - should be able to launch thousands of processes
(say 50,000) in a bounded amount of time.

ii)       Memory consumption - should be able to run many processes on
the same node (for now, 8 processes is the upper limit on the Opteron
machines) in a many-node (thousands of nodes) installation.

iii)      Sending HUGE messages in collectives - MPI should not fail due
to limited physical memory.
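
(To make item iii concrete: the pattern being described is every rank
taking part in a collective over a very large buffer. The fragment below
is only an illustrative sketch - the 256 MB buffer size and the choice
of MPI_Bcast are arbitrary examples, not requirements from this plan.)

/* Illustrative sketch only: broadcast a large buffer and time it. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, nprocs;
    size_t count = 256UL * 1024 * 1024;   /* 256 MB per rank, example value */
    char *buf;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    buf = malloc(count);
    if (!buf)
        MPI_Abort(MPI_COMM_WORLD, 1);     /* the failure mode item iii is about */
    if (rank == 0)
        memset(buf, 0x5a, count);

    t0 = MPI_Wtime();
    MPI_Bcast(buf, (int)count, MPI_CHAR, 0, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("bcast of %zu bytes across %d ranks took %.3f s\n",
               count, nprocs, t1 - t0);

    free(buf);
    MPI_Finalize();
    return 0;
}

(Launched with the usual mpirun-style runner; item i above is about how
long launching thousands of such processes takes.)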

9)       Performance requirements:
First we need to agree on the performance benchmark for each ULP:

a)       Basic verbs - performance tests in openib (send, RDMA
read/write latency & BW)

b)       IPoIB - netperf

c)       MPI - Pallas

d)       SDP - iperf

e)       SRP - iometer

f)     iSER - iometer

10)   Documentation requirements 

a)       Product brief

b)       Installation guide 

c)       User guide 

d)       Release notes 

e)       Troubleshooting

f)     Test Plan and Test Report 

11)   Storage target test requirements 

a)       Engenio target - Mellanox will be responsible for verification
b)       Cisco & SST - please add more target systems 

12)   Firmware and Hardware versions to be tested 

a)       Both DDR and SDR modes should be supported. 

b)       The FW burned should be the latest official release from Mellanox:

i)         InfiniHost III Lx: fw-25204-1.0.800

ii)       InfiniHost III Ex: fw-25218-5.1.400 and fw-25208-4.7.600 (both
will be released in 2 weeks)

iii)      InfiniHost: fw-23108-3.4.000

iv)      InfiniScale III - fw-47396-0.8.3

v)        InfiniScale - fw-43132-5.5.0

13)   Specifications compliance: 

a)       Verbs & management: InfiniBand Architecture Specification,
Volume 1, Release 1.2 

b)       IPoIB: www.ietf.org: draft-ietf-ipoib-architecture-04 and
draft-ietf-ipoib-ip-over-infiniband-07 

c)       SDP: Annex A4 of the InfiniBand Architecture Specification,
Volume 1, Release 1.2

d)       SRP: SCSI RDMA Protocol-2 (SRP-2), Doc. no. T10/1524-D
(www.t10.org/ftp/t10/drafts/srp2/srp2r00a.pdf).

e)       MPI: www.mpi-forum.org/docs/mpi-11-html/mpi-report.html

f)       iSER: www.ietf.org/internet-drafts/draft-hufferd-iser-ib-01.pdf

g)       RDS: SS, can you provide info?

The following two items are very important for the SW stack QA but are
not gating items for starting the release process.

1)       ISV test requirements - coverage for all ULPs

2)      Database test requirements

Cisco, SS and Voltaire should define those, since they already have test
beds for commercial applications and databases."

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20060228/59efec52/attachment.html>

