[openfabrics-ewg] OpenMPI package

Tziporet Koren tziporet at mellanox.co.il
Mon Mar 27 07:30:25 PST 2006


Hi Jeff,
See our answers inline, marked with TK.

Tziporet & Vlad

-----Original Message-----
From: openfabrics-ewg-bounces at openib.org
[mailto:openfabrics-ewg-bounces at openib.org] On Behalf Of Jeff Squyres
(jsquyres)
Sent: Thursday, March 23, 2006 8:59 PM
To: openfabrics-ewg at openib.org
Subject: RE: [openfabrics-ewg] OpenMPI package

Is there a repository we can work with?  If the logistics can be worked
out, this might be a *lot* easier than wholesale mailing of code around
(that's what version control systems are for).  I would strongly prefer
this model; it's just better software engineering.

TK: You need to decide on the repository. The best option is to put
the code somewhere in the OpenFabrics SVN (maybe under contrib). Do you
see any issues with this?

Later, Vlad will create a script that generates the IBED package from all
components.

> - I see a bunch of top-level *.sh scripts -- these will need to be
> expanded for Open MPI.  Who does that -- me or you?
> 
> <Aviram> Can you do it and send it over?

This will be tremendously simpler with a shared repository -- to avoid
conflicts, allow proper merging, etc.  (e.g., what if someone else is
working on the same code at the same time?)

TK: Regarding the repository, see the answer above.
What we require from Open MPI is spec files to create RPMs.
We need to know which options these spec files support (e.g.
compiler, configuration options, etc.).

If the code is not in a repository, then we need to get a tarball or
source RPMs.

> - I see SOURCES/mpi_osu...tgz.  That contains a bunch of scripts and
> the mvapich tarball.  I'm assuming that Open MPI needs to be bundled this
> way as well...?  Is this documented anywhere?  Is there a reason why a
> tarball contains another tarball?
> 
> - The mpi_osu...tgz file contains several MPI-independent utilities
> (PMB, prestal, etc.).  Should these be moved out of the OSU MPI tarball
> and into an MPI-agnostic tarball, and then compiled against each MPI
> that is installed?
> 

Ok.  I need these answers before I give you an Open MPI package to
integrate.

Specifically: you need more than just an RPM and/or SRPM.  It's not
entirely clear to me yet what exactly that is, but there does appear to
be a bunch of scripts and other things that are necessary.

> - What is the plan for having 2 MPIs in this distribution -- how will
> users/sysadmins choose between them?  I.e., are we going to allow both
> to be installed and make it a user-level decision which to use?  Or will
> the sysadmin only pick one to install?  Or ...?
> 
> <Aviram> We need to decide on that. All, how do you view it?

Someone else replied that we should let the sysadmin choose to install
one.

I think that this is a tremendously complex issue; cluster-installation
packages (OSCAR, Rocks, Warewulf, etc.) have spent a great deal of time
wrestling with this issue over the years.  This is probably worth some
time on the next teleconference (is there a next one scheduled?).

In the *best* case scenario, there will only be one MPI installed.  I am
not familiar with everyone's customer base, but I have seen clusters
with upwards of 30 MPI implementations installed (i.e., including a
large number of variations of the same implementation -- e.g., Open MPI
compiled against different compilers -- the issue is the same).  This
brings up all kinds of practical and logistical derivative issues.

TK: The current install enables several MPIs to be installed on the same
machine, and even the same MPI compiled with different compilers.
I see your point regarding cluster administration. I suggest that the
Open MPI and OSU MPI maintainers meet and decide what they want.
Then we can update the install script accordingly.
Do you need me to facilitate such a meeting?


> 2. Some more questions that I did not include in my mail last night:
> 
> - Is there a source code repository for IBED somewhere?  What is the
> model for developers to modify / test IBED?

TK: IBED RC2 is a tarball that Mellanox created. It is based on code
from the trunk of openib, RDS from the SilverStorm contrib, and OSU MPI
version 0.9.5 + Mellanox patches. All install scripts are under
contrib/mellanox.

Once we settle all the source-handling questions, we will have a script
that can build the IBED package from the SVN repositories of all
components.

> 
> - What version of MVAPICH is being used?  I see
> mvapich-0.9.5-mlx2.0.1.tar.gz -- does this mean it's Mellanox's v2.0.1
> of MVAPICH 0.9.5?  Are other vendors allowed to modify this?  (I ask
> because all of our MVAPICH's are slightly different -- fixed bugs
> specific to our customers, etc.)
> 
> <Aviram> We'll use 0.9.7. It will be incorporated in the next rc. Yes, we
> can fix and modify it.

What will be the model for vendors other than Mellanox to collaborate
and contribute?

TK: We need to work on this with Pasha (Pavel Shamis). My suggestion is to
have a patches directory under MPI (since Dr. Panda is the maintainer) and
have the install script apply these patches until OSU checks them in.
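
A minimal sketch of how an install script might apply such a patches
directory, for discussion only. The function names, the *.patch naming
convention and the sort-by-file-name ordering are assumptions, not part
of the current IBED scripts; the only external tool used is the standard
`patch` command with `-p1 -i`.

```python
import subprocess
from pathlib import Path


def ordered_patches(patch_dir):
    """Return the *.patch files in patch_dir, sorted by file name."""
    return sorted(Path(patch_dir).glob("*.patch"))


def apply_patches(source_dir, patch_dir, run=subprocess.run):
    """Apply each patch to source_dir with `patch -p1`.

    With the default runner, check=True raises on the first patch that
    fails, so later patches are not applied on top of a broken tree.
    Returns the names of the patches that were applied, in order.
    """
    applied = []
    for patch_file in ordered_patches(patch_dir):
        run(
            ["patch", "-p1", "-i", str(patch_file.resolve())],
            cwd=str(source_dir),
            check=True,
        )
        applied.append(patch_file.name)
    return applied
```

Numbering the patch files (01-..., 02-...) would make the order explicit,
and a patch can simply be deleted from the directory once OSU has checked
it in.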

> - There appear to be multiple levels of indirection in the MVAPICH build
> scripts -- what directory --prefix is it being installed to?  (this is
> going to be influenced by the answer to the "2 MPI" question, above)
> 
> <Aviram> Will get back to you on it.

Ok.  I need this answer before I can provide an Open MPI package for
you.

TK: Each MPI has a prefix that depends on the MPI vendor and compiler.
Example: OSU MPI compiled with gcc will be installed at:
<package prefix>/mpi/gcc/mvapich-0.9.5-mlx2.0.1/
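
As a rough illustration of this layout (a sketch only: the function name
and the example package prefix are hypothetical and are not taken from
the IBED install scripts):

```python
import posixpath


def mpi_install_prefix(package_prefix, compiler, mpi_name):
    """Return <package prefix>/mpi/<compiler>/<mpi name>/."""
    return posixpath.join(package_prefix, "mpi", compiler, mpi_name) + "/"


# "/usr/local/ibed" is a made-up package prefix used only for illustration.
print(mpi_install_prefix("/usr/local/ibed", "gcc", "mvapich-0.9.5-mlx2.0.1"))
```

Because the compiler and the MPI name are both path components, the same
MPI built with two compilers, or two different MPIs built with the same
compiler, end up in separate directories.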
 
> 4. Gleb sent me a proposed spec file for Open MPI -- we'll iterate about
> this off-list.
> 
> <Aviram> Who will send us the OpenMPI version to be integrated?

Me.  Gleb and I have already iterated a bit; we will definitely have a
new specfile for IBED.

I'm now a bit confused -- Aviram said in a later mail:

> OK. We'll integrate the current one on Sunday, unless we get a new one
> from Jeff till the end of the week. 

What, exactly, are you going to integrate?  Having a single spec file is
not enough.  Are you going to do all the other script and README work?

I thought that I understood what you needed (integration with the *sh
scripts, some kind of mega-tarball with Open MPI and some other
to-be-defined "stuff", etc.), but this statement seems to imply that all
you wanted was a spec file.  

What exactly do you need?

TK: Vlad is waiting for a package from you. If you decide to check the
code in to SVN, that's OK with us; if not, please send us a tarball.
You need to create spec files that know how to get the prefix, the path
to the include directory, and the compiler.
I suggest you work directly with Vlad, Pasha and Gleb on the full install
definition.
Once it's defined, please send me the definition so I can add it to the
install document.
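
One possible way for the install script to hand these three values to a
spec file is through rpmbuild `--define` macros. The sketch below only
builds the command line. `_prefix` is the standard RPM prefix macro; the
macro names `ib_include_dir` and `mpi_compiler`, the function name and
the example values are assumptions to be replaced by whatever the install
definition settles on.

```python
def rpmbuild_command(spec_file, prefix, include_dir, compiler):
    """Build an rpmbuild argv that passes install settings as macros."""
    macros = {
        "_prefix": prefix,              # standard RPM install-prefix macro
        "ib_include_dir": include_dir,  # hypothetical macro name
        "mpi_compiler": compiler,       # hypothetical macro name
    }
    cmd = ["rpmbuild", "-ba", spec_file]
    for name, value in macros.items():
        cmd += ["--define", f"{name} {value}"]
    return cmd


# Example values are invented for illustration.
print(rpmbuild_command("openmpi.spec", "/usr/local/ibed/mpi/gcc/openmpi",
                       "/usr/local/ibed/include", "gcc"))
```

Inside the spec file the values would then be read as %{_prefix},
%{ib_include_dir} and %{mpi_compiler}.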


-- 
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems
 
_______________________________________________
openfabrics-ewg mailing list
openfabrics-ewg at openib.org
http://openib.org/mailman/listinfo/openfabrics-ewg


