[ofa-general] Re: [ewg] RFC: Do we wish to take MPI out of OFED?

Gus Correa gus at ldeo.columbia.edu
Tue Jun 9 09:01:00 PDT 2009


Hi OFA list

I never post to this list, but I read it.
I am just reporting how we use MPI here,
in case this matters.
I am not trying to inflame any tempers.

I administer (and use) a cluster with InfiniBand and GigE networks.
We use OFED, Open MPI and MVAPICH2 (and even MPICH2 on our GigE network).

We don't use the MPI versions that come bundled with OFED,
not even for tests.
We don't use the other (TCP/IP-based) MPIs that come with Linux
distributions or compiler packages, either.
The reasons are the same in both cases.

We prefer to stay up to date with the latest stable Open MPI and MVAPICH2
releases (and MPICH2 also).
Moreover, due to compiler incompatibilities and specific application 
requirements (some only compile and run with certain compilers, or run 
faster with certain compilers), we tend to build all MPIs
with a few compilers and compiler combinations.
We install these builds in specific (non-standard) directories
on our system, and manage the user environment with the
environment modules package.
That is what is most convenient and rational for us.
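
A rough sketch of what such a build matrix can look like, for
illustration only. The version numbers, the compiler sets, the
/opt/mpi root, and the module naming are all made-up placeholders,
not the actual setup described here. The only thing assumed about
the MPI packages is that they build with an autoconf-style
./configure that accepts --prefix and the CC/CXX/FC variables.

```python
from itertools import product

# Placeholder MPI releases (hypothetical, chosen only for the example).
MPIS = {"openmpi": "1.3.2", "mvapich2": "1.2"}

# Placeholder compiler sets, including one mixed "combination"
# (GNU C/C++ with a different Fortran compiler).
COMPILERS = {
    "gnu": {"CC": "gcc", "CXX": "g++", "FC": "gfortran"},
    "intel": {"CC": "icc", "CXX": "icpc", "FC": "ifort"},
    "gnu-ifort": {"CC": "gcc", "CXX": "g++", "FC": "ifort"},
}


def build_matrix(mpis, compilers, root="/opt/mpi"):
    """Return one build plan per (MPI release, compiler set) pair.

    Each plan has its own install prefix under `root`, so builds never
    overwrite each other, and a module name a user could load.
    """
    plans = []
    pairs = product(sorted(mpis.items()), sorted(compilers.items()))
    for (mpi, version), (tag, env) in pairs:
        prefix = f"{root}/{mpi}/{version}/{tag}"
        settings = " ".join(f"{key}={value}" for key, value in sorted(env.items()))
        plans.append({
            "module": f"{mpi}/{version}-{tag}",
            "prefix": prefix,
            "configure": f"./configure --prefix={prefix} {settings}",
        })
    return plans


if __name__ == "__main__":
    for plan in build_matrix(MPIS, COMPILERS):
        print(plan["module"], "->", plan["configure"])
```

Each prefix would then get a matching modulefile that prepends its
bin and lib directories to PATH and LD_LIBRARY_PATH, so a user picks
one MPI/compiler pair at a time with "module load".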

Correct me if I am wrong, please,
but my perception is that much larger Linux clusters
(e.g., TACC Ranger) have a similar setup.

My $0.02.

Thank you,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------

Jeff Squyres wrote:
> On Jun 8, 2009, at 6:59 AM, Todd Rimmer wrote:
> 
>> I agree with DK from OSU.  There are clear advantages to having MPI 
>> included with OFED.  Not only will it make testing of a complete 
>> solution easier by both OFED and MPI suppliers,
>>
> 
> Can you specify how, specifically?
> 
> Remember that all that Open MPI and MVAPICH do is provide SRPMs.  There 
> is no co-mingling of development / source trees, for example.  You seem 
> to be blurring the distinction between co-development of MPI+OpenFabrics 
> and shipping OFED.  Developing the two together is a Good Thing -- and 
> that happens.  But that is unrelated to shipping the MPIs in OFED.
> 
> As has been specified multiple times on this thread, using MPI to test 
> verbs is a Good Thing and it can easily be maintained without 
> distributing MPI in OFED.
> 
>> but it will also improve ease of use for end users.
>>
> 
> Can you specify how, specifically?
> 
> Recall that:
> 
> - Open MPI users get a stripped-down version with several important 
> features disabled
> - At least one user has chimed in that they install MPI separately from 
> OFED for a variety of reasons (I have seen this at customer sites as well)
> 
>> As DK points out there are continual improvements in MPIs which may 
>> depend on bug fixes and/or new features in newer versions of OFED. 
>> Identifying a known good combination will be important to most end 
>> users, etc.
>>
> 
> Easy to do in documentation and/or in the technology of the MPI 
> implementations themselves.  The verbs API should allow this kind of 
> run-time checking as a matter of course (and it seems to allow it well 
> enough).
> 
> Additionally -- and your later comments seemed to support it -- possibly 
> the most important combination that needs to work is that of <latest 
> OFED> + <latest MPI>, which will continue to work because OFED would be 
> insane to remove MPI from its testing/QA/release process.
> 