[openib-general] APM support in openib stack

Tang, Changqing changquing.tang at hp.com
Thu Jan 4 07:49:08 PST 2007


We are currently happy using the Verbs API alone to wire up the IB
connections, without libibcm.so and librdmacm.so. The drawback of this
method is that it requires an all-to-all exchange of QP numbers among all
processes, which is fine for a static MPI world.
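
To illustrate the all-to-all exchange of QP numbers mentioned above, here is
a minimal sketch that only simulates the data movement in plain Python. It
makes no InfiniBand or MPI calls; the names alltoall and make_local_qpns and
the QP numbering scheme are made up for this example and are not part of any
library.

```python
# Illustrative simulation only: no InfiniBand or MPI calls are made.
# local_qpns[i][j] stands for the (made-up) number of the QP that rank i
# created for talking to rank j.

def alltoall(send):
    """Rank src delivers send[src][dst] to rank dst.

    Returns recv, where recv[dst][src] == send[src][dst].
    """
    n = len(send)
    return [[send[src][dst] for src in range(n)] for dst in range(n)]


def make_local_qpns(nranks, base=0x100):
    # Hypothetical numbering so every (rank, peer) pair gets a distinct QPN.
    return [[base + rank * nranks + peer for peer in range(nranks)]
            for rank in range(nranks)]


nranks = 4
local_qpns = make_local_qpns(nranks)
remote_qpns = alltoall(local_qpns)

# After the exchange, rank r knows, for each peer p, the QPN that p
# created for r. That is the value r needs when connecting its own QP.
for r in range(nranks):
    for p in range(nranks):
        assert remote_qpns[r][p] == local_qpns[p][r]
```

Every rank has to take part in this exchange, which is why it fits a fixed
set of processes better than groups that join later.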

When we come to dynamic processes, there are two groups of MPI processes.
Within each group the IB connections are already established, and we want
to establish IB connections between the two groups, where the size of each
group is dynamic. We can use the current method, but it requires
several rounds of message exchange (to simulate an all-to-all), so we hope
to have a connect/accept style method to establish the IB connections.

If we require the system to have IPoIB on each port, and there are two
cards (two ports each) plus an Ethernet interface, then 5 IP addresses
must be configured on a node.

We want to have as few requirements as possible, and there is also the
performance consideration (I asked you before which method is faster for
setting up an IB connection).
So what I am thinking of is:
	1. for a static MPI job, don't use libibcm.so and librdmacm.so
	2. for a dynamic MPI job, additionally use libibcm.so only; no IPoIB
	   is required.

If we come to iWARP, it is another story.


--CQ



> -----Original Message-----
> From: Or Gerlitz [mailto:ogerlitz at voltaire.com] 
> Sent: Thursday, January 04, 2007 8:30 AM
> To: Tang, Changqing
> Cc: Sean Hefty; openib-general at openib.org
> Subject: Re: [openib-general] APM support in openib stack
>
> Tang, Changqing wrote:
> > Sorry, I find the function  'ib_sa_path_rec_get()'  in kernel code. 
> > Then here is my question:
> > 
> > Is there any way (instruction) to fill in struct 'ib_sa_path_rec' 
> > inside struct 'ib_cm_req_param' without using librdmacm.so ?
> 
> Hi CQ,
> 
> I understand that you are considering an approach which 
> does not involve librdmacm, so you would probably like to
> 
> 	+ use IB PORT GIDs at your initial mpi init exchange
> 	+ issue IB SA Path query via libibsa (which does not exist)
> 	+ establish IB RC connection (listen/connect/accept) via libibcm
> 
> Please note that such an approach is possible even without 
> libibsa, similarly to what IB MPIs do today: use IB PORT LIDs 
> [note you would need both GIDs & LIDs to have the CM working 
> fine] at your ranks' pre-exchange, and hardcode the other IB 
> PATH params such as MTU, PKEY and SL that you later set into 
> the IB RC QP.
> 
> However, a no rdma cm approach means you need to apply hacks 
> to "guess" the correct pkey, mtu and sl, and there are some 
> more limitations that you would eventually face when coming 
> to advanced IB deployment environments.
> 
> All you have to do to use the rdma cm is to require a functional 
> IPoIB NIC on each of the active IB PORTs, which is a trivial 
> requirement from the users in the Ethernet world, so why not 
> apply it here as well?
> 
> On the other hand moving to use the IB CM instead of 
> emulating it via TCP is some progress...
> 
> Or.
> 
> 



