[openib-general] APM support in openib stack

Thu Jan 4 10:37:19 PST 2007

> 
> I usually think of IB connections in terms of 3 steps:
> 
> 1. Identify the remote node.
> How do you identify the remote node, and how is that 
> information obtained?

We can use either the IP address or the LID of the port. Either way, there
must be a way to transfer that information from the passive side to the
active side (internally you translate the IP to a GID/LID, right?), or the
two sides simply exchange it.

So either you give the IP on the command line (you transfer it manually), or
you use another channel to transfer it. This is what MPI does: we use the
Ethernet network for it. For an MPI job with hundreds of nodes, we must use a
different network (even if it is IPoIB) to transfer such ID info in order
to wire up the IB connections.
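
A minimal sketch of the side-channel exchange described above, assuming each
process already has a connected stream socket (Ethernet or IPoIB) to its peer.
The wire layout (LID, QP number, starting PSN) and the helper names are
hypothetical and only for illustration; real applications choose their own
format.

```python
import socket
import struct

# Hypothetical wire format: 16-bit LID, 32-bit QP number, 32-bit starting PSN,
# all in network byte order. This layout is an assumption, not a standard.
ENDPOINT_FMT = "!HII"
ENDPOINT_SIZE = struct.calcsize(ENDPOINT_FMT)


def pack_endpoint(lid, qpn, psn):
    """Serialize the local IB endpoint info for the side channel."""
    return struct.pack(ENDPOINT_FMT, lid, qpn, psn)


def unpack_endpoint(data):
    """Return (lid, qpn, psn) decoded from the peer's message."""
    return struct.unpack(ENDPOINT_FMT, data)


def recv_exact(sock, n):
    """Read exactly n bytes; a stream socket may return short reads."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before sending full endpoint")
        buf += chunk
    return buf


def exchange(sock, local):
    """Send our (lid, qpn, psn) and return the peer's (lid, qpn, psn)."""
    sock.sendall(pack_endpoint(*local))
    return unpack_endpoint(recv_exact(sock, ENDPOINT_SIZE))
```

Both sides would call exchange() on their end of the socket and then use the
returned values to bring up the IB connection.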

> 
> 2. Obtain a path record between the local and remote node.
> Today, there are two libraries capable of providing this, 
> libibmad and librdmacm.  The userspace MAD library gives 
> greater control, and would likely do so even if a libibsa 
> were created.

We assume there must be at least one path between every process pair.
For APM, it would be nice to query multiple paths.
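
To illustrate why a multi-path query would help APM, here is a rough sketch
of picking a primary and an alternate path from a list of results. The
PathRecord type and its two fields are hypothetical, trimmed-down stand-ins
(a real path record carries many more fields), and the selection policy
(first path is primary, first different one is alternate) is only an
assumption.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class PathRecord(NamedTuple):
    # Hypothetical, simplified record used only for this sketch.
    slid: int  # source LID
    dlid: int  # destination LID


def pick_primary_and_alternate(
    paths: Sequence[PathRecord],
) -> Tuple[PathRecord, Optional[PathRecord]]:
    """Return (primary, alternate); alternate is None if only one distinct path exists."""
    if not paths:
        raise ValueError("no path between the process pair")
    primary = paths[0]
    for candidate in paths[1:]:
        # Assumed policy: any path that differs from the primary qualifies.
        if candidate != primary:
            return primary, candidate
    return primary, None
```

With a single path in the result, the alternate comes back as None, and
there is nothing for APM to migrate to.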

> 
> 3. Establish a connection.
> 
> IMO, out of band connections should be prohibited by 

What do you mean by 'out of band connections'?

We need a way to pass the IP or LID from one process to another, so a third
channel is always needed (if you type the server's IP on the client's
command line, you can think of that as using a manual channel to transfer
the IP).

--CQ

> the IB stack, but this would likely break a lot of existing 
> code.  The IB CM is the only agent capable of detecting stale 
> and duplicate connections between nodes.  Without it, 
> applications are more susceptible to data corruption.
> 
> - Sean
>