[openib-general] Using GM + OpenIB in the same process at the same time.

Roland Dreier rdreier at cisco.com
Wed Jan 17 13:20:52 PST 2007


[adding openib-general CC]

 > I have a question about using GM + OpenIB at the same time; it seems
 > to be causing bad things to happen (process goes into state D) :-).
 > Here is the issue:
 > In Open MPI we allow striping of an MPI message across multiple
 > interconnects at once. In this case I am using GM and OpenIB. This is
 > using an RDMA pipeline protocol which attempts to overlap
 > registration and communication (RDMA Write). In the protocol the
 > target registers a chunk of the message and sends an RDMA Write
 > request to the origin, the origin then registers the corresponding
 > chunk of memory and initiates an RDMA Write.  Upon completion of the
 > RDMA Write an RDMA FIN message is sent from the origin to the target.
 > The target is allowed to have 4 RDMA Write requests outstanding at
 > any time.
 > As an example, let's say that the user buffer extends from address 3
 > through 12200. The target begins by registering, let's say, addresses
 > 3 - 8000 with OpenIB; under the covers the addresses are page aligned,
 > so we actually register from 0 through 8191. An RDMA Write request is
 > sent to the origin; note that the origin will only RDMA Write into
 > addresses 3 - 8000.
 > The target then begins registering addresses 8001 through 12200 with
 > GM; again, under the covers the addresses are page aligned, so we
 > actually register from 4096 through 12287 and send an RDMA Write
 > request to the origin. Again, note that the origin will only RDMA
 > Write into addresses 8001 through 12200.
 > 
 > The problem is that when this occurs the process goes into D state
 > (uninterruptible sleep). After this occurs I am still able to use GM
 > and OpenIB individually and can even attempt to use them together
 > (with the result of the process again going into state D).
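
To make the page-alignment arithmetic in the quoted example concrete,
here is a small sketch. It assumes 4096-byte pages (which is what the
0 - 8191 and 4096 - 12287 figures imply); the helper names page_span and
overlap are made up for illustration and are not part of Open MPI, GM,
or OpenIB.

```python
PAGE_SIZE = 4096  # assumption: matches the 0-8191 and 4096-12287 figures


def page_span(first, last, page_size=PAGE_SIZE):
    """Return the inclusive page-aligned range covering bytes first..last."""
    start = (first // page_size) * page_size
    end = (last // page_size + 1) * page_size - 1
    return start, end


def overlap(a, b):
    """Return the inclusive overlap of two inclusive ranges, or None."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


openib = page_span(3, 8000)     # chunk registered with OpenIB
gm = page_span(8001, 12200)     # chunk registered with GM
print("OpenIB registers", openib)
print("GM registers    ", gm)
print("both register   ", overlap(openib, gm))
```

With these numbers the page covering 4096 - 8191 ends up registered by
both stacks at once. Whether that double registration is what actually
triggers the hang is not established here; it is just the one place where
the two registrations touch the same memory.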

Finding out where the process is sleeping would probably be useful.
You can do "cat /proc/<pid>/wchan" to get a little info.

Even better would be to do "echo t > /proc/sysrq-trigger" and send the
complete kernel log output that it produces (and also include the
PID that is stuck in uninterruptible sleep).
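
If it helps, the wchan lookup can be scripted for every process that is
currently in state D. This is only a rough sketch: it reads the state
field from /proc/<pid>/stat and then /proc/<pid>/wchan, and the function
names parse_state and find_d_state are made up for illustration. It does
not replace the sysrq task dump.

```python
import os


def parse_state(stat_line):
    """Extract (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    lparen = stat_line.index("(")
    rparen = stat_line.rindex(")")
    pid = int(stat_line[:lparen].strip())
    comm = stat_line[lparen + 1:rparen]
    state = stat_line[rparen + 1:].split()[0]
    return pid, comm, state


def find_d_state(proc_root="/proc"):
    """Return [(pid, comm, wchan)] for processes in uninterruptible sleep."""
    results = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return results  # no procfs here (e.g. not Linux)
    for entry in entries:
        if not entry.isdigit():
            continue
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "stat")) as f:
                pid, comm, state = parse_state(f.read())
            if state != "D":
                continue
            with open(os.path.join(base, "wchan")) as f:
                wchan = f.read().strip()
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry was unreadable; skip it
        results.append((pid, comm, wchan))
    return results


for pid, comm, wchan in find_d_state():
    print(pid, comm, wchan)
```

Reading wchan for other users' processes may need root, and the symbol
name it reports is only the function the task is sleeping in, not a full
stack trace.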

However, I think it will probably be up to Myricom to debug this in the
end -- my ability to figure out what's happening is very limited
without the GM sources, and I'm not that interested in debugging
someone else's proprietary software anyway.

 - R.



