[openib-general] Using GM + OpenIB in the same process at the same time.
Roland Dreier
rdreier at cisco.com
Wed Jan 17 13:20:52 PST 2007
[adding openib-general CC]
> I have a question about using GM + OpenIB at the same time, it seems
> to be causing bad things to happen (process goes into state D) :-).
> Here is the issue:
> In Open MPI we allow striping of an MPI message across multiple
> interconnects at once. In this case I am using GM and OpenIB. This is
> using an RDMA pipeline protocol which attempts to overlap
> registration and communication (RDMA Write). In the protocol the
> target registers a chunk of the message and sends an RDMA Write
> request to the origin, the origin then registers the corresponding
> chunk of memory and initiates an RDMA Write. Upon completion of the
> RDMA Write an RDMA FIN message is sent from the origin to the target.
> The target is allowed to have 4 RDMA Write requests outstanding at
> any time.
> As an example, let's say that the user buffer extends from address 3
> through 12200. The target begins by registering, let's say, address 3 -
> 8000 with OpenIB, under the covers the addresses are page aligned so
> we actually register from 0 through 8191. An RDMA Write request is
> sent to the origin, note that the origin will only RDMA Write into
> addresses 3 - 8000.
> The target then begins registering address 8001 through 12200 with
> GM, again under the covers the addresses are page aligned so we
> actually register from 4096 through 12287 and send an RDMA Write
> request to the origin. Again note that the origin will only RDMA
> Write into address 8001 through 12200.
>
> The problem is that when this occurs the process goes into D state
> (uninterruptible sleep). After this occurs I am still able to use GM
> and OpenIB individually and can even attempt to use them together
> (with the result of the process again going into state D).
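The page-aligned ranges quoted above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a 4096-byte page size (which matches the 0 through 8191 figure in the example); the function names are hypothetical and are not part of Open MPI, GM, or OpenIB. It only shows that the two registrations share one page; it does not establish that the shared page is what causes the hang.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, consistent with the quoted example


def page_aligned_range(first, last, page_size=PAGE_SIZE):
    """Return the inclusive (start, end) byte range of whole pages covering first..last."""
    start = (first // page_size) * page_size
    end = (last // page_size + 1) * page_size - 1
    return start, end


def overlap(a, b):
    """Return the inclusive overlap of two inclusive ranges, or None if disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


# Chunk registered with OpenIB and chunk registered with GM, as in the example.
openib = page_aligned_range(3, 8000)
gm = page_aligned_range(8001, 12200)

print("OpenIB registers:", openib)            # (0, 8191)
print("GM registers:    ", gm)                # (4096, 12287)
print("Registered twice:", overlap(openib, gm))  # (4096, 8191)
```

Under that assumption, the page spanning 4096 through 8191 ends up registered with both interconnects at once.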
Finding out where the process is sleeping would probably be useful.
You can do "cat /proc/<pid>/wchan" to get a little info.
Even better would be to do "echo t > /proc/sysrq-trigger" and send the
complete kernel log messages that this produces (and also include the
PID that is stuck in uninterruptible sleep).
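For collecting the first piece of information, something along these lines may help. It is only a sketch, assuming a Linux /proc filesystem and enough permission to read the target's entries; the helper names are made up for illustration. It reads the process state from /proc/<pid>/stat and the wait channel from /proc/<pid>/wchan.

```python
import sys
from pathlib import Path


def parse_state(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so the state is taken as the first field after the
    last ')'.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def describe(pid):
    """Return (state, wchan) for a PID; 'D' means uninterruptible sleep."""
    proc = Path("/proc") / str(pid)
    state = parse_state((proc / "stat").read_text())
    wchan = (proc / "wchan").read_text().strip()
    return state, wchan


if __name__ == "__main__":
    # Usage (hypothetical script name): python wchan.py <pid> [<pid> ...]
    for pid in sys.argv[1:]:
        state, wchan = describe(pid)
        print(f"pid {pid}: state={state} wchan={wchan}")
```

The wchan value is only a kernel symbol name (or 0 when nothing useful is available), so the full task dump from sysrq remains the more informative of the two.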
However I think it will probably be up to Myricom to debug this in the
end -- my ability to figure out what's happening is very limited
without the GM sources, and I'm not that interested in debugging
someone else's proprietary software anyway.
- R.