[ofa-general] Re: CMA can't establish connection with QoS on
Sean Hefty
sean.hefty at intel.com
Tue Jan 8 16:26:59 PST 2008
>I updated the bug with step-by-step instructions on how to burn
>the FW and reproduce the error.
>I compiled this "how-to" today, so everything there is up to date.
Thanks - I don't think that I was programming my FW correctly. I still have
problems running opensm with QoS enabled on one of my systems, but I can get it
to work on the other system.
Anyway, I was able to reproduce the problem, and I believe I understand part of
the problem. The send for the CM REQ MAD never completes. A completion never
shows up on the GSI's CQ with a wr_id that matches the send wr_id. (I don't see
a completion at all.) This results in a reference being held on the ib_cm id
that is never released, which causes the hang. (Destruction of the ib_cm id
hangs, which blocks the destruction of the rdma_cm_id, which blocks the close
from userspace.)
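
The hang chain described above can be modeled with a small user-space sketch. This is not the ib_cm code: every name here (model_cm_id, model_post_req, and so on) is hypothetical, and the real destroy path blocks rather than returning a flag. The sketch only assumes what is stated above: posting the REQ takes a reference that only a send completion with the matching wr_id releases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an ib_cm id; not the kernel's structure. */
struct model_cm_id {
    int refcount;            /* starts at 1: the owner's reference */
    uint64_t pending_wr_id;  /* wr_id of the outstanding send, 0 if none */
};

static void model_init(struct model_cm_id *id)
{
    id->refcount = 1;
    id->pending_wr_id = 0;
}

/* Posting the REQ MAD takes a reference that only its completion drops. */
static void model_post_req(struct model_cm_id *id, uint64_t wr_id)
{
    id->refcount++;
    id->pending_wr_id = wr_id;
}

/* Called for each completion polled from the CQ.
 * Returns true only if the completion matches the outstanding send. */
static bool model_send_completion(struct model_cm_id *id, uint64_t wr_id)
{
    if (id->pending_wr_id == 0 || id->pending_wr_id != wr_id)
        return false;
    assert(id->refcount > 1);
    id->pending_wr_id = 0;
    id->refcount--;
    return true;
}

/* Destroy drops the owner's reference and then waits for the count to
 * reach zero. The model reports whether that wait would block instead
 * of actually blocking. */
static bool model_destroy_would_block(const struct model_cm_id *id)
{
    return id->refcount - 1 > 0;
}
```

In this model, calling model_post_req() and never delivering the matching completion leaves model_destroy_would_block() returning true indefinitely, which corresponds to the destroy of the ib_cm id never returning.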
If the ib_cm is modified to use SL 0 for the CM MADs, but the connection still
uses SL 1, then ucmatose is able to connect and transfer data between the client
and server.
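
As a rough illustration of that experiment, the SL selection can be thought of as below. The enum and function names are hypothetical and do not correspond to anything in ib_cm; the sketch only assumes that the connection's SL comes from the path (1 in this test) and that CM MADs are forced to SL 0.

```c
#include <stdint.h>

/* Hypothetical traffic classes for the sketch. */
enum model_traffic {
    MODEL_CM_MAD,     /* CM management datagrams (REQ, REP, ...) */
    MODEL_CONN_DATA   /* traffic on the established connection */
};

/* Workaround sketch: CM MADs always go out on SL 0, while the
 * connection keeps the SL taken from the path (path_sl). */
static uint8_t model_pick_sl(enum model_traffic kind, uint8_t path_sl)
{
    return kind == MODEL_CM_MAD ? 0 : path_sl;
}
```

With path_sl set to 1, the CM exchange uses SL 0 and the data transfer still uses SL 1, matching the case where ucmatose succeeds.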
- Sean