<font size="2">Hi, <br>
</font><font size="2"><br>
I'm trying out APM (Automatic Path Migration) with OFED 1.2, using a Mellanox dual-port HCA (ib_mthca driver). When I migrate several RC QPs (software-triggered migration using ib_modify_qp), I've
noticed that sometimes one or two of the remote QPs never generate an
IB_EVENT_PATH_MIG or even an IB_EVENT_PATH_MIG_ERR; the event seems to
just get lost. I
looked through some of the ib_mthca patches in
<a href="http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git">git.kernel.org/?p=linux/kernel/git/roland/infiniband.git</a>, and
incorporated the mmiowb patch for ib_mthca commands
(<a href="http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commit;h=76d7cc0345a037e8eea426f8abc710abd22946dd">http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commit;h=76d7cc0345a037e8eea426f8abc710abd22946dd
</a>).
But I'm still seeing the same issue. I have a test case that repeats
software-triggered migrations and rearming in a loop; the problem
usually occurs in the first few cycles, though it does not happen often. If
anyone has any ideas on what might be wrong, or tips on where I can
look to debug this, that would be very much appreciated! </font><br>
<font size="2">
<br>
For example, this is the console output I see (printed by our RC QP event handler): <br>
On the local end, which initiates the software-triggered migration using ib_modify_qp: <br>
Event IB_EVENT_PATH_MIG occurred on QP#1043<br>
Event IB_EVENT_PATH_MIG occurred on QP#1040<br>
Event IB_EVENT_PATH_MIG occurred on QP#1033<br>
<br>
On the remote end (note that no event ever shows up for QP#1033): <br>
Event IB_EVENT_PATH_MIG occurred on QP#1040<br>
Event IB_EVENT_PATH_MIG occurred on QP#1043<br>
<br>
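To pin down which QP goes silent in a given cycle, bookkeeping along these lines could be added to the test loop. <br>
This is only a sketch: mig_tracker and its functions are hypothetical names, not part of OFED or ib_mthca, <br>
and locking against the event handler is left out. <br>

```c
/* Hypothetical test-loop bookkeeping; none of these names exist in OFED or ib_mthca. */
#define MAX_TRACKED_QPS 64

struct mig_tracker {
    unsigned int  qpn[MAX_TRACKED_QPS];   /* QP numbers under test */
    unsigned char seen[MAX_TRACKED_QPS];  /* 1 once a path-migration event arrived this cycle */
    unsigned int  count;                  /* number of registered QPs */
};

/* Start with an empty table. */
static void mig_tracker_init(struct mig_tracker *t)
{
    t->count = 0;
}

/* Register a QP number; returns 0 on success, -1 if the table is full. */
static int mig_tracker_add(struct mig_tracker *t, unsigned int qpn)
{
    if (t->count == MAX_TRACKED_QPS)
        return -1;
    t->qpn[t->count] = qpn;
    t->seen[t->count] = 0;
    t->count++;
    return 0;
}

/* Call before each migration cycle: forget which QPs have reported. */
static void mig_tracker_reset(struct mig_tracker *t)
{
    unsigned int i;

    for (i = 0; i != t->count; i++)
        t->seen[i] = 0;
}

/* Call from the QP event handler on IB_EVENT_PATH_MIG or IB_EVENT_PATH_MIG_ERR.
 * Returns 0 if the QP is tracked, -1 if it was never registered. */
static int mig_tracker_mark(struct mig_tracker *t, unsigned int qpn)
{
    unsigned int i;

    for (i = 0; i != t->count; i++) {
        if (t->qpn[i] == qpn) {
            t->seen[i] = 1;
            return 0;
        }
    }
    return -1;
}

/* After a timeout, copy the QP numbers that saw no event into out[]
 * (at most out_len entries) and return how many were written. */
static unsigned int mig_tracker_missing(const struct mig_tracker *t,
                                        unsigned int *out, unsigned int out_len)
{
    unsigned int i, n = 0;

    for (i = 0; i != t->count; i++) {
        if (n == out_len)
            break;
        if (!t->seen[i])
            out[n++] = t->qpn[i];
    }
    return n;
}
```

With the remote output above, marking 1040 and 1043 and then asking for the missing QPs after a timeout would report 1033. <br>
<br>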
Thanks so much for any pointers!<br>
Lan <br>
<br>
<br>
</font>