[ofa-general] Re: [RFC][PATCH] last wqe event handler patch

Shirley Ma xma at us.ibm.com
Wed Jun 25 12:57:08 PDT 2008


Roland Dreier <rdreier at cisco.com> wrote on 06/25/2008 11:17:22 AM:

>  > 1. QP100 last WQE reached event, QP100 context is added into flush_list,
>  > and then it is put into drain_list, and does post_send of a drain WR.
>  > 2. QP200 last WQE reached event, QP200 context is added into flush_list,
>  > but not drain_list since only one drain WR will be posted
>  > 3. QP300 ...., QP300 context is added into flush_list, but not drain_list
>  > 
>  > So QP100 is on drain_list, QP200, QP300 are on flush_list
>  > 
>  > In rcq poll_cq,
>  > 1. QP 100 drain WR cqe is polled, it will put QP100 into reap_list then
>  > call ipoib_cm_start_rx_drain(), post_send of QP200 drain WR, and QP200,
>  > QP300 are both moved from flush_list to drain_list
>  > 2. QP 200 drain WR cqe is polled, it will move both QP200 and QP300 from
>  > drain_list to reap_list
>  > 3. QP300 cqe comes, but QP300 context has been freed, ---> panic.
>  > 
>  > Does that make sense?
> 
> This is a really good explanation (exactly what I would hope to see in a
> changelog).  I'm not positive I understand though: is the issue that
> when the QP300 last WQE reached event is seen, we are not guaranteed
> that we have handled all CQEs for QP300 yet?
> 
>  - R.

Right. We see the QP300 last WQE reached event in the async event handler,
but the remaining CQEs for QP300 that have not been processed yet are still
in the receive completion queue. QP300 is destroyed in the context of
QP200's drain WR, not in the context of its own drain WR. So there is a
window in which QP300 has been destroyed but some QP300 CQEs might still be
in the completion queue, not yet processed.
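
The sequence can be replayed in a small model. This is only a Python sketch
of the list handling as described in this thread, not the IPoIB code: the
class RxDrainModel, its methods and replay_thread_scenario() are made-up
names, reaping is done immediately instead of from deferred work, and the
late QP300 receive CQE is injected by hand to represent the window above.

```python
from collections import deque


class RxDrainModel:
    """Toy model of flush_list / drain_list / reap_list (hypothetical names)."""

    def __init__(self):
        self.flush_list = []   # QPs that saw "last WQE reached"
        self.drain_list = []   # QPs covered by the outstanding drain WR
        self.reap_list = []    # QPs waiting to be destroyed
        self.cq = deque()      # receive CQ entries: (kind, qp)
        self.freed = set()     # QPs whose context has been freed
        self.stale = []        # CQEs polled after their QP was freed

    def start_rx_drain(self):
        # Only one drain WR may be outstanding at a time.
        if not self.flush_list or self.drain_list:
            return
        # Post the drain WR on the first flushed QP; its CQE lands in the CQ.
        self.cq.append(("drain", self.flush_list[0]))
        # Every flushed QP moves to drain_list behind that single WR.
        self.drain_list, self.flush_list = self.flush_list, []

    def last_wqe_reached(self, qp):
        # Async event handler: queue the QP and try to start a drain.
        self.flush_list.append(qp)
        self.start_rx_drain()

    def reap(self):
        # Assumption: reap runs right away; the model frees the contexts here.
        self.freed.update(self.reap_list)
        self.reap_list = []

    def poll_one(self):
        kind, qp = self.cq.popleft()
        if qp in self.freed:
            # In the real driver this is the access to freed memory.
            self.stale.append((kind, qp))
            return
        if kind == "drain":
            # One drain CQE releases *all* QPs on drain_list.
            self.reap_list.extend(self.drain_list)
            self.drain_list = []
            self.start_rx_drain()
            self.reap()


def replay_thread_scenario():
    m = RxDrainModel()
    for qp in (100, 200, 300):
        m.last_wqe_reached(qp)     # QP100 drains; QP200, QP300 wait on flush_list
    m.poll_one()                   # QP100 drain CQE: reap QP100, drain WR on QP200
    m.cq.append(("recv", 300))     # assumed: a QP300 CQE still sits in the CQ
    m.poll_one()                   # QP200 drain CQE: QP200 *and* QP300 are reaped
    m.poll_one()                   # QP300 CQE arrives for a freed context
    return m
```

After replay_thread_scenario(), the stale list holds the QP300 CQE, which
corresponds to step 3 of the poll_cq sequence quoted above.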

Thanks
Shirley


