[openib-general] Re: SRQ freezes up

Roland Dreier rolandd at cisco.com
Sun Oct 30 09:52:49 PST 2005


>>>>> "Ami" == Ami Parlmuter <amip at mellanox.co.il> writes:

    Ami> running ibv_srq_pingpong pops up two bugs in the SRQ.  1.  a
    Ami> failure to post RRs to the SRQ after polling completions sent
    Ami> to it (the verb ibv_post_srq_recv fails, returning -1) 2.  as a
    Ami> direct result of this, the other side gets a bad completion
    Ami> with a RETRY EXCEEDED error, and then the machine freezes up

Anything printed in the console from the kernel when this happens?

    Ami> the first bug has been there for quite some time,

Any reason you kept it a secret until now?

    Ami> the second only happens from REV 3890 (when the previous
    Ami> version I tested was 3382)


I wasn't able to duplicate the exact symptoms you see, but I fixed a
couple of bugs that your test showed for me: one in the uverbs kernel
module that can cause a kernel panic, and one in the srq_pingpong
example that would cause a CQ overrun.
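
To illustrate the CQ overrun in general terms: a CQ overruns when more
completions can be outstanding at once than the CQ has entries.  The
following is only a sketch of that arithmetic, not the actual change
made to srq_pingpong.  The helper names and the assumption that each QP
keeps at most one send outstanding are hypothetical, chosen for
illustration only.

```c
/* Hypothetical helpers for illustration; these are not libibverbs calls. */

/* Smallest CQ depth that cannot overrun when a single CQ collects
 * completions for rx_depth receives posted to an SRQ plus, by assumption,
 * at most one outstanding send on each of num_qp QPs. */
static int cq_depth_needed(int rx_depth, int num_qp)
{
    return rx_depth + num_qp;
}

/* Returns 1 if a CQ created with cq_depth entries could overrun under
 * the same assumption, 0 otherwise. */
static int cq_can_overrun(int cq_depth, int rx_depth, int num_qp)
{
    return cq_depth < cq_depth_needed(rx_depth, num_qp);
}
```

Under that assumption, a CQ sized for the receives alone is too small as
soon as sends complete on the same CQ, so the depth passed when creating
the CQ has to account for both.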

Do you still see problems with the latest svn code?

 - R.


