[openib-general] Re: [PATCH] libmthca: fix wqe post

Roland Dreier rolandd at cisco.com
Tue Sep 13 17:53:35 PDT 2005


    Viswanath> When I ran the cmpost program which I sent you, I
    Viswanath> started getting errors from the mthca library even for
    Viswanath> smaller number of connections (Earlier it was
    Viswanath> working).

Yeah, I found another problem with your cmpost program.  I think
you're setting the packet lifetime far too low.  You have:

	sa.packet_life_time          = 2;

This ends up having the CM set an ACK timeout of something like 32
microseconds, which is way too low.  If you poll the send CQ, you'll
probably see some "retries exceeded" errors.  Setting the
packet_life_time to something like 14 or 15 should work better.
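
For reference, here is a rough sketch of the arithmetic behind those numbers. IB encodes timeouts as an exponent n meaning 4.096 usec * 2^n. The helper names below are made up for illustration, and the "+ 1" (ACK timeout taken as roughly twice the packet lifetime, ignoring any CA ack delay) is an assumption that matches the ~32 usec figure above, not a statement of exactly what the CM code does:

```c
#include <stdint.h>

/*
 * IB timeout fields hold an exponent n; the actual timeout is
 * 4.096 usec * 2^n.  Work in nanoseconds to stay in integers.
 */
uint64_t ib_timeout_ns(unsigned int exponent)
{
	return UINT64_C(4096) << exponent;
}

/*
 * Hypothetical helper: assumes the CM derives the ACK timeout
 * exponent as packet_life_time + 1 (i.e. one round trip), with
 * no extra allowance for the CA's ack delay.
 */
uint64_t cm_ack_timeout_ns(unsigned int packet_life_time)
{
	return ib_timeout_ns(packet_life_time + 1);
}
```

Under that assumption, cm_ack_timeout_ns(2) is 32768 nsec (about 32 usec), while cm_ack_timeout_ns(14) is 134217728 nsec (about 134 msec) and cm_ack_timeout_ns(15) is about 268 msec.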

    Viswanath> Also it is now easier to create the panic when you kill
    Viswanath> the cmpost server program. The panic may be happening
    Viswanath> on an error path.

I still have never been able to reproduce this panic (and believe me,
I've killed the cmpost program many times).  Anyway, I'll take a look
at the traceback and see if anything jumps out at me.

 - R.


