Olivier, I am having similar issues with the same firmware. Can you give me some more details? Did you make the changes on the driver side or in the application? If on the driver side, can you point me in the right direction to make those changes?

<br><br>Thanks,<br>Todd<br><br><div><span class="gmail_quote">On 4/10/07, <b class="gmail_sendername">Olivier Cozette</b> &lt;<a href="mailto:olivier.cozette@seanodes.com">olivier.cozette@seanodes.com</a>&gt; wrote:</span>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">        Hi,<br><br>I had the same error with my driver, and after some investigation, I found

<br>that my SRQ depth and CQ depth were too small to handle the maximum number of<br>send/recv operations that my application can generate concurrently. Normally, in that<br>case the QP should move to the error state, but instead a catastrophic

<br>error occurs.<br><br>I increased the SRQ/CQ depth to match the maximum number of send/recv operations that my application<br>can generate concurrently (without reply/synchronization), and the bug no longer occurs.<br><br>So, you probably just need to increase your SRQ/CQ depth and post enough buffers to

<br>match the maximum number of send/recv operations that your driver can issue.<br><br>        Olivier<br><br>Note: I have an MT25204 rev a0 with firmware 1.2.0.<br><br>On Tuesday 20 March 2007 18:59, Eric Barton wrote:<br>> The following is console output immediately before a panic on a system

<br>> running lustre with OFED 1.1.  How can I find out what it means?<br>><br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0: Catastrophic error detected: internal error
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[00]: 001d79f4
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[01]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[02]: 00198538
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[03]: 00136038
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[04]: 00207730
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[05]: 001d79cc
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[06]: 0023cf24
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[07]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[08]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[09]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[0a]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[0b]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[0c]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[0d]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[0e]: 00000000
<br>> 2007-02-21 12:02:42 ib_mthca 0000:07:00.0:   buf[0f]: 00000000<br>><br>> ...shortly before it happens, the lustre/lnet OFED driver receives a number

<br>> of what I believe to be duplicate SEND completion events.  It seems quite<br>> sporadic, and doesn't appear to track hardware.<br>><br>> More info at <a href="https://bugzilla.lustre.org/show_bug.cgi?id=11381">

https://bugzilla.lustre.org/show_bug.cgi?id=11381</a><br>><br>>                 Cheers,<br>>                         Eric<br>><br>><br>> _______________________________________________<br>> general mailing list

<br>> <a href="mailto:general@lists.openfabrics.org">general@lists.openfabrics.org</a><br>> <a href="http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general">http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

</a><br>><br>> To unsubscribe, please visit<br>> <a href="http://openib.org/mailman/listinfo/openib-general">http://openib.org/mailman/listinfo/openib-general</a><br></blockquote></div><br>
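For reference, the sizing rule Olivier describes in the quoted message can be sketched as plain arithmetic. This is only an illustration under stated assumptions, not his actual change: the `queue_load` struct and both helper functions are hypothetical names, and the sketch assumes every work request is signaled and that a single CQ serves both sends and receives. In user-space libibverbs the results would go into the `cqe` argument of `ibv_create_cq()` and into `attr.max_wr` of `struct ibv_srq_init_attr`; a kernel driver would use the corresponding kernel verbs fields instead.

```c
#include <stdint.h>

/* Hypothetical worst-case load description; all names are illustrative. */
struct queue_load {
    uint32_t num_qps;          /* QPs that share the SRQ and the CQ */
    uint32_t max_sends_per_qp; /* sends that can be outstanding on one QP */
    uint32_t max_recvs_total;  /* receive buffers posted to the SRQ at once */
};

/* The SRQ must be able to hold every receive buffer posted at one time. */
static uint64_t required_srq_depth(const struct queue_load *load)
{
    return load->max_recvs_total;
}

/*
 * The CQ needs room for one completion per outstanding send plus one per
 * posted receive. Assumption: all work requests are signaled and one CQ
 * is shared by the send and receive sides. Computed in 64 bits so the
 * multiplication cannot overflow.
 */
static uint64_t required_cq_depth(const struct queue_load *load)
{
    return (uint64_t)load->num_qps * load->max_sends_per_qp
           + load->max_recvs_total;
}
```

For example, 4 QPs with up to 8 outstanding sends each and 64 posted receives would need an SRQ depth of at least 64 and a CQ depth of at least 96. The device limits reported by the adapter still cap what can actually be requested.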