[ewg] Crash in iw_cxgb3 (Chelsio) in OFED 1.4.1

pandit ib ranjit.pandit.ib at gmail.com
Thu Nov 5 10:21:16 PST 2009


Steve,

This problem happens on OFED 1.4.2 as well; support at chelsio.com had also
suggested upgrading to 1.4.2.
We are running our own application (not MPI-based).
Under reasonable load we get this crash.

Any ideas what might cause the NIC to become unresponsive?
In this case, it looks like the CQ is having problems?
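
For reference, here is a minimal userspace sketch of the bounded-poll pattern
in the quoted loop. It is not the driver code: fake_cqe, fake_cqe_is_valid and
wait_for_cqe are hypothetical names, and the poll limit is a parameter rather
than the driver's constant. It only illustrates that in the quoted snippet
BUG_ON(1) comes before the printk and the return -EIO, so the panic fires
first and the error return is never reached. The sketch takes the error
return instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a hardware CQ entry; not the cxgb3 layout. */
struct fake_cqe {
    int valid;             /* becomes 1 once the simulated device writes it */
    int polls_until_valid; /* polls before it turns valid; < 0 means never */
};

/* Simulates the validity check that CQ_VLD_ENTRY performs in the driver. */
static int fake_cqe_is_valid(struct fake_cqe *cqe)
{
    if (cqe->polls_until_valid == 0)
        cqe->valid = 1;
    else if (cqe->polls_until_valid > 0)
        cqe->polls_until_valid--;
    return cqe->valid;
}

/*
 * Poll until the entry is valid or the poll budget is exhausted.
 * Returns 0 on success and -EIO on timeout, with no BUG_ON in the path.
 */
static int wait_for_cqe(struct fake_cqe *cqe, long max_polls)
{
    long i = 0;

    assert(cqe != NULL);
    while (!fake_cqe_is_valid(cqe)) {
        /* The driver calls udelay(1) here; omitted in this userspace sketch. */
        if (i++ > max_polls) {
            fprintf(stderr, "stalled rnic (simulated)\n");
            return -EIO;
        }
    }
    return 0;
}
```

Calling wait_for_cqe() on an entry that never becomes valid returns -EIO once
the budget runs out, which is roughly what the driver loop would do if the
BUG_ON were not there.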

Ranjit




On Wed, Nov 4, 2009 at 8:28 PM, Steve Wise <swise at opengridcomputing.com> wrote:
> pandit ib wrote:
>>
>> We are seeing a kernel panic in iw_cxgb3 in OFED 1.4.1 (screenshot
>> attached).
>>
>> The problem seems to be happening at:
>>
>> ofa_kernel-1.4.1/drivers/infiniband/hw/cxgb3/cxio_hal.c:112
>>
>>
>>  109                 while (!CQ_VLD_ENTRY(rptr, cq->size_log2, cqe)) {
>>  110                         udelay(1);
>>  111                         if (i++ > 1000000) {
>>  112               -------->        BUG_ON(1);
>>  113                                 printk(KERN_ERR "%s: stalled rnic\n",
>>  114                                        rdev_p->dev_name);
>>  115                                 return -EIO;
>>  116                         }
>>  117                 }
>>
>>
>> Any ideas?
>> Is it a known issue and if so is there a fix or workaround?
>>
>>
>
> Hey Ranjit.
>
> What are you running to produce this problem?
>
> I don't recall any issues in this area...but you might try ofed-1.4.2.
> You should also report this to support at chelsio.com.
>
> Steve.
>
>


