[PATCH] Re: [openib-general] Re: IPoIB Failure CQ overrun

Michael S. Tsirkin mst at mellanox.co.il
Mon Dec 20 08:01:46 PST 2004


Hello!
Quoting r. Roland Dreier (roland at topspin.com) "Re: [PATCH] Re: [openib-general] Re: IPoIB Failure CQ overrun":
>     Michael> In investigating this issue I discovered what I believe is
>     Michael> a race condition in mthca:
> 
> Thanks, good catch.  I'll apply your patch.  In the future can you add
> a Signed-off-by: line to your patches?

Sorry, I forgot it. Here it is for the last patch:

Signed-off-by: Michael S. Tsirkin <mst at mellanox.co.il>

>     Michael> I also would like to suggest implementing CQ doorbell
>     Michael> coalescing in mthca, to reduce the number of CQ
>     Michael> doorbells.
> 
> Sounds like a good idea...
> 
>     Michael> Unfortunately this patch does not seem to solve the
>     Michael> overrun problem, so there may be another problem. That
>     Michael> will need more looking into.
> 
> OK.  At this point do you think it's a FW problem or a driver problem?
> 
> Thanks,
>   Roland

The FW handling of the CQ consumer index doorbell is reasonably well tested
with VAPI (and with directed tests). It is also relatively straightforward
code, so I would suspect a driver problem first.

Unfortunately, once the overrun happens I cannot bring the interface
down or unload the IP-over-IB module (both commands hang), so I have to
reboot. This is slowing me down considerably.
Do you have any idea why that is, and how to fix it?

MST


