[Openib-windows] A problem in ib_close_al
Leonid Keller
leonid at mellanox.co.il
Tue Jul 25 01:50:34 PDT 2006
> -----Original Message-----
> From: ftillier.sst at gmail.com [mailto:ftillier.sst at gmail.com]
> On Behalf Of Fabian Tillier
> Sent: Tuesday, July 25, 2006 4:11 AM
> To: Leonid Keller
> Cc: openib-windows at openib.org
> Subject: Re: [Openib-windows] A problem in ib_close_al
>
> Hi again Leo,
>
> On 7/23/06, Leonid Keller <leonid at mellanox.co.il> wrote:
> > Hi Fab,
> > It seems I have found the reason for the hang on shutdown.
> > Attached are 2 patches for problems that I came across while
> > investigating this case.
> > Here is a short description.
> > 1. (the bug responsible for the hang)
> > If a send MAD times out, it is sent once more, so one can get 2
> > responses for it.
>
> I'm confused here - the code will retry a send only as many
> times as specified by the retry_cnt field. I don't see where
> the extra send comes from. Can you explain?
I didn't check retry_cnt, and I'm not sure I can explain why it gets
here, but it does.
>
> I do however see that a timeout of a preceding send could
> result in a retry, and two responses could be received before
> that send completes.
> This however seems extremely unlikely, and that is the only
> time that the response MAD could be leaked. It's not
> impossible, though, so the check you suggest is correct -
> I've committed a similar fix in revision 429.
>
> Please let me know if this solves the leak or if there is
> still some other issue.
>
> Thanks,
>
> - Fab
>
Thank you, we'll check.
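
For reference, the race discussed above can be sketched roughly as follows. This is only a minimal, self-contained model and not the actual IBAL code or the fix from revision 429: the types and functions (mad_t, send_req_t, on_timeout, on_response, on_send_complete) and the leak counter g_live_mads are all hypothetical names made up for illustration, and the "check" is assumed to be a test for an already-matched response.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the real IBAL structures. */
typedef struct mad { int tid; } mad_t;

typedef struct send_req {
    int    tid;        /* transaction ID of the request            */
    int    retry_cnt;  /* retries still allowed                    */
    mad_t *p_resp;     /* response already matched to this send    */
} send_req_t;

int g_live_mads = 0;   /* outstanding MAD allocations, to expose leaks */

mad_t *alloc_mad(int tid)
{
    mad_t *m = malloc(sizeof *m);
    if (m != NULL) {
        m->tid = tid;
        g_live_mads++;
    }
    return m;
}

void free_mad(mad_t *m)
{
    if (m != NULL) {
        free(m);
        g_live_mads--;
    }
}

/* Timeout handler: returns true if the request is sent again. */
bool on_timeout(send_req_t *s)
{
    if (s->retry_cnt == 0)
        return false;
    s->retry_cnt--;
    return true;
}

/* Response handler. Without the p_resp check, a second response for the
 * same send would overwrite the pointer and the first MAD would leak. */
void on_response(send_req_t *s, mad_t *resp)
{
    assert(resp != NULL);
    if (resp->tid != s->tid || s->p_resp != NULL) {
        free_mad(resp);  /* unrelated or duplicate response: drop it */
        return;
    }
    s->p_resp = resp;
}

/* Send completion: in this model the matched response is simply released. */
void on_send_complete(send_req_t *s)
{
    free_mad(s->p_resp);
    s->p_resp = NULL;
}
```

In this model, a request with retry_cnt = 1 that times out once and then receives two responses with the same transaction ID keeps only the first one; the duplicate is freed immediately, so nothing is left allocated after the send completes.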