[openib-general] [CM] possible problem with crossing DREQs.

Hal Rosenstock halr at voltaire.com
Fri Jun 10 06:04:55 PDT 2005


On Thu, 2005-06-09 at 15:47, Sean Hefty wrote: 
> >  I'm seeing an unusual problem when both halves of a connection
> >actively disconnect at the same time. Each connection peer issues
> >a DREQ at the same time, then each receives the DREQ and responds
> >with a DREP, and finally each connection gets a callback for the
> >transition to the idle state. However, at this point it appears
> >that each CM keeps retransmitting DREQ requests, which then seems
> >to interfere with new connection establishment.
> 
> I think that I understand what's happening.  Receiving the DREQ
> changed the state of the cm_id, but did not cancel the previous send.
> 
> I'm actually out on vacation for a little over two weeks (and will
> be totally away from e-mail after Friday), but something
> like the patch below might fix the issue.  (Note that I didn't test /
> compile this.)  If it does work for you, feel free to commit it.

This works for me. My test case is a little different (repeated
loopback kdapl quit tests), but the patch resolves the same problem
there. I am committing this change. Thanks.

-- Hal




