[Openib-windows] Possible race in IPOIB while handling queries?

Tzachi Dar tzachid at mellanox.co.il
Thu May 4 09:06:38 PDT 2006


Hi Fab,
 
While debugging failures in IPOIB going up on mem-free cards, we have
come to the conclusion that there are some race conditions, which
occurred when we received the SM_CHANGE notification many times. This
message was received due to a firmware bug, and we are taking care of
it, but it seems that the handling of queries is still problematic.
 
The problems can be divided into two:
The first issue deals with bringing the port up. We see that once
queries are issued, there is a race condition until they return. If for
any reason the port_down code is called, it tries to cancel the
queries; however, as the code only checks p_port->ib_mgr.h_query,
there is still a chance that the callback is still running.
This is true for both the queries and the request to join the QP to the
broadcast group.
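
To illustrate the first issue, here is a minimal sketch in C. It is not
the actual IPOIB code: ib_mgr_sketch_t, cb_in_flight and all function
names are hypothetical, and it assumes that the query handle is cleared
before the callback has finished its work. Under that assumption, a
check of the handle alone reports "idle" while the callback is still
running, whereas a check that also counts in-flight callbacks does not.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the port's ib_mgr state (not the real type). */
typedef struct {
    void      *h_query;       /* non-NULL while a query is outstanding */
    atomic_int cb_in_flight;  /* hypothetical: callbacks currently executing */
} ib_mgr_sketch_t;

/* Start of the query callback. ASSUMPTION: the handle is cleared early,
 * before the callback has finished touching port state. */
static void query_cb_enter(ib_mgr_sketch_t *m)
{
    atomic_fetch_add(&m->cb_in_flight, 1);
    m->h_query = NULL;
}

/* End of the query callback: only now is it safe to tear the port down. */
static void query_cb_exit(ib_mgr_sketch_t *m)
{
    atomic_fetch_sub(&m->cb_in_flight, 1);
}

/* The check described above: looks at the handle only. */
static bool is_idle_handle_only(ib_mgr_sketch_t *m)
{
    return m->h_query == NULL;
}

/* A stricter check: the handle is gone AND no callback is still running. */
static bool is_idle_with_count(ib_mgr_sketch_t *m)
{
    return m->h_query == NULL && atomic_load(&m->cb_in_flight) == 0;
}
```

In a real driver port_down would have to wait (for example on an event)
until the stricter check holds; the same counting would apply to the
callback of the broadcast group join request.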
 
The second problem is with endpoint lookups. It is possible for a query
to go out and, while it is outstanding, for the port to go down and come
back up, so that the result of the (old) query is delivered to the new
endpoint manager.
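
One possible way to detect such a stale result, sketched in C under the
assumption that a generation counter is added to the endpoint manager
(endpt_mgr_sketch_t, query_ctx_t and the functions below are
hypothetical, not existing IPOIB names): each query remembers the
generation it was issued under, and a completion is dropped if the
generation has changed in the meantime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the endpoint manager (not the real type). */
typedef struct {
    uint32_t generation;  /* bumped every time the port goes down and up */
    int      endpoints;   /* number of lookup results accepted so far */
} endpt_mgr_sketch_t;

/* Context carried by an outstanding query. */
typedef struct {
    uint32_t generation;  /* generation of the manager at issue time */
} query_ctx_t;

/* Issue a query: remember which generation it belongs to. */
static query_ctx_t query_issue(const endpt_mgr_sketch_t *mgr)
{
    query_ctx_t ctx = { mgr->generation };
    return ctx;
}

/* Port went down and came back up: a "new" endpoint manager starts. */
static void port_bounce(endpt_mgr_sketch_t *mgr)
{
    mgr->generation++;
    mgr->endpoints = 0;
}

/* Query completion: returns true if the result was accepted,
 * false if it belongs to an older generation and was dropped. */
static bool query_complete(endpt_mgr_sketch_t *mgr, const query_ctx_t *ctx)
{
    if (ctx->generation != mgr->generation)
        return false;
    mgr->endpoints++;
    return true;
}
```

With this, a query issued before the port bounced is rejected on
completion, while a query issued afterwards is accepted as usual. A real
implementation would also need to protect the counter with the port's
lock.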
 
What do you think?
 
Thanks
Tzachi