[Openib-windows] Handling a few arp sent together

Tzachi Dar tzachid at mellanox.co.il
Mon Aug 7 07:54:03 PDT 2006


Hi Fab,
 
While testing SDP with iperf over 5 simultaneous connections, I found a
problem that causes connections to fail from time to time.
 
The failures occur because the function ipoib_mac_to_gid does not always
return the dest GID. The root cause seems to be in the function
__recv_arp: when an ARP is received for an endpoint that we already
have, we check that this endpoint is still valid. The comment says:
 /*
  * If the endpoint exists for the GID, make sure
  * the dlid and qpn match the arp.
  */
and later the code compares the src_hw.gid, the dlid and the qp
number. My problem is that if only an arp has been done so far, the
dlid is still 0 (we take it from the callback of the path record
query), so the check fails and the endpoint is removed. A few lines
later an equivalent endpoint is created and inserted again, but there is
a time window in which no such entry exists (and therefore the query
gets no answer).
 
I have tried to replace the check:
 
  else if( (((*pp_src)->dlid != p_wc->recv.ud.remote_lid)  ||
   (*pp_src)->qpn != p_wc->recv.ud.remote_qp) )

with 
 
  else if( ((*pp_src)->dlid != p_wc->recv.ud.remote_lid &&
            (*pp_src)->dlid != 0) ||
           (*pp_src)->qpn != p_wc->recv.ud.remote_qp )
and it seems to solve my problem. Please note that the exact endpoint
will be created only a few lines below.
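
To show the relaxed check in isolation, here is a minimal sketch. The
struct and function names are hypothetical simplifications and not the
real ipoib types, and byte-order handling is ignored. A dlid of 0 is
assumed to mean "path record not resolved yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the cached endpoint. */
struct endpoint {
    uint16_t dlid;  /* assumed to stay 0 until the path query completes */
    uint32_t qpn;
};

/* Hypothetical stand-in for the fields taken from the work completion. */
struct arp_source {
    uint16_t remote_lid;
    uint32_t remote_qp;
};

/*
 * Returns true when the cached endpoint contradicts the received arp
 * and should be removed. An unresolved dlid (0) is not treated as a
 * mismatch, so an endpoint that has only seen an arp is kept.
 */
static bool endpoint_is_stale(const struct endpoint *ep,
                              const struct arp_source *src)
{
    assert(ep != NULL && src != NULL);

    bool dlid_known    = (ep->dlid != 0);
    bool dlid_mismatch = dlid_known && (ep->dlid != src->remote_lid);
    bool qpn_mismatch  = (ep->qpn != src->remote_qp);

    return dlid_mismatch || qpn_mismatch;
}
```

With this predicate an endpoint whose dlid is still 0 is removed only
when the qp number differs, which avoids the window described above.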
 
So, do you see any problem with the proposed solution?
Thanks
Tzachi
