[Openib-windows] Handling a few arp sent together
Tzachi Dar
tzachid at mellanox.co.il
Mon Aug 7 07:54:03 PDT 2006
Hi Fab,
While running some tests on SDP and iperf with 5 simultaneous
connections, I found a problem that can cause connections to fail
from time to time.
The failures seem to happen because the function ipoib_mac_to_gid
does not always return the dest GID.
The root cause appears to be in the function __recv_arp. When an ARP
arrives from an end point that we already know, we check that the end
point that we have is still valid. The comment says:
/*
* If the endpoint exists for the GID, make sure
* the dlid and qpn match the arp.
*/
and later in the code we check the src_hw.gid, the dlid and the qp
number. My problem is that if only an ARP has been exchanged so far,
the dlid is still 0 (we take it from the callback of the path record
query), so the check fails and the endpoint is removed. A few lines
later the end point is created and inserted again, but there is a time
window in which no such entry exists (and therefore there is no answer
to the query).
I have tried to replace the check:
    else if( (((*pp_src)->dlid != p_wc->recv.ud.remote_lid) ||
        (*pp_src)->qpn != p_wc->recv.ud.remote_qp) )
with
    else if( (((*pp_src)->dlid != p_wc->recv.ud.remote_lid) &&
        ((*pp_src)->dlid != 0) ||
        (*pp_src)->qpn != p_wc->recv.ud.remote_qp) )
and it seems to solve my problem. Please note that the exact endpoint
will still be created just a few lines below.
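To make the difference between the two checks concrete, here is a minimal
standalone sketch. It is not the driver code: endpt_t, recv_wc_t,
is_stale_old, is_stale_new and check_cases are hypothetical names, the
structs model only the fields used by the check, and the LID and QP values
are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the endpoint and the receive
 * work completion; only the fields used by the check are modeled. */
typedef struct {
    uint16_t dlid;  /* stays 0 until the path record query completes */
    uint32_t qpn;
} endpt_t;

typedef struct {
    uint16_t remote_lid;
    uint32_t remote_qp;
} recv_wc_t;

/* Original check: any dlid or qpn mismatch marks the endpoint as stale. */
bool is_stale_old(const endpt_t *ep, const recv_wc_t *wc)
{
    return (ep->dlid != wc->remote_lid) || (ep->qpn != wc->remote_qp);
}

/* Proposed check: a dlid of 0 means "not resolved yet" and is not
 * treated as a mismatch. A qpn mismatch still marks it as stale. */
bool is_stale_new(const endpt_t *ep, const recv_wc_t *wc)
{
    return ((ep->dlid != wc->remote_lid) && (ep->dlid != 0)) ||
           (ep->qpn != wc->remote_qp);
}

/* Walks through the cases discussed in this mail. */
void check_cases(void)
{
    const recv_wc_t arp = { .remote_lid = 5, .remote_qp = 0x48 };

    /* Endpoint known only from an earlier ARP: dlid not resolved yet. */
    const endpt_t pending = { .dlid = 0, .qpn = 0x48 };
    assert(is_stale_old(&pending, &arp));   /* old check removes it */
    assert(!is_stale_new(&pending, &arp));  /* new check keeps it */

    /* Resolved endpoint whose LID really changed: both remove it. */
    const endpt_t moved = { .dlid = 7, .qpn = 0x48 };
    assert(is_stale_old(&moved, &arp));
    assert(is_stale_new(&moved, &arp));

    /* QP number changed: both remove it, even while dlid is 0. */
    const endpt_t new_qp = { .dlid = 0, .qpn = 0x49 };
    assert(is_stale_old(&new_qp, &arp));
    assert(is_stale_new(&new_qp, &arp));
}
```

The only case where the two checks disagree is the unresolved endpoint
(dlid == 0 with a matching qpn), which is exactly the case that opened the
time window described above.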
So, do you see any problem with the solution proposed?
Thanks
Tzachi