[openib-general] [PATCH 3/5 v3] 2.6.20 rdma/cma: allow early transition to RTS to handle lost CM messages
Michael S. Tsirkin
mst at mellanox.co.il
Thu Dec 7 01:54:18 PST 2006
> Quoting r. Sean Hefty <sean.hefty at intel.com>:
> Subject: Re: [PATCH 3/5 v3] 2.6.20 rdma/cma: allow early transition to RTS to handle lost CM messages
>
> >Yes, I've even observed this with SDP, but I'm not sure why this
> >happens. It seems that MADs are sometimes lost even in back to back
> >configurations. Any idea why?
>
> I have no idea why MADs would be lost. In our scale up testing, we *never* saw
> lost or dropped MADs to the SA node, even when hitting it with 500,000 queries.
> The fact that you're seeing lost MADs is something that we should probably look
> into more, someday, hopefully, when I have more time available... We didn't
> notice any issues with the CM messages in our testing, so we didn't examine that
> traffic in more detail.
Note that I only see CM message drops.
I had to use rdma_establish and post an extra send after start in SDP
to trigger it, but path resolution always worked fine.
> Are there counters for QP0/1 that can let us know whether drops are occurring on
> the send or receive side?
Not sure what you mean. Let's just count the send/receive completions in the MAD layer.
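
Roughly what I have in mind for the counting, as a userspace sketch only.
All names here (mad_comp_stats, mad_count_send, mad_count_recv, mad_lost)
are made up for illustration and are not existing ib_mad code. The idea is
to keep per-management-class completion counters on each node and compare
the sender's send completions with the peer's receive completions:

```c
#include <stdint.h>

/* Hypothetical per-class completion counters; not actual ib_mad code. */
#define MAD_CLASS_MAX 256

struct mad_comp_stats {
	uint64_t send_done[MAD_CLASS_MAX];	/* send completions seen */
	uint64_t recv_done[MAD_CLASS_MAX];	/* receive completions seen */
};

/* Call from the send completion path, keyed by the MAD's mgmt class. */
static void mad_count_send(struct mad_comp_stats *s, uint8_t mgmt_class)
{
	s->send_done[mgmt_class]++;
}

/* Call from the receive completion path, keyed by the MAD's mgmt class. */
static void mad_count_recv(struct mad_comp_stats *s, uint8_t mgmt_class)
{
	s->recv_done[mgmt_class]++;
}

/*
 * Sends completed on the sender minus receives completed on the peer.
 * A positive value means MADs of that class completed on the send side
 * but never showed up as receive completions on the other node.
 */
static int64_t mad_lost(const struct mad_comp_stats *tx,
			const struct mad_comp_stats *rx,
			uint8_t mgmt_class)
{
	return (int64_t)(tx->send_done[mgmt_class] -
			 rx->recv_done[mgmt_class]);
}
```

If the send count on one side matches the receive count on the other, the
drop is above the MAD layer; if not, the MAD was lost after the send
completed.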
--
MST