[openib-general] [PATCH 3/5 v3] 2.6.20 rdma/cma: allow early transition to RTS to handle lost CM messages

Michael S. Tsirkin mst at mellanox.co.il
Thu Dec 7 01:54:18 PST 2006


> Quoting r. Sean Hefty <sean.hefty at intel.com>:
> Subject: Re: [PATCH 3/5 v3] 2.6.20 rdma/cma: allow early transition to RTS to handle lost CM messages
> 
> >Yes, I've even observed this with SDP, but I'm not sure why this
> >happens. It seems that MADs are sometimes lost even in back to back
> >configurations. Any idea why?
> 
> I have no idea why MADs would be lost.  In our scale up testing, we *never* saw
> lost or dropped MADs to the SA node, even when hitting it with 500,000 queries.
> The fact that you're seeing lost MADs is something that we should probably look
> into more, someday, hopefully, when I have more time available...  We didn't
> notice any issues with the CM messages in our testing, so we didn't examine that
> traffic in more detail.

Note that I only see CM messages being dropped.
To trigger it in SDP I had to use rdma_establish and post an extra send after start,
but path resolution always worked fine.

> Are there counters for QP0/1 that can let us know whether drops are occurring on
> the send or receive side?

Not sure what you mean. Let's just count the send/receive completions in the MAD layer.
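
Roughly, the counting could look like the sketch below. This is standalone
userspace C, not actual MAD-layer code: struct mad_counters, the
mad_count_*() helpers and mad_classify() are all hypothetical names made up
for illustration. It also assumes every send from one node goes to the other
node and that traffic has stopped before the counters are compared.

```c
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical per-node counters; these do not exist in the real MAD code. */
struct mad_counters {
	atomic_ulong send_posted;	/* sends handed to the hardware */
	atomic_ulong send_done;		/* send completions seen */
	atomic_ulong recv_done;		/* receive completions seen */
};

enum mad_drop_side { MAD_NO_DROP, MAD_DROP_SEND_SIDE, MAD_DROP_RECV_SIDE };

static void mad_count_send_posted(struct mad_counters *c)
{
	atomic_fetch_add(&c->send_posted, 1);
}

static void mad_count_send_done(struct mad_counters *c)
{
	atomic_fetch_add(&c->send_done, 1);
}

static void mad_count_recv_done(struct mad_counters *c)
{
	atomic_fetch_add(&c->recv_done, 1);
}

/*
 * Compare the sender's counters with the receiver's once traffic is idle.
 * Sends that never completed point at the send side; sends that completed
 * but were never received point at the wire or the receive side.
 */
static enum mad_drop_side mad_classify(struct mad_counters *tx,
				       struct mad_counters *rx)
{
	unsigned long posted = atomic_load(&tx->send_posted);
	unsigned long sent = atomic_load(&tx->send_done);
	unsigned long received = atomic_load(&rx->recv_done);

	if (sent < posted)
		return MAD_DROP_SEND_SIDE;
	if (received < sent)
		return MAD_DROP_RECV_SIDE;
	return MAD_NO_DROP;
}

/* Dump one node's counters in a form that is easy to diff between nodes. */
static void mad_counters_print(const char *node, struct mad_counters *c)
{
	printf("%s: posted=%lu sent=%lu received=%lu\n", node,
	       atomic_load(&c->send_posted), atomic_load(&c->send_done),
	       atomic_load(&c->recv_done));
}
```

The idea would be to bump the counters from the send and receive completion
paths on both nodes, run the test until a CM message is lost, and then
compare the two dumps.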

-- 
MST



