[ofa-general] rdma cm timeout option, was [iWARP issues]

Caitlin Bestler Caitlin.Bestler at neterion.com
Wed Dec 12 10:48:56 PST 2007



> -----Original Message-----
> From: Talpey, Thomas [mailto:Thomas.Talpey at netapp.com]
> Sent: Wednesday, December 12, 2007 10:41 AM
> To: Caitlin Bestler
> Cc: OpenFabrics General
> Subject: Re: [ofa-general] rdma cm timeout option, was [iWARP issues]
> 
> At 02:15 PM 12/10/2007, Caitlin Bestler wrote:
> >So there is a need for *some* mechanism to time out a stalled iWARP
> >connection process even after a valid TCP connection is established.
> >A vendor-specific, non-configurable method is probably more than
> >adequate. But *something* has to be there; you cannot just rely on
> >the TCP mechanisms. Nor can you assume that the TCP stack servicing
> >RDMA connections has the same defaults as the host stack.
> 
> Yes, protection from upper layer timeouts is important. But I'm not
> certain whether you're advocating a CM layer timeout, or a vendor-
> specific requirement in each card's lower stack.
> 
> I tend to prefer the latter - to avoid assumptions in the CM. At
> connection time, CM can't tell the difference between a TCP delay
> and an MPA delay. This could lead to bad assumptions and misleading
> errors.
> 

Either approach makes sense to me. The strongest argument for the latter
is that the "pending MPA" state is one that Consumers will tend not to
be aware of, and will probably never be aware of unless someone actually
uses it for some sort of DoS attack.

So if it is actually handled by vendor-specific code, then Consumers will
be able to remain ignorant of this intermediate step, which is probably
what they would prefer.

Of course that means that OFA, as a whole, cannot be ignorant of this,
because that would inevitably mean that some vendor will forget to do
it. So it needs to be highlighted in the OFA-to-vendor doc, and ignored
in the OFA-to-Consumer doc.
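
To make the "pending MPA" guard concrete, here is a rough sketch of what a
lower-layer timeout could look like once TCP is already established. It is
plain Python over an ordinary socket, not any vendor's or OFA's code. The
names recv_mpa_key, MpaTimeout, and MPA_TIMEOUT_S and the 5 second value are
hypothetical; the only thing taken from the MPA spec (RFC 5044) is the
16-byte request key. A real implementation would also parse the rest of the
request frame.

```python
import socket
import time

MPA_REQ_KEY = b"MPA ID Req Frame"  # 16-byte request key from RFC 5044
MPA_TIMEOUT_S = 5.0                # hypothetical vendor-chosen limit, not from any spec


class MpaTimeout(Exception):
    """TCP is established but the peer never completed the MPA request."""


def recv_mpa_key(conn, timeout_s=MPA_TIMEOUT_S):
    """Read the MPA request key from an already-established TCP connection.

    A single overall deadline covers the whole read, so a peer that sends
    nothing (or trickles bytes) cannot hold the connection in the pending
    MPA state indefinitely.
    """
    deadline = time.monotonic() + timeout_s
    buf = b""
    while len(buf) < len(MPA_REQ_KEY):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise MpaTimeout("no MPA request before deadline")
        conn.settimeout(remaining)
        try:
            chunk = conn.recv(len(MPA_REQ_KEY) - len(buf))
        except socket.timeout:
            raise MpaTimeout("no MPA request before deadline")
        if not chunk:
            raise ConnectionError("peer closed during MPA negotiation")
        buf += chunk
    if buf != MPA_REQ_KEY:
        raise ValueError("not an MPA request frame")
    return buf
```

The deadline is independent of any TCP keepalive or retransmit settings,
which is the point made above: the TCP stack serving RDMA connections may
not share the host stack's defaults, so the MPA stage needs its own limit.
On MpaTimeout the caller would simply close the connection, and the
Consumer never needs to know the intermediate state existed.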
