[openib-general] new IB CM reject reason

Wed Jan 31 22:24:31 PST 2007

> Quoting Sean Hefty <sean.hefty at intel.com>:
> Subject: new IB CM reject reason
> 
> We've hit into an issue with the IB CM reject reason codes.  When a remote
> application crashes during connection establishment, the connection will be
> rejected by the kernel CM.  Unfortunately, there's not a decent reject reason
> that maps to this event.  Currently, the ib_cm issues the reject as consumer
> defined (code 28).
> 
> I'd like to propose adding reject reason 0, which would mean other/unknown/or
> none given.  This is a deviation from the spec, but does anyone know of any
> issues with such an approach?

No, I don't think "application crashed" makes sense as an element of the wire
protocol. I think optional logging of errors in the kernel CM would be a much
better solution. I know I had to add some printks each time I was debugging SDP.
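
A minimal userspace sketch of what such optional logging could look like.
The names cm_debug and cm_log_rej are hypothetical, not existing ib_cm
symbols; in the kernel the switch would more likely be a module parameter
and the output a printk. The value 28 is the consumer-defined reject code
mentioned in the quoted message.

```c
#include <stdio.h>

/* Hypothetical debug switch; assumed to be off by default. */
static int cm_debug = 0;

/* Reject code the quoted message gives for "consumer defined". */
#define REJ_CONSUMER_DEFINED 28

/*
 * Hypothetical helper: when debugging is enabled, format a log line
 * describing why a REJ is being sent into buf. Returns the number of
 * characters snprintf reports, or 0 when logging is disabled (buf is
 * left untouched in that case).
 */
static int cm_log_rej(char *buf, size_t len, int reason, const char *why)
{
    if (!cm_debug)
        return 0;
    return snprintf(buf, len, "cm: sending REJ reason %d (%s)", reason, why);
}
```

The point of the sketch is that the local reason for the reject stays on
the rejecting host, where it is useful for debugging, and nothing new is
added to the wire protocol.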

A couple of reasons that come to mind:

1. Should the side receiving the REJ care whether the rejecting implementation
is kernel or userspace? Userspace consumers are not the only ones of interest.
What about the various error codes and failure values returned from the
callback on the passive side? If you are trying to develop a debug aid, these
need to be covered as well.

2. Another objection is that this feature seems to invite misuse, where
applications will use the REJ reason as a hint on whether the remote side
crashed. But a REJ could be lost. Wouldn't this confuse the remote side?

-- 
MST