[openib-general] IB_CM_REJ_INVALID_SERVICE_ID

Eric Barton eeb at bartonsoftware.com
Wed Dec 20 14:22:13 PST 2006


Can an rdma_connect() be rejected with IB_CM_REJ_INVALID_SERVICE_ID for any
reason other than the peer not listening on the correct service ID?

I've had the following bug report...

> We are testing 1.6b5 for an InfiniBand cluster with RHEL 4. We use the
> binaries provided by CFS and use OFED 1.1 as the IB stack.
> 
> Several times, some of the clients have hung during fs mount or when an OST
> is added (see log).
> Error:
> LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) 10.0.90.8@o2ib
> rejected: reason 8, size 148
> 
> from OFED:
> enum ib_cm_rej_reason {
>        IB_CM_REJ_INVALID_SERVICE_ID            = 8,
> 
> Once an IPoIB ping is started to the corresponding OST the client 
> continues. Afterwards it is quite stable.

...which seems to be saying that just doing an IPoIB ping to the server was
enough to make rdma_connect() work OK.
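
For anyone decoding similar log lines, here is a minimal sketch of mapping the
numeric reject reason to its name. rej_reason_name() is a hypothetical helper,
not part of OFED or Lustre. Only the value 8 is confirmed by the report above;
the other values are an assumption, quoted from memory of the ib_cm.h header,
so check them against the header you build with.

```c
/* Hypothetical helper: translate a CM reject reason code into a name.
 * Only a few reasons are listed. 8 comes from the bug report above;
 * 4, 10 and 28 are assumed from memory of ib_cm.h and should be
 * verified against the installed header. */
const char *rej_reason_name(int reason)
{
    switch (reason) {
    case 4:
        return "IB_CM_REJ_TIMEOUT";
    case 8:
        return "IB_CM_REJ_INVALID_SERVICE_ID";
    case 10:
        return "IB_CM_REJ_STALE_CONN";
    case 28:
        return "IB_CM_REJ_CONSUMER_DEFINED";
    default:
        /* Anything not in this partial table. */
        return "unknown";
    }
}
```

Calling rej_reason_name(8) gives the name shown in the OFED enum above, so a
log message could print both the number and the name.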

-- 

                Cheers,
                        Eric




