[openib-general] IB_CM_REJ_INVALID_SERVICE_ID
Sean Hefty
mshefty at ichips.intel.com
Wed Dec 27 11:50:33 PST 2006
Eric Barton wrote:
> Can an rdma_connect be rejected with IB_CM_REJ_INVALID_SERVICE_ID for any other
> reason than the peer isn't listening with the correct service number?
This should only occur if the remote peer isn't listening on the requested
service ID. The ib_cm sends this reject code automatically when an incoming
request does not match any listen.
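As a rough illustration of the rule above, here is a minimal sketch of how a consumer's event handler might recognise this case. The enum names prefixed with SKETCH_ and both helper functions are hypothetical stand-ins so that the fragment compiles without the RDMA headers. The value 8 for the reject reason is the one quoted from OFED in this thread; the value 8 for the rejected event is an assumption about the rdma_cm event enum and should be checked against the headers actually in use.

```c
#include <stdbool.h>

/* Hypothetical local stand-ins for the real RDMA definitions.
 * Reject reason 8 is taken from the enum quoted in this thread.
 * Event value 8 is an ASSUMPTION about the rdma_cm event enum. */
enum sketch_cm_event   { SKETCH_RDMA_CM_EVENT_REJECTED = 8 };
enum sketch_rej_reason { SKETCH_IB_CM_REJ_INVALID_SERVICE_ID = 8 };

/* True only when the event is a reject AND the reject reason says
 * that no listen matched the requested service ID. */
static bool peer_not_listening(int event, int status)
{
    return event == SKETCH_RDMA_CM_EVENT_REJECTED &&
           status == SKETCH_IB_CM_REJ_INVALID_SERVICE_ID;
}

/* Human-readable summary of an (event, status) pair, for logging. */
static const char *describe_reject(int event, int status)
{
    if (event != SKETCH_RDMA_CM_EVENT_REJECTED)
        return "not a reject event";
    if (status == SKETCH_IB_CM_REJ_INVALID_SERVICE_ID)
        return "rejected: no listener for the requested service ID";
    return "rejected: other reason";
}
```

In a real handler the two integers would presumably come from the event and status fields of the connection event delivered by the RDMA CM. Checking both keeps a status of 8 on some other event type from being misread as a missing listener.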
>>We are testing 1.6b5 for an InfiniBand cluster with RHEL 4. We use the
>>binaries provided by CFS and use OFED 1.1 as the IB stack.
>>
>>Several times, some of the clients have hung during fs mount or when an OST
>>is added (see log).
>>Error:
>>LustreError: 1776:0:(o2iblnd_cb.c:2314:kiblnd_rejected()) 10.0.90.8@o2ib
>>rejected: reason 8, size 148
Is this event = 8 and status = 8?
>>
>>from OFED:
>>enum ib_cm_rej_reason {
>> IB_CM_REJ_INVALID_SERVICE_ID = 8,
>>
>>Once an IPoIB ping is started to the corresponding OST the client
>>continues. Afterwards it is quite stable.
>
>
> ...which seems to be saying that just doing an IPoIB ping to the server was
> enough to make rdma_connect() work OK.
I can't explain the relationship between the ping and the connect starting to work.
- Sean