[openib-general] RDMA_CM_EVENT_UNREACHABLE(-ETIMEDOUT)
Eric Barton
eeb at bartonsoftware.com
Tue Aug 1 23:04:47 PDT 2006
I've had a report of rdma_connect() failing with a callback event type of
RDMA_CM_EVENT_UNREACHABLE and status -ETIMEDOUT although the peer node was
up and running at the time.
It seems this can be reproduced as follows...
1. Establish a connection between nodes A and B
2. Reboot node A
3. Start establishing a new connection from node A to node B
4. After a timeout, the CM callback on node A fires with the event type and status described above.
Could this happen with a buggy SM? Are there some good places in the
OpenFabrics stack to add printks to help point the finger (or can some
existing debug/trace code be enabled)?
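
As a starting point for that kind of tracing, here is a minimal user-space sketch of a helper that turns an event/status pair into a single log line. It is only an illustration: the `demo_*` names and the small enum are hypothetical stand-ins, not part of the OpenFabrics stack, and real code would switch on the actual CM event type and emit the line with printk from its own CM callback.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the CM event codes; real code would use the
 * event type delivered to its CM callback instead of this enum. */
enum demo_cm_event {
    DEMO_EVENT_ESTABLISHED,
    DEMO_EVENT_UNREACHABLE,
    DEMO_EVENT_OTHER
};

/* Map the stand-in event code to a printable name. */
static const char *demo_event_name(enum demo_cm_event ev)
{
    switch (ev) {
    case DEMO_EVENT_ESTABLISHED: return "ESTABLISHED";
    case DEMO_EVENT_UNREACHABLE: return "UNREACHABLE";
    default:                     return "OTHER";
    }
}

/* Map a negative errno-style status to its symbolic name.
 * Only a few values are covered; anything else reports UNKNOWN. */
static const char *demo_status_name(int status)
{
    switch (status) {
    case 0:             return "OK";
    case -ETIMEDOUT:    return "ETIMEDOUT";
    case -ECONNREFUSED: return "ECONNREFUSED";
    case -EHOSTUNREACH: return "EHOSTUNREACH";
    default:            return "UNKNOWN";
    }
}

/* Format one trace line into buf; returns what snprintf returns. */
static int demo_format_cm_event(char *buf, size_t len,
                                enum demo_cm_event ev, int status)
{
    return snprintf(buf, len, "cm event %s status %d (%s)",
                    demo_event_name(ev), status, demo_status_name(status));
}
```

Calling `demo_format_cm_event(buf, sizeof(buf), DEMO_EVENT_UNREACHABLE, -ETIMEDOUT)` produces a line naming both the event and the symbolic status, which makes it easier to match a trace against the failure described above.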
--
Cheers,
Eric