[ofa-general] rdma_resolve_route() returning -EINVAL

Talpey, Thomas Thomas.Talpey at netapp.com
Thu Oct 2 10:39:56 PDT 2008


I'm debugging a reconnect problem in the NFS/RDMA client and
am seeing something rather odd. The context is that if a client
mount point goes idle for 5 minutes, the Linux RPC layer closes
the associated connection. When a new request needs to be
sent, the RPC layer then performs a reconnect.

At this point, the NFS/RDMA client code will call rdma_create_id()
to create a new rdma_cm_id, then rdma_resolve_addr() and
finally rdma_resolve_route(). In the reconnect scenario, however,
that last step returns -EINVAL.
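
For readers following along, the ordering constraint behind that call sequence can be sketched as a small user-space model. This is only an illustration under stated assumptions: every model_* name below is hypothetical, the state set is simplified, and none of it is the kernel's rdma_cm code. The one thing it tries to capture is that route resolution is only accepted once address resolution has completed (signalled asynchronously by an address-resolved event).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified model. These are NOT the kernel's names. */
enum model_state {
    MODEL_IDLE,          /* fresh id, as after rdma_create_id()            */
    MODEL_ADDR_QUERY,    /* address resolution started, completion pending */
    MODEL_ADDR_RESOLVED, /* address-resolved event has been delivered      */
    MODEL_ROUTE_QUERY    /* route resolution accepted and in progress      */
};

struct model_id {
    enum model_state state;
};

/* Stands in for rdma_create_id(): start from a clean state. */
void model_create_id(struct model_id *id)
{
    assert(id != NULL);
    id->state = MODEL_IDLE;
}

/* Stands in for rdma_resolve_addr(): only starts the query. */
int model_resolve_addr(struct model_id *id)
{
    assert(id != NULL);
    if (id->state != MODEL_IDLE)
        return -EINVAL;
    id->state = MODEL_ADDR_QUERY;
    return 0;
}

/* Stands in for delivery of the address-resolved event. */
void model_addr_resolved_event(struct model_id *id)
{
    assert(id != NULL);
    if (id->state == MODEL_ADDR_QUERY)
        id->state = MODEL_ADDR_RESOLVED;
}

/* Stands in for rdma_resolve_route(): rejected in any other state. */
int model_resolve_route(struct model_id *id)
{
    assert(id != NULL);
    if (id->state != MODEL_ADDR_RESOLVED)
        return -EINVAL;
    id->state = MODEL_ROUTE_QUERY;
    return 0;
}
```

In this model, calling model_resolve_route() on a fresh id, or before model_addr_resolved_event() has run, returns -EINVAL; calling it after the event returns 0.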

Looking at the code, I think the only reasons for this return are
1) calling rdma_resolve_route() in the wrong state (which I'm not),
and 2) way down in the ib_post_send_mad() function, if a timeout
is passed in (which it is) and no receive handler is registered
for the MAD (no clue why not, but it worked the first time).
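
The second condition can be written down as a tiny stand-alone check. This is a hedged sketch of the condition exactly as described above, not a copy of ib_post_send_mad(); the struct and function names are hypothetical, and the real function performs other validation that is not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pieces the check looks at. */
struct model_mad_agent {
    bool has_recv_handler;   /* was a receive handler registered? */
};

struct model_send_buf {
    const struct model_mad_agent *agent;
    int timeout_ms;          /* 0 means no response is awaited */
};

/*
 * A send that carries a timeout expects a response, so it is
 * rejected with -EINVAL when nothing is registered to receive it.
 */
int model_post_send_check(const struct model_send_buf *buf)
{
    assert(buf != NULL && buf->agent != NULL);
    if (buf->timeout_ms && !buf->agent->has_recv_handler)
        return -EINVAL;
    return 0;
}
```

With a timeout of 1000 ms and no receive handler the check returns -EINVAL; with a timeout of 0, or with a handler registered, it returns 0.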

This is with the ib_mthca driver on 2.6.27-rc7, btw. Any clues to
help figure out what might be wrong?

Thanks,
Tom.



