[ofa-general] rdma_resolve_route() returning -EINVAL

Hal Rosenstock hal.rosenstock at gmail.com
Thu Oct 2 15:29:39 PDT 2008


Tom,

On Thu, Oct 2, 2008 at 1:39 PM, Talpey, Thomas <Thomas.Talpey at netapp.com> wrote:
> I'm debugging a reconnect problem in the NFS/RDMA client and
> am seeing something rather odd. The context is that if a client
> mount point goes idle for 5 minutes, the Linux RPC layer closes
> the associated connection. When a new request needs to be
> sent, the RPC layer then performs a reconnect.
>
> At this point, the NFS/RDMA client code will call rdma_create_id()
> to create a new rdma_cm_id, then rdma_resolve_addr() and
> finally rdma_resolve_route(). In the reconnect scenario, that
> last step however returns -EINVAL.
>
> Looking at the code, I think the only reasons for this return are
> 1) calling rdma_resolve_route() in the wrong state (which I'm not),
> and 2) way down in the ib_post_send_mad() function, if there is
> a timeout passed-in (which there is) and there's no receive handler
> registered for the MAD (no clue but it worked the first time).

Are you saying you suspect reason 2 above? FWIW, my reading of
ib_post_send_mad is that the CM does register a receive handler,
so I don't think the -EINVAL comes from there. Are you actually
seeing a missing receive handler, or is this from reviewing the code
to see where -EINVAL could possibly come from?
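
For reference, the two candidate checks can be modeled roughly as
below. This is only a simplified userspace sketch of the conditions
as described in this thread, not the kernel code: every name in it
(model_id, model_agent, model_resolve_route, model_post_send) is
hypothetical, and the real state machine has more states than shown.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced set of connection states (not the kernel's enum). */
enum model_state { MODEL_IDLE, MODEL_ADDR_RESOLVED, MODEL_ROUTE_QUERY };

struct model_id {
    enum model_state state;
};

/* Reason 1: route resolution is only accepted right after the address
 * has been resolved; any other state yields -EINVAL. */
static int model_resolve_route(struct model_id *id)
{
    if (id->state != MODEL_ADDR_RESOLVED)
        return -EINVAL;
    id->state = MODEL_ROUTE_QUERY;
    return 0;
}

/* Hypothetical stand-in for a MAD agent and its receive callback. */
typedef void (*model_recv_handler)(void *mad);

struct model_agent {
    model_recv_handler recv_handler;
};

/* Reason 2: a send with a nonzero timeout expects a response, so it is
 * rejected with -EINVAL when no receive handler is registered. */
static int model_post_send(const struct model_agent *agent, int timeout_ms)
{
    if (timeout_ms != 0 && agent->recv_handler == NULL)
        return -EINVAL;
    return 0;
}
```

In this model, calling model_resolve_route() on an id that is not in
MODEL_ADDR_RESOLVED fails, as does model_post_send() with a timeout
and a NULL recv_handler. Logging the state just before the failing
call would show which of the two applies.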

-- Hal

> This is using the ib_mthca driver, and 2.6.27-rc7 btw. Any clues to
> help figure out what might be wrong?
>
> Thanks,
> Tom.


