[openib-general] IPoIB oops on path record completion
Hal Rosenstock
halr at voltaire.com
Wed Dec 15 17:18:21 PST 2004
On Wed, 2004-12-15 at 20:14, Roland Dreier wrote:
> Hal> This is due to the following: ib_sa_path_rec_callback:
> Hal> sa_query 0xc0db0788 status 0xffffff92 mad 0x00000000 which
> Hal> invokes query->callback(status, NULL, query->context);
>
> Hal> ipoib_main.c: static void path_rec_completion(int status,
> Hal> struct ib_sa_path_rec *pathrec, void *path_ptr)
>
> Hal> path_rec_completion is using the pathrec parameter as a
> Hal> pointer without checking it for NULL first.
>
> Hmm... are you sure this is what causes the oops?
No, but it definitely oopses in that callback. I didn't trace it in
path_rec_completion; I only glanced at the code. Don't the debug
statements dereference through NULL regardless of status?
> path_rec_completion() will only dereference the pathrec parameter if
> its local variable ah is non-NULL:
>
> if (ah) {
> path->pathrec = *pathrec;
>
> and ah can only be set to non-NULL if status is successful (ah is
> initialized to NULL and the only place it can be changed is
>
> ah = ipoib_create_ah(path->dev, priv->pd, &av);
>
> which is inside a test of status.
>
> Can you give the exact sequence you use to duplicate this? I haven't
> been able to make it happen in my network.
Have you gotten a negative status on the callback (and NULL pathrec)?
I've yet to see this response on the analyzer, as there are too many to
go through right now. I do see it with the extra debug I put in to narrow
this down.
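For reference, here is a minimal user-space sketch of the case under discussion: the callback invoked with a negative status and a NULL pathrec. It is not the ipoib_main.c code. The struct layouts, create_ah() (standing in for ipoib_create_ah()), as_signed32() and the _sketch suffix are all made up for illustration. It only mirrors the shape Roland describes, where ah starts out NULL and pathrec is dereferenced only under "if (ah)". Note that 0xffffff92 read as a signed 32-bit value is -110, which would be -ETIMEDOUT if the usual Linux errno numbering applies here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct path_rec    { int dlid; };
struct addr_handle { int dlid; };
struct path {
    struct path_rec     pathrec;
    struct addr_handle *ah;
};

static struct addr_handle demo_ah;

/* Stand-in for ipoib_create_ah(): just copies one field into a static handle. */
static struct addr_handle *create_ah(const struct path_rec *rec)
{
    demo_ah.dlid = rec->dlid;
    return &demo_ah;
}

/* Reinterpret a 32-bit value from the debug output as a signed status. */
static int as_signed32(uint32_t v)
{
    return v > (uint32_t)INT32_MAX ? -(int)(UINT32_MAX - v) - 1 : (int)v;
}

/*
 * Same shape as described in the thread: ah is only set inside the test
 * of status, and pathrec is only dereferenced when ah is non-NULL.
 * Any debug print added ahead of the status test would need its own
 * NULL check on pathrec, since pathrec is NULL on failure.
 */
static void path_rec_completion_sketch(int status, struct path_rec *pathrec,
                                       void *path_ptr)
{
    struct path *path = path_ptr;
    struct addr_handle *ah = NULL;

    assert(path != NULL);

    if (!status)
        ah = create_ah(pathrec);

    if (ah) {
        path->pathrec = *pathrec;
        path->ah = ah;
    }
}
```

With this shape, calling path_rec_completion_sketch(as_signed32(0xffffff92), NULL, &p) leaves p untouched and never reads through pathrec, so an oops in the real callback on that input would have to come from somewhere outside the "if (ah)" block.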
> Hal> Also, what I do see when I do a broadcast ping is that the
> Hal> path record is obtained over and over rather than being
> Hal> requested once and cached. Is that what is supposed to be
> Hal> happening now ?
>
> No, that shouldn't happen. I'll try to figure out what's happening.
Thanks.
-- Hal