[openib-general] kernel oops

Hal Rosenstock halr at voltaire.com
Tue Aug 30 16:36:52 PDT 2005


On Tue, 2005-08-30 at 12:04, Sean Hefty wrote:
> Hal Rosenstock wrote:
> > Why would ib_at_paths_by_route be called if no route were obtained (from
> > ib_at_route_by_ip) ? Isn't that a ucmpost issue ? (I also agree it's not
> > good for UAT to crash).
> 
> The assumption that I made was that the call to ib_at_route_by_ip() 
> would fail if given an invalid route.

That seems reasonable (I haven't tried it yet, but will once I get some
spare cycles).

>   Also, since ucmpost is a simple 
> test app designed more to test the CM than AT, I kept error testing to a
                                                              ^^^^^^^
                                                              handling
> minimum.
> 
> > It needs to be a valid route struct. I'm not sure how the kernel can
> > validate that this is the case. It does check for a NULL pointer, but
> > this is a bad pointer.
> 
> Struct ib_at_ib_route should probably change the struct ibv_device 
> *out_dev field.  It looks like this field is actually set to a struct 
> ib_device * that is a kernel pointer.

Ah, that's the kernel pointer you were referring to. [I missed that
before.]

> Can we just remove this field and 
> use the sgid to locate the correct device structure in the kernel, or 
> fail if it cannot be located?

That seems like a good idea.
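
As a rough illustration of the sgid idea, here is a minimal sketch in plain
C. Everything in it is hypothetical: struct example_gid, struct
example_device and example_find_device_by_sgid() are stand-ins invented for
this sketch, not the real OpenIB structures or kernel calls. It only shows
the shape of the check: look the device up by source GID and fail (NULL)
when nothing matches, instead of trusting a pointer handed back from
userspace.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit GID; not the real kernel type. */
struct example_gid {
    uint8_t raw[16];
};

/* Hypothetical stand-in for a device entry known to the kernel. */
struct example_device {
    const char *name;
    struct example_gid port_gid;
};

/*
 * Find the device whose port GID equals the given source GID.
 * Returns NULL when no device matches, so the caller can reject the
 * request rather than dereference an unvalidated pointer.
 */
static const struct example_device *
example_find_device_by_sgid(const struct example_device *devs, size_t ndevs,
                            const struct example_gid *sgid)
{
    size_t i;

    if (devs == NULL || sgid == NULL)
        return NULL;

    for (i = 0; i < ndevs; i++) {
        if (memcmp(devs[i].port_gid.raw, sgid->raw, sizeof(sgid->raw)) == 0)
            return &devs[i];
    }
    return NULL;
}
```

A caller would treat a NULL return as an invalid route and return an error
to userspace.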

> >>The AT code appears to be passing a kernel pointer up to the userspace app, 
> >>and then requires that pointer to be passed back to the kernel.  This 
> >>needs to be changed to pass up some identifier that can be validated on 
> >>the return to the kernel.
> > 
> > Isn't it copying the ib_route structure to userspace ?
> 
> Yes - but that contains the kernel device pointer.  And looking at it 
> more, the ABI contains pointers in the data structures.  This should 
> cause problems with 32-bit apps running on 64-bit kernels.
> 
> I'm not sure how desirable it is to fix these issues versus moving to 
> whatever the new CM abstraction API is.

Won't AT still be needed under the new CM abstraction for IB ? I guess
the answer is unclear. It still seems to me that it should be fixed
until there is something else to take its place. Do you concur ?

-- Hal



