[openib-general] kernel oops

Sean Hefty mshefty at ichips.intel.com
Tue Aug 30 09:04:32 PDT 2005


Hal Rosenstock wrote:
> Why would ib_at_paths_by_route be called if no route were obtained (from
> ib_at_route_by_ip) ? Isn't that a ucmpost issue ? (I also agree it's not
> good for UAT to crash).

The assumption that I made was that the call to ib_at_route_by_ip() 
would fail if given an invalid route.  Also, since ucmpost is a simple 
test app designed more to test the CM than AT, I kept error testing to a 
minimum.

> It needs to be a valid route struct. I'm not sure how the kernel can
> validate that that is the case. It does check for a NULL pointer, but
> this is a bad pointer.

The struct ibv_device *out_dev field in struct ib_at_ib_route should 
probably change.  It looks like this field is actually set to a struct 
ib_device *, which is a kernel pointer.  Can we just remove this field and 
use the sgid to locate the correct device structure in the kernel, or 
fail if it cannot be located?
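
As a rough illustration of the sgid lookup idea, here is a minimal sketch in plain C.  It is not the AT code: struct gid, struct device_entry, and find_device_by_sgid() are hypothetical stand-ins, and the table takes the place of whatever device list the kernel actually keeps.  The point is only that the kernel resolves the device from a value it can check, and fails the request when nothing matches, instead of trusting a pointer handed back from userspace.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real openib types. */
struct gid {
    uint8_t raw[16];
};

struct device_entry {
    struct gid sgid;   /* source GID of one port on this device */
    const char *name;  /* placeholder for the kernel's device object */
};

/*
 * Return the table entry whose source GID matches, or NULL if none does.
 * The caller treats NULL as "fail the request" rather than dereferencing
 * anything supplied by userspace.
 */
static const struct device_entry *
find_device_by_sgid(const struct device_entry *table, size_t count,
                    const struct gid *sgid)
{
    for (size_t i = 0; i < count; i++) {
        if (memcmp(table[i].sgid.raw, sgid->raw, sizeof(sgid->raw)) == 0)
            return &table[i];
    }
    return NULL;
}
```

With something along these lines, a route carrying an unknown or corrupted sgid would be rejected with an error instead of causing an oops.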

>>The AT code appears to be passing a kernel pointer up to the userspace 
>>app, and then requires that pointer to be passed back to the kernel.  
>>This needs to be changed to pass up some identifier that can be 
>>validated on the return to the kernel.
> 
> Isn't it copying the ib_route structure to userspace ?

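Yes - but that contains the kernel device pointer.  And looking at it 
more, the ABI contains pointers in the data structures.  This should 
cause problems with 32-bit apps running on 64-bit kernels, since a 
pointer is 4 bytes in the app and 8 bytes in the kernel, so the two 
sides disagree on the structure layout.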
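
To make the 32-bit/64-bit concern concrete, here is a small sketch.  The structs req_with_pointer and req_fixed and the two helper functions are hypothetical and are not taken from the AT ABI; they only contrast a layout that depends on pointer size with one built from fixed-width fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ABI struct: its size and layout depend on pointer width. */
struct req_with_pointer {
    void *buf;      /* 4 bytes in a 32-bit app, 8 bytes in a 64-bit kernel */
    uint32_t len;
};

/* Same request with fixed-width fields: one layout for both sides. */
struct req_fixed {
    uint64_t buf;       /* user address carried as a 64-bit integer */
    uint32_t len;
    uint32_t reserved;  /* explicit padding keeps the size a multiple of 8 */
};

/* These hold whether the code is built as 32-bit or 64-bit. */
_Static_assert(sizeof(struct req_fixed) == 16, "fixed layout is 16 bytes");
_Static_assert(offsetof(struct req_fixed, len) == 8, "len sits at offset 8");

/* Convert between a pointer and its 64-bit ABI representation. */
static inline uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

static inline void *u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

A fixed-width address still has to be validated by the kernel before use; this only removes the layout mismatch between the two word sizes.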

I'm not sure how desirable it is to fix these issues versus moving to 
whatever the new CM abstraction API is.

- Sean


