[openib-general] kernel oops
Sean Hefty
mshefty at ichips.intel.com
Mon Aug 29 15:24:14 PDT 2005
Viswanath Krishnamurthy wrote:
> Call Trace:
> [<c013e410>] __alloc_pages+0x166/0x3b6
> [<c0267637>] ib_get_client_data+0x14/0x54
> [<c027390f>] ib_sa_path_rec_get+0x1b/0x13e
> [<c027952f>] resolve_path+0x8c/0x15b
> [<c0278ff2>] path_req_complete+0x0/0xf7
> [<c02a9932>] rtnetlink_dump_all+0x0/0x9e
> [<c02a9a6d>] rtnetlink_done+0x0/0x3
> [<c02799d3>] ib_at_paths_by_route+0xc4/0xd9
> [<c0278aed>] same_path_req+0x0/0x95
> [<c027a53d>] ib_uat_paths_by_route+0xef/0x1c4
> [<c02a9932>] rtnetlink_dump_all+0x0/0x9e
> [<c02a9a6d>] rtnetlink_done+0x0/0x3
> [<c027ac87>] ib_uat_write+0x96/0xa2
> [<c01567fe>] vfs_write+0x108/0x10a
> [<c01568ab>] sys_write+0x41/0x6a
> [<c01035eb>] sysenter_past_esp+0x54/0x75
Hal, I've looked into this more, and this is what appears to be
happening. Ucmpost calls ib_at_route_by_ip(), followed by
ib_at_paths_by_route(). The first call fails asynchronously, and
ucmpost ignores that failure. It then expects the call to
ib_at_paths_by_route() to fail synchronously because of the invalid input.
The AT code in the kernel assumes that the ib_route passed into
ib_at_paths_by_route() is valid and dereferences a device pointer, which I
think is causing this crash. Can you confirm that this is what the code
is doing?
The AT code appears to be passing a kernel pointer up to the userspace app,
and then requires that pointer to be passed back to the kernel. This
needs to be changed to pass up some identifier that can be validated on
the return to the kernel.
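
As a rough illustration of the identifier idea, here is a minimal userspace
sketch, not the actual AT code. The names route_entry, route_table,
route_id_alloc(), route_id_lookup() and route_id_free() are all hypothetical,
and locking is left out. The kernel keeps the real state in a table and hands
userspace only an opaque 32-bit id (slot index plus a generation count), so a
stale or made-up id is rejected instead of being dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_ROUTES 16           /* hypothetical table size, must be <= 256 */

/* Hypothetical stand-in for the kernel-side route state. */
struct route_entry {
    int      in_use;            /* slot currently holds a live route */
    uint32_t generation;        /* bumped on every allocation of this slot */
    void    *dev;               /* pointer that must never leave the kernel */
};

static struct route_entry route_table[MAX_ROUTES];

/* Allocate a slot and return an opaque id, or 0 if the table is full.
 * The id is (generation << 8) | slot, and generation is never 0,
 * so 0 is never a valid id. */
uint32_t route_id_alloc(void *dev)
{
    for (uint32_t slot = 0; slot < MAX_ROUTES; slot++) {
        struct route_entry *e = &route_table[slot];
        if (e->in_use)
            continue;
        e->generation = (e->generation + 1) & 0xFFFFFF;
        if (e->generation == 0)
            e->generation = 1;  /* skip 0 when the counter wraps */
        e->in_use = 1;
        e->dev = dev;
        return (e->generation << 8) | slot;
    }
    return 0;
}

/* Validate an id coming back from userspace. Returns NULL for an
 * out-of-range slot, a free slot, or a generation mismatch. */
struct route_entry *route_id_lookup(uint32_t id)
{
    uint32_t slot = id & 0xFF;
    uint32_t gen  = id >> 8;

    if (slot >= MAX_ROUTES)
        return NULL;
    if (!route_table[slot].in_use || route_table[slot].generation != gen)
        return NULL;
    return &route_table[slot];
}

/* Release an id. Returns 0 on success, -1 if the id was not valid. */
int route_id_free(uint32_t id)
{
    struct route_entry *e = route_id_lookup(id);

    if (!e)
        return -1;
    e->in_use = 0;
    return 0;
}
```

With something along these lines, ib_at_paths_by_route() could look up the id
first and fail synchronously when the lookup returns NULL, which is the
behavior ucmpost seems to expect.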
- Sean