[ofa-general] Re: [PATCH] ib/mad: fix incorrect access to items on local_list

Dotan Barak dotanb at dev.mellanox.co.il
Wed Dec 5 07:40:21 PST 2007


Hi.


Sean Hefty wrote:
> In cancel_mads(), MADs are moved from the wait_list and local_list
> to a cancel_list for processing.  However, the structures on these two
> lists are not the same.  The wait_list references struct
> ib_mad_send_wr_private, but local_list references struct
> ib_mad_local_private.  Cancel_mads() treats all items moved to the
> cancel_list as struct ib_mad_send_wr_private.  This leads to a system
> crash when requests are moved from the local_list to the cancel_list.
>
> Fix this by leaving local_list alone.  All requests on the local_list
> have completed and are just awaiting processing by a queued worker thread.
>
> Bug (crash) reported by Dotan Barak <dotanb at dev.mellanox.co.il>.
> Problem with local_list access reported by Robert Reynolds
> <rreynolds at opengridcomputing.com>.
>
> Signed-off-by: Sean Hefty <sean.hefty at intel.com>
> ---
> This patch is untested.  Dotan, can you see if this fixes the crash that
> you were seeing?
>   
Just want to let you know that I didn't forget about this issue.

I tried to reproduce the failure before applying the patch, but this one
is not easy to reproduce.

I will give you feedback as soon as I have some.
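
For anyone following the thread, the mismatch described in the quoted patch
description can be sketched in user space. This is only an illustration under
stated assumptions, not the actual ib_mad code: struct send_wr_sketch,
struct local_sketch, their fields, and collect_for_cancel() are hypothetical
stand-ins, and the list helpers are minimal re-implementations of the kernel
list primitives of the same names. The point it shows is that the link node
sits at a different offset in each structure, so entries from both lists
cannot be walked as one type, and that moving only the wait_list entries
leaves the local_list untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal user-space stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static inline void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

static inline void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Move every entry of src to the tail of dst and leave src empty. */
static inline void list_splice_tail_init(struct list_head *src,
                                         struct list_head *dst)
{
    struct list_head *first = src->next;
    struct list_head *last = src->prev;

    if (first == src)
        return; /* nothing to move */

    first->prev = dst->prev;
    dst->prev->next = first;
    last->next = dst;
    dst->prev = last;
    list_init(src);
}

static inline int list_count(const struct list_head *h)
{
    int n = 0;
    const struct list_head *p;

    for (p = h->next; p != h; p = p->next)
        n++;
    return n;
}

/* Hypothetical layout: link node is NOT the first member. */
struct send_wr_sketch {
    uint64_t wr_id;
    struct list_head agent_list; /* links the entry on the wait list */
    int status;
};

/* Hypothetical layout: link node IS the first member. */
struct local_sketch {
    struct list_head completion_list; /* links the entry on the local list */
    void *mad_priv;
};

/*
 * Sketch of the approach in the patch description: only wait_list entries
 * are moved to cancel_list. The local list is not a parameter at all, so
 * its entries can never be misread as struct send_wr_sketch.
 */
static inline void collect_for_cancel(struct list_head *wait_list,
                                      struct list_head *cancel_list)
{
    assert(wait_list != cancel_list);
    list_splice_tail_init(wait_list, cancel_list);
}
```

Because the two offsets differ, a container_of()-style conversion using the
send_wr type on a local_list entry would point at the wrong address, which
matches the kind of crash described in the patch.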


Dotan


