[openib-general] [PATCH] RDMA/iwcm: Get rid of extra call to list_empty()

Tom Tucker tom at opengridcomputing.com
Fri Nov 10 10:50:00 PST 2006




On 11/10/06 11:32 AM, "Sean Hefty" <mshefty at ichips.intel.com> wrote:

> Tom Tucker wrote:
>> This patch adds a race that causes an event to be processed twice.
>> 
>> Each cm_id has a list of pending work stuck on cm_id_priv->work_list.
>> When new work is created, cm_event_handler needs to know whether it
>> needs to schedule a call to cm_work_handler by enqueuing work on the
>> iwcm_wq or whether it can just append it to the list of work already
>> scheduled. 
>> 
>> The race is that the cm_work_handler removes the last element on the
>> list (leaving it empty) and then releases the lock. The cm_event_handler
>> checks the list, finds it empty and so schedules another call to
>> cm_work_handler (which is still running). cm_work_handler processes the
>> work (the previous end of the list), then takes the lock again, checks the
>> list and, voila, finds the work just added by cm_event_handler, so it
>> processes it. But that work is still queued on the iwcm_wq. So when the
>> cm_work_handler runs again, it processes that same work element a second
>> time -- causing all kinds of misadventure.
> 
> There may be a race here, but...  Why wouldn't the second call into
> cm_work_handler simply find the list empty on entry into the call?

Basically, you've got a free work queue element sitting on the iwcm_wq. What
would typically happen is that you'd end up corrupting the list: cm_event_handler
would enqueue the element, it would get freed in cm_work_handler with put_work,
then cm_event_handler would call get_work (getting the one just freed, which is
also still sitting on the iwcm_wq list) and ... bad things happen.
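
To make the interleaving concrete, here's a rough sketch of the two paths as
described above, with the handler loop keyed off list_empty(). The names follow
iwcm.c, but this is only an illustration of the pattern (process_event() standing
in for the per-event dispatch), not the literal source:

    /* cm_event_handler(): runs in the provider's event callback. */
    spin_lock_irqsave(&cm_id_priv->lock, flags);
    work = get_work(cm_id_priv);             /* grab a free work element  */
    work->event = *iw_event;
    if (list_empty(&cm_id_priv->work_list)) {
            /* RACE WINDOW: cm_work_handler may have just emptied the
             * list but still be processing its last element.  We see
             * "empty" and queue a second handler invocation anyway.    */
            list_add_tail(&work->list, &cm_id_priv->work_list);
            queue_work(iwcm_wq, &work->work);
    } else
            list_add_tail(&work->list, &cm_id_priv->work_list);
    spin_unlock_irqrestore(&cm_id_priv->lock, flags);

    /* cm_work_handler(): runs on the iwcm_wq. */
    spin_lock_irqsave(&cm_id_priv->lock, flags);
    while (!list_empty(&cm_id_priv->work_list)) {
            work = list_entry(cm_id_priv->work_list.next,
                              struct iwcm_work, list);
            list_del_init(&work->list);      /* list may now be empty...  */
            levent = work->event;
            put_work(work);                  /* ...and the element is free */
            spin_unlock_irqrestore(&cm_id_priv->lock, flags);
            /* Lock dropped: cm_event_handler can run here, see an empty
             * list, and queue_work() again even though we're not done.  */
            process_event(cm_id_priv, &levent);
            spin_lock_irqsave(&cm_id_priv->lock, flags);
            /* Next pass finds the element cm_event_handler just added
             * and consumes it -- but its queue_work() is still pending,
             * so the same work gets handled twice, or, once put_work()
             * has recycled it, the free list gets corrupted.            */
    }
    spin_unlock_irqrestore(&cm_id_priv->lock, flags);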

> As an 
> alternative, could you defer the list_del_init() call to the end of the loop,
> which would avoid scheduling cm_work_handler while it's running?

Yeah, that's a good idea.
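Something like the following, I think (just a sketch -- the event still has
to be copied out under the lock before it's processed):

    spin_lock_irqsave(&cm_id_priv->lock, flags);
    while (!list_empty(&cm_id_priv->work_list)) {
            work = list_entry(cm_id_priv->work_list.next,
                              struct iwcm_work, list);
            levent = work->event;            /* copy out while locked     */
            spin_unlock_irqrestore(&cm_id_priv->lock, flags);

            process_event(cm_id_priv, &levent);

            spin_lock_irqsave(&cm_id_priv->lock, flags);
            list_del_init(&work->list);      /* deferred: the element stays
                                              * on the list while its event
                                              * is being processed        */
            put_work(work);
    }
    spin_unlock_irqrestore(&cm_id_priv->lock, flags);

That way the list_empty() test in cm_event_handler can't come back true until
the handler has really drained everything, so it never queues a second
invocation mid-run.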

> 
> - Sean





