[openib-general] [PATCH] RDMA/iwcm: Get rid of extra call to list_empty()

Tom Tucker tom at opengridcomputing.com
Mon Nov 13 06:12:19 PST 2006

On 11/12/06 11:30 PM, "Krishna Kumar2" <krkumar2 at in.ibm.com> wrote:

> Hi Tom & Sean,
> 
>>> 
>>> There may be a race here, but...  Why wouldn't the second call into
>>> cm_work_handler simply find the list empty on entry into the call?
>> 
>> Basically, you've got a free work queue element sitting on the iwcm_wq.
>> What typically would happen is you'd end up corrupting the list because
>> the cm_event_handler would enqueue the element, it would get freed in
>> cm_work_handler with put_work, then cm_event_handler would call get_work
>> (getting the one just freed that's also sitting on the iwcm_wq list)
>> and ... bad things happen.
> 
> Actually I had anticipated this possible problem when I submitted the
> patch, and I have explained (given below) why it is not a problem:
> 
> "Doing the redundant queue_work() (if cm_work_handler is already running,
> processing the last entry) will not result in another call to
> cm_work_handler (run_workqueue) where no entry is found, since
> cm_work_handler will remove all entries from the list, even ones that are
> added late."
> 
> Isn't that correct?

No. To understand why, go look at the implementation of queue_work(). BTW,
this is not a hypothetical race; it's a bug that I fixed. Sean's approach is
the best, but I need to handle the cleanup of the work_list when the
connection is going away.
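
For reference, the queue_work() / run_workqueue() pair being discussed looked
roughly like the sketch below in the 2.6.18-era kernel/workqueue.c (simplified
from memory; locking, per-CPU handling and the single-threaded case are
omitted, so treat this as an illustration rather than the verbatim source):

    /*
     * Simplified sketch of the 2.6.18-era workqueue internals.
     */
    int queue_work(struct workqueue_struct *wq, struct work_struct *work)
    {
            int ret = 0;

            /*
             * If the pending bit is already set this is a no-op.  But if
             * it is clear, the work_struct gets linked onto the
             * workqueue's own worklist and its handler WILL be invoked
             * once more, no matter what happens to the iwcm work_list
             * in between.
             */
            if (!test_and_set_bit(0, &work->pending)) {
                    __queue_work(wq, work); /* links work->entry onto cwq->worklist */
                    ret = 1;
            }
            return ret;
    }

    static void run_workqueue(struct cpu_workqueue_struct *cwq)
    {
            while (!list_empty(&cwq->worklist)) {
                    struct work_struct *work = list_entry(cwq->worklist.next,
                                                          struct work_struct, entry);
                    void (*f)(void *) = work->func;
                    void *data = work->data;

                    list_del_init(cwq->worklist.next);
                    /*
                     * The pending bit is cleared BEFORE the handler runs,
                     * so a concurrent queue_work() on the same work_struct
                     * succeeds and re-queues it while the handler is still
                     * executing.
                     */
                    clear_bit(0, &work->pending);
                    f(data);
            }
    }

So once the extra queue_work() succeeds, that iwcm_work's work_struct stays
linked on cwq->worklist until run_workqueue() reaches it. If cm_work_handler
has meanwhile drained the cm_id's work_list and put_work() has returned the
element to the free pool, get_work() can hand the very same element out again,
and INIT_WORK() then reinitializes a work_struct that is still linked on the
workqueue's list -- which is exactly the corruption described above.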

> So if cm_work_handler() is already running and processing the LAST entry
> (anything but the last entry will not have an issue, as a new queue_work
> would not be done by iwcm), it will next find this new entry in its
> current run iteration and process it. Meanwhile iwcm had done a redundant
> queue_work() on this queue which, besides adding the new entry to the
> workqueue, also does a wakeup of worker_thread (which is still running
> the previous iteration of run_workqueue -> cm_work_handler). When
> cm_work_handler finishes removing this new entry, it returns to
> worker_thread, which calls schedule() and gets woken up again immediately
> due to the redundant queue_work done earlier; but then it checks whether
> the list is empty, and since it is, it does another schedule() call. That
> is what I meant by saying that another call to cm_work_handler() will NOT
> result (and hence there is no redundant call that would find no entry to
> process).
> 
> So I feel this patch is correct in its original form. Comments, or did I
> misunderstand the kernel code completely?
> 
> Thanks,
> 
> - KK
> 
>> 
>>> As an alternative, could you defer the list_del_init() call to the end
>>> of the loop, which would avoid scheduling cm_work_handler while it's
>>> running?
>> 
>> Yeah, that's a good idea.
>> 
>>> - Sean
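
For concreteness, Sean's suggestion might look something like the sketch below
in cm_work_handler (loosely based on the drivers/infiniband/core/iwcm.c of the
time; the process_event() helper and the locking details are illustrative
assumptions, not the actual patch):

    static void cm_work_handler(void *arg)
    {
            struct iwcm_work *work = arg;
            struct iwcm_id_private *cm_id_priv = work->cm_id;
            unsigned long flags;

            spin_lock_irqsave(&cm_id_priv->lock, flags);
            while (!list_empty(&cm_id_priv->work_list)) {
                    work = list_entry(cm_id_priv->work_list.next,
                                      struct iwcm_work, list);
                    /*
                     * No list_del_init() here: leaving the element on
                     * work_list keeps cm_event_handler's list_empty()
                     * test false, so it will not queue_work() again
                     * while this handler is still running.
                     */
                    spin_unlock_irqrestore(&cm_id_priv->lock, flags);

                    process_event(cm_id_priv, &work->event); /* hypothetical */

                    spin_lock_irqsave(&cm_id_priv->lock, flags);
                    /* Only now unlink the element and recycle it. */
                    list_del_init(&work->list);
                    put_work(work);
            }
            spin_unlock_irqrestore(&cm_id_priv->lock, flags);
    }

The element is recycled via put_work() only after its event has been fully
processed, so a free element can never be left sitting on the iwcm_wq. As
noted above, the work_list still has to be cleaned up when the connection is
torn down, which is the part Tom says he needs to handle.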
