[ewg] [PATCH]IPOIB/CM fix for bug# 906 -OFED-1.3

Stefan Roscher ossrosch at linux.vnet.ibm.com
Wed Feb 13 02:08:56 PST 2008


On Wednesday 13 February 2008 09:04:53 Or Gerlitz wrote:
> Pradeep Satyanarayana wrote:
> > This patch fixes -fail to destroy ipoib rx QP (https://bugs.openfabrics.org/show_bug.cgi?id=906)
> > Hence the usecnt issue reported previously on ehca is solved and allows the qp to be destroyed.
> > 
> > As per Eli's request, I am splitting up the patches. This is first portion of yesterday's patch.
> > Tested on ppc64 machines with ehca and mthca.
> 
> Also here, does this problem exist in the 2.6.25-rc1 upstream code as 
> well? from the change log I don't understand the source of the problem 
> (only the symptom of failing to destroy ipoib/cm rx QP) and the solution.
> 
> Or.

Hi,
Yes, this problem also exists in 2.6.25-rc1. It was introduced by a patch from Roland:
http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=efcd99717f76c6d19dd81203c60fe198480de522

In ipoib_cm_dev_stop(), the error, drain and flush lists are put into a local list after a timeout.
Previously, a list_for_each_entry loop iterated over this local list and destroyed every QP on it.
With the patch above, that list_for_each_entry loop moved into ipoib_cm_free_rx_reap_list(),
which iterates the device's reap_list rather than the former local list, so the QPs on the local list are never destroyed.
Pradeep's patch now moves all QPs from the error, drain and flush lists onto the reap_list after the timeout, so that they are all freed in ipoib_cm_free_rx_reap_list().
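
To make the list handling concrete, here is a rough userspace sketch of the idea. It is not the actual ipoib_cm.c code: the list helpers are simplified stand-ins for the kernel's list API, and every demo_* name is made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's circular doubly linked list. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static int list_is_empty(const struct list_node *head)
{
    return head->next == head;
}

static void list_add_tail_node(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

static void list_del_node(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = node->next = node;
}

/* Move every entry of src to the tail of dst and leave src empty. */
static void list_splice_tail_and_init(struct list_node *src, struct list_node *dst)
{
    if (list_is_empty(src))
        return;
    struct list_node *first = src->next, *last = src->prev;
    first->prev = dst->prev;
    dst->prev->next = first;
    last->next = dst;
    dst->prev = last;
    list_init(src);
}

/* Hypothetical RX context; 'list' must stay the first member (see cast below). */
struct demo_rx {
    struct list_node list;
    int qp_destroyed;
};

/* Hypothetical per-device state with the four lists discussed above. */
struct demo_dev {
    struct list_node error_list, drain_list, flush_list, reap_list;
};

static void demo_dev_init(struct demo_dev *dev)
{
    list_init(&dev->error_list);
    list_init(&dev->drain_list);
    list_init(&dev->flush_list);
    list_init(&dev->reap_list);
}

/* After the timeout: put everything on reap_list, not on a local list. */
static void demo_stop_after_timeout(struct demo_dev *dev)
{
    list_splice_tail_and_init(&dev->error_list, &dev->reap_list);
    list_splice_tail_and_init(&dev->drain_list, &dev->reap_list);
    list_splice_tail_and_init(&dev->flush_list, &dev->reap_list);
}

/* Only reap_list is walked here, so entries left elsewhere would leak. */
static int demo_free_reap_list(struct demo_dev *dev)
{
    int freed = 0;
    while (!list_is_empty(&dev->reap_list)) {
        struct list_node *node = dev->reap_list.next;
        list_del_node(node);
        ((struct demo_rx *)node)->qp_destroyed = 1; /* stands in for destroying the QP */
        freed++;
    }
    assert(list_is_empty(&dev->reap_list));
    return freed;
}
```

If demo_stop_after_timeout() spliced the three lists into a function-local list instead, demo_free_reap_list() would find nothing to free, which mirrors the symptom in bug 906.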

Stefan


