[ofw] RE: Ibbus as a filter driver for mthca.

Smith, Stan stan.smith at intel.com
Wed Aug 6 08:46:38 PDT 2008


Fab Tillier wrote:
> Hi Stan,
>
>> Fab Tillier wrote:
>>>>> 2. Attached DevTree images after clean install on a machine with
>>>>> one HCA (mt25204) which has one port.
>>>>>    From the images it seems that ibbus reports on the same port
>>>>>    twice and mthca also reports for this port. (When tested on
>>>>>    machine with 2 HCA with 2 port each, 8 child objects were
>>>>>    reported by ibbus and 4 by mlx4_hca) I think that mthca report
>>>>>    happens since get_relation function is still in
>>>>>    p_ci_interface and mthca replies to PnP messages using this
>>>>> function (which goes to ibbus get_relation). I don't have a clue
>>>>> about ibbus reports.
>>>>
>>>> Interesting, there does seem to be duplication in reporting
>>>> relations. As I remember, mthca uses the
>>>> p_ci_interface->get_relations call only when mthca receives a
>>>> QUERY_REMOVE_DEVICE; removal relations reporting. Since the bus
>>>> driver is correctly reporting bus relations and there exists an
>>>> implicit relation between the bus driver and mthca driver (same
>>>> device stack, mthca cannot unload until its instance of ibbus
>>>> unloads). Possibly, the mthca driver should not be reporting
>>>> relations like it does; one step closer to simple?  Do you know
>>>> where in mthca relations are reported?
>>>
>>> This could be the problem with the reboot.  An easy work around
>>> without changing mthca is to have the get_relations function return
>>> without updating the child device list (basically turn it into a
>>> noop).
>>
>> In the ibbus:fdo_query_bus_relations() call the
>> DEVICE_QUERY_RELATIONS irp needs to be completed and not passed
>> down to the HCA driver. When the IRP is passed down to mthca, mthca
>> uses the CI->get_relations upcall to ibbus which then reports the
>> same relations as fdo_query_bus_relations reported earlier, hence
>> the duplication of relations. HCA drivers should not be asked about
>> bus relations.
>
> Completing the request without passing it down is wrong, the whole
> point of these IRPs is to allow each driver in the device stack to
> participate as needed.  The issue with the bus driver being in the
> same device stack as the HCA is that devices now get reported twice -
> once by the bus driver, once by the HCA driver.  The relations
> reported by the HCA driver need to be eliminated, and there's two
> ways of doing this.  One is to make get_relations a noop, and leave
> the HCA driver unchanged.  The other is to remove bus relations
> processing from the HCA driver.  For the time being I would go with
> the first (get_relations is a noop) to minimize the changes.  Once
> this works well, a subsequent changeset can eliminate get_relations
> and the bus relations handling from the HCA drivers.
>
> -Fab

I understand what you're saying, although semantically we reach the same point either way: bus relations are reported once. What's the advantage of having the HCA driver run a no-op function just to complete an IRP?
I will revert the change, pass the IRP down to the HCA driver, and let it no-op and complete the IRP.
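
To make the two behaviors concrete, here is a minimal user-space sketch of the duplication and of the no-op workaround. It is not driver code: relations_t, query_bus_relations(), get_relations_report() and get_relations_noop() are hypothetical names made up for illustration, and the "IRP" is just a small array that the mock bus handler fills in before handing it to the mock HCA handler.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RELATIONS 16

/* Hypothetical stand-in for the relations list carried by the IRP. */
typedef struct {
    int    port_ids[MAX_RELATIONS];
    size_t count;
} relations_t;

/* Upcall type used by the mock HCA handler, modeled loosely on get_relations. */
typedef void (*get_relations_fn)(relations_t *rel, int port_id);

static void add_relation(relations_t *rel, int port_id)
{
    assert(rel->count < MAX_RELATIONS);
    rel->port_ids[rel->count++] = port_id;
}

/* Problem case: the upcall reports the same child a second time. */
void get_relations_report(relations_t *rel, int port_id)
{
    add_relation(rel, port_id);
}

/* Workaround: the upcall leaves the child list untouched. */
void get_relations_noop(relations_t *rel, int port_id)
{
    (void)rel;
    (void)port_id;
}

/* Mock HCA handler: reached when the request is passed down the stack. */
static void hca_query_bus_relations(relations_t *rel, int port_id,
                                    get_relations_fn upcall)
{
    upcall(rel, port_id);
}

/* Mock bus handler: report the child once, then pass the request down.
 * Returns the number of relations reported for the single port. */
size_t query_bus_relations(int port_id, get_relations_fn upcall)
{
    relations_t rel = { {0}, 0 };

    add_relation(&rel, port_id);
    hca_query_bus_relations(&rel, port_id, upcall);
    return rel.count;
}
```

With get_relations_report as the upcall the single port shows up twice; with get_relations_noop it shows up once, and the HCA handler still gets to see the request.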

BTW, when I do this the HCA driver will not unload due to 4 lingering references on the hca_dev_ext_t p_ext->usecnt. 'usecnt' is decremented in hca_verbs.c:mlnx_um_close().
It is not clear at this juncture who has opened the HCA and not closed it. When loading mthca.inf I cancelled the IPoIB driver loads, so it remains somewhat of a mystery.
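
One way to narrow down who holds the references would be to tag each open and close with its caller. The following is only a user-space sketch of that idea under stated assumptions: dev_ext_dbg_t, dbg_open(), dbg_close() and dbg_lingering() are hypothetical debug helpers, not anything that exists in mthca, and the real usecnt handling in hca_verbs.c may look quite different.

```c
#include <assert.h>
#include <string.h>

#define MAX_OWNERS 8

/* Hypothetical per-caller record: how many opens a caller still holds. */
typedef struct {
    const char *tag;
    int         refs;
} owner_t;

/* Hypothetical debug mirror of the device extension's use count. */
typedef struct {
    int     usecnt;
    owner_t owners[MAX_OWNERS];
    int     n_owners;
} dev_ext_dbg_t;

/* Find the record for a tag, creating it on first use.
 * The tag string must outlive the structure (string literals are fine). */
static owner_t *find_owner(dev_ext_dbg_t *ext, const char *tag)
{
    for (int i = 0; i < ext->n_owners; i++) {
        if (strcmp(ext->owners[i].tag, tag) == 0)
            return &ext->owners[i];
    }
    assert(ext->n_owners < MAX_OWNERS);
    owner_t *o = &ext->owners[ext->n_owners++];
    o->tag  = tag;
    o->refs = 0;
    return o;
}

/* Take a reference and charge it to the caller's tag. */
void dbg_open(dev_ext_dbg_t *ext, const char *tag)
{
    ext->usecnt++;
    find_owner(ext, tag)->refs++;
}

/* Drop a reference previously charged to the caller's tag. */
void dbg_close(dev_ext_dbg_t *ext, const char *tag)
{
    ext->usecnt--;
    find_owner(ext, tag)->refs--;
}

/* Number of references a given caller has not yet released. */
int dbg_lingering(dev_ext_dbg_t *ext, const char *tag)
{
    return find_owner(ext, tag)->refs;
}
```

Dumping the per-tag counts at the point where the unload is refused would show which caller the 4 lingering references belong to.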




