[ofw] crash in mlx4 driver

Fab Tillier ftillier at windows.microsoft.com
Mon Mar 16 11:23:03 PDT 2009


>>> I've added this field (p_hca_obj) for WinVerbs, because it needs it
>>> now for performing QueryCa. Before that WinVerbs was using p_hca_dev
>>> field, which was wrong, because this field contains HCA's PDO.
>>
>> As Fab pointed out, the field had a dual use.
>
> I do not quite agree with that. As far as I remember, it was used only
> as PDO before WinVerbs\WinMad came.

Yes, before WinVerbs\WinMad, the field had a single use - reporting the PDO of the HCA to IBAL.  The RDMA_INTERFACE_VERBS overloaded the field to return the existing internal open CA instance, so that IBAL and WinVerbs share the same open CA.


>> It was set to
>> the HCA object (and not the PDO), when responding to the
>> query interface call.  (Path of minimal
>> change.)
>
> Once more, my believe that it was WinVerbs\WinMad, that stored in this
> field the pointer to an internal structure of HCA driver.
> They should have gotten this pointer by calling open_ca like IBAL does
> in create_ci_ca().

The problem is that if open_ca was designed and implemented with a single client in mind.  WinVerbs doesn't want to override the async event callback for IBAL.

>>> In fact, WinVerbs can perform QueryCa using the existing interface
>>> like IBAL does. If Sean will change WinVerbs to do that, he can remove
>>> this field as unnecessary.
>>
>> I'm not following this.  I defined the RDMA_INTERFACE_VERBS
>> structure for winverbs.  What do you mean by 'perform
>> QueryCa'?  Do you mean obtain this interface?
>  I mean, that WinVerbs\WinMad were using this field (p_hca_dev) to
> perform QueryCa. But QueryCa requires CA handle, so this field was
> overwritten with the value of CA handle, which made it to have the "dual
> usage". The right way was not to touch this field at all, but to perform
> OpenCa, which returns the needed handle. I added a new field (p_hca_obj)
> for QueryCa, leaving this change in WinVerbs\WinMad for you. But if you
> agree with the above and have no time for that, I can change
> WinVerbs\WinMad and remove this field back. It is really unnecessary.

There's nothing wrong with overloading the p_hca_dev, it should have been better documented.  WinVerbs has no use for the PDO, so returning it isn't useful.

You'd have to extend open_ca to indicate that all you want returned is the handle, without overwriting the context/async callback to allow WinVerbs to call it.

>> Now that IBAL is in the same device stack as the HCA, does it
>> need the PDO exchanged through the ci_interface_t?
>>
> I believe - no, but it is used now in bus_port_mgr, bus_iou_mgr and
> IBAL. If we store PDO in IBAL CA object, we can remove it from the
> interface. Fab ?

Yes, you should be able to remove the PDO and figure it out internally without a problem.  I believe the work on the bus driver is already addressing this.

>> My preference is to undo the change to struct
>> RDMA_INTERFACE_VERBS, but if it must be kept for some reason,
>> then at the very least the GUID should be updated.
>> This would cause incompatible builds of the drivers to fail
>> to obtain the interfaces.

Longer term I think it would make sense to extend open_ca to not set the asyn callback if it is NULL, to allow clients to retrieve the HCA handle. Coupled with making IBAL and the bus driver find the PDO on their own would allow removing the p_hca_dev member of the CI interface completely, which would be a cleaner solution.

-Fab



More information about the ofw mailing list