[ofw] HCA Description field format differences between Windows & Linux?

Fab Tillier ftillier at microsoft.com
Thu Jun 3 10:34:17 PDT 2010


Smith, Stan wrote on Thu, 3 Jun 2010 at 10:17:54

> Fab Tillier wrote:
>> How do the Linux drivers set this?  Isn't there a local MAD you issue
>> to the HCA to set the node description?
> 
> I don't recollect finding a user-mode pgm to generate a local MAD to
> set NodeDescription nor an IF to HCA drivers to handle said local MAD.
> 
> Sending a local MAD seems like a bit of over-engineering, when the HCA
> driver has access to hostname & node_desc?
> 
> Currently the Windows HCA drivers use the .DeviceDesc string from
> their .inf files (mthca.inf or mlx4_hca.inf).

IBAL used to handle the node description query itself and bypass the HCA entirely.  This had several benefits, one being that an SM querying for node description while sweeping the fabric for changes could get a response without having to schedule the passive-level thread required for local MAD processing.  This can be quite significant, since the local MAD processing thread competes for CPU cycles with the application threads, whereas the DPC does not.  A properly written HPC app can have long quantums, and a delay in responding to an SM query during a sweep can have drastic repercussions (i.e. the SM removes the node from all the mcast groups it belongs to, believing it is dead).  This is why I absolutely hate the passive-level thread requirement in the HCA driver.  I've been ranting about this, specifically targeting local MAD and QP transitions, for 5 years or so now, but still nothing from the HCA driver developers, despite new devices and associated drivers coming about.
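
To make the idea concrete, here is a rough sketch of answering a node description Get from a cached string, without touching the HCA.  The struct and function names are hypothetical, not IBAL's actual code; only the class, method and attribute values come from the IB spec.  It is plain C so it can be tried in user mode, and it deliberately skips the M_Key check.

```c
#include <stdint.h>
#include <string.h>

#define SMP_CLASS_LID_ROUTED 0x01   /* subnet management class, LID routed */
#define SMP_CLASS_DIRECTED   0x81   /* subnet management class, directed route */
#define SMP_METHOD_GET       0x01   /* Get method */
#define SMP_ATTR_NODE_DESC   0x0010 /* NodeDescription attribute ID */
#define NODE_DESC_SIZE       64     /* NodeDescription is a 64-byte field */

/* Hypothetical, simplified view of the SMP header fields needed here.
 * attr_id is assumed to be in host byte order already. */
struct smp_summary {
    uint8_t  mgmt_class;
    uint8_t  method;
    uint16_t attr_id;
};

/* Returns 1 and fills 'out' if the request is a Get of NodeDescription,
 * so the caller could reply straight away.  Returns 0 if the request
 * must take the normal (local MAD) path.  No M_Key check is done. */
static int try_fast_node_desc(const struct smp_summary *smp,
                              const char *cached_desc,
                              uint8_t out[NODE_DESC_SIZE])
{
    size_t len;

    if (smp->mgmt_class != SMP_CLASS_LID_ROUTED &&
        smp->mgmt_class != SMP_CLASS_DIRECTED)
        return 0;
    if (smp->method != SMP_METHOD_GET || smp->attr_id != SMP_ATTR_NODE_DESC)
        return 0;

    len = strlen(cached_desc);
    if (len > NODE_DESC_SIZE)
        len = NODE_DESC_SIZE;           /* truncate to the field size */

    memset(out, 0, NODE_DESC_SIZE);     /* zero-pad the unused tail */
    memcpy(out, cached_desc, len);
    return 1;
}
```

Anything that is not a Get of NodeDescription falls through to the existing path, so a check like this could sit in front of the local MAD call without changing behavior for other attributes.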

If we could implement the MKey checks in IBAL, we could go back to eliminating the round trip to the HCA when responding to certain MADs, and we then could format the node description however we wanted.
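
As an illustration of what formatting the node description ourselves might look like, here is a minimal sketch that builds the 64-byte field from a hostname and an HCA description string.  The function name and the "hostname, space, description" layout are assumptions for the example, not what any existing driver does.

```c
#include <stdio.h>
#include <string.h>

#define NODE_DESC_SIZE 64   /* NodeDescription is a 64-byte field */

/* Hypothetical helper: writes "<hostname> <hca_desc>" into 'out',
 * truncated to 64 bytes and zero-padded.  Returns the number of bytes
 * used.  Truncation is byte-wise and may split a multi-byte UTF-8
 * character; a real implementation would want to handle that. */
static size_t format_node_desc(char out[NODE_DESC_SIZE],
                               const char *hostname,
                               const char *hca_desc)
{
    char tmp[256];
    size_t len;
    int n = snprintf(tmp, sizeof(tmp), "%s %s", hostname, hca_desc);

    if (n < 0)
        n = 0;                          /* formatting error: empty result */
    len = (size_t)n;
    if (len >= sizeof(tmp))
        len = sizeof(tmp) - 1;          /* snprintf truncated into tmp */
    if (len > NODE_DESC_SIZE)
        len = NODE_DESC_SIZE;           /* clamp to the field size */

    memset(out, 0, NODE_DESC_SIZE);     /* zero-pad the unused tail */
    memcpy(out, tmp, len);
    return len;
}
```

Note that the result is not guaranteed to be null-terminated when all 64 bytes are used, so callers should treat it as a fixed-size field rather than a C string.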

>> I would mimic whatever the Linux code does for consistency, in how it
>> pushes the node description into the HCA.  Look at
>> __read_machine_name in bus_driver.c for how the hostname was read
>> from the registry.  Heck, you can even just reference node_desc (it's
>> declared as an extern char in al_smi.c)
>> 
>> Perhaps push it into the HCA when the SMI client gets initialized?
> 
> New IF required?

Where in the Linux driver is the 'updated' node description set in a node description query response?  I would think whatever mechanism is used there would be available on Windows as well?

-Fab


