[ofiwg] getting hardware details from libfabric

Jeff Squyres (jsquyres) jsquyres at cisco.com
Tue May 29 06:05:46 PDT 2018


This is a great start.  More below.


> On May 24, 2018, at 8:49 PM, Hefty, Sean <sean.hefty at intel.com> wrote:
> 
> This is an initial proposal:
> 
> The device attributes are per fi_info structure (which includes the network address, domain, and ep attributes).  The app does not need to open any resources to obtain the data.
> 
> 
> When fi_getinfo() is called with FI_DEVICE_ATTR flag set, each fi_info::handle will reference a struct fid_attr, if valid.
> 
> 
> // Definition of struct fid_attr.  Field types could change.
> struct fid_attr {
> 	struct fid;  /* fclass = FI_TYPE_DEVICE_ATTR */
> 	struct dev_attr {
> 		char *interface;
> 		char *device_id;
> 		char *device_version;
> 		char *vendor_id;
> 		char *firmware;
> 	};

It would probably be good to have some guidance for what to put in these fields.  Nothing too restrictive, but something to help vendors/provider authors have some semblance of commonality between these fields (I speak as a vendor who got burned on this exact issue before ;-) ).

Here's a first cut:

Interface: OS device name (if this is a filename, it should be an absolute filename)
Device ID: ...I'm not clear on how this is different than vendor ID?
Device version: Vendor hardware version
Vendor ID: Vendor hardware part ID
Firmware: Vendor firmware version

Should we add a driver name field?
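
To make the guidance a bit more concrete, here is a rough sketch of how it could be written down as comments on the structure, plus a tiny check for the "absolute filename" rule.  The _example names, the "driver" field, and the dev_attr_interface_ok() helper are all made up for illustration; none of them are part of the proposal.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only: mirrors the proposed dev_attr fields, with the
 * suggested guidance as comments.  "driver" is a hypothetical addition. */
struct dev_attr_example {
	char *interface;      /* OS device name; absolute path if a filename */
	char *device_id;      /* meaning still to be clarified */
	char *device_version; /* vendor hardware version */
	char *vendor_id;      /* vendor hardware part ID */
	char *firmware;       /* vendor firmware version */
	char *driver;         /* hypothetical: driver name */
};

/* Hypothetical helper: returns true if the interface name is non-empty
 * and, when it looks like a filename (contains '/'), is absolute. */
static bool dev_attr_interface_ok(const struct dev_attr_example *attr)
{
	const char *name = attr->interface;

	if (name == NULL || name[0] == '\0')
		return false;
	if (strchr(name, '/') != NULL)
		return name[0] == '/';
	return true;
}
```

A provider author could run a check like this in a unit test; it is not meant as something the library would enforce.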

> 	struct bus_attr {
> 		char *domain_id; 
> 		char *bus_id;
> 		char *device_id;
> 		char *function_id;
> 	};

Should this be "struct pci_bus_attr", just to future-proof us a bit?  I.e., more types of buses may be available in the future.
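
One possible shape for that, going a step beyond a simple rename: tag the bus attributes with a type so that other bus kinds can be added later.  This is only a sketch; the enum, the union layout, and the bus_attr_format() helper are my assumptions, not something in the proposal.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bus type tag; only PCI is defined for now. */
enum bus_type_example {
	BUS_TYPE_UNKNOWN_EXAMPLE,
	BUS_TYPE_PCI_EXAMPLE,
};

/* Same four fields as the proposed bus_attr, under a PCI-specific name. */
struct pci_bus_attr_example {
	char *domain_id;
	char *bus_id;
	char *device_id;
	char *function_id;
};

struct bus_attr_example {
	enum bus_type_example type;
	union {
		struct pci_bus_attr_example pci;
		/* future bus types would be added here */
	} attr;
};

/* Hypothetical helper: formats a PCI address as domain:bus:device.function.
 * Returns the snprintf() result, or -1 if the bus is not PCI. */
static int bus_attr_format(const struct bus_attr_example *bus,
			   char *buf, size_t len)
{
	if (bus->type != BUS_TYPE_PCI_EXAMPLE)
		return -1;
	return snprintf(buf, len, "%s:%s:%s.%s",
			bus->attr.pci.domain_id, bus->attr.pci.bus_id,
			bus->attr.pci.device_id, bus->attr.pci.function_id);
}
```

The point of the tag is that an app which only understands PCI can skip entries of a type it does not recognize instead of misreading the fields.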

> 	struct link_attr {
> 		char *address;
> 		size_t mtu;
> 		enum fi_link_state state; /* up, down, unknown */
> 		char *protocol;
> 		size_t speed; /* bits per second */
> 	};

What do you envision in the protocol field -- is that the native, lowest-layer protocol that this link speaks?  Or is it the protocol that the provider speaks?  Or ...?

Should we indicate the transport layer here?  (Ethernet, IB, Omnipath, GNI, ...etc.)
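
If we did want to indicate the transport, one option would be an enum next to the free-form protocol string.  A rough sketch follows; the enum, its values, the extra "transport" field, and link_transport_str() are all hypothetical.  I left out the state field here only because enum fi_link_state is not defined in this snippet.

```c
#include <stddef.h>

/* Hypothetical transport identifiers, taken from the list above. */
enum link_transport_example {
	LINK_TRANSPORT_UNKNOWN_EXAMPLE,
	LINK_TRANSPORT_ETHERNET_EXAMPLE,
	LINK_TRANSPORT_IB_EXAMPLE,
	LINK_TRANSPORT_OMNIPATH_EXAMPLE,
	LINK_TRANSPORT_GNI_EXAMPLE,
};

/* Proposed link_attr fields (minus state) plus a hypothetical transport. */
struct link_attr_example {
	char *address;
	size_t mtu;
	char *protocol;
	size_t speed; /* bits per second */
	enum link_transport_example transport; /* hypothetical addition */
};

/* Hypothetical helper: printable name for a transport value. */
static const char *link_transport_str(enum link_transport_example t)
{
	switch (t) {
	case LINK_TRANSPORT_ETHERNET_EXAMPLE:
		return "Ethernet";
	case LINK_TRANSPORT_IB_EXAMPLE:
		return "InfiniBand";
	case LINK_TRANSPORT_OMNIPATH_EXAMPLE:
		return "Omni-Path";
	case LINK_TRANSPORT_GNI_EXAMPLE:
		return "GNI";
	default:
		return "unknown";
	}
}
```

An enum is easier for apps to compare than a string, at the cost of needing a new value each time a transport shows up; a string avoids that but brings back the commonality problem.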

> 	struct prov_attr {
> 		size_t size;
> 		char data[];
> 	};
> };
> 
> 
> The prov_attr structure can be cast to a provider specific structure (e.g. usnic_devinfo) if the app knows the structure format.  I would add a new public header for all definitions.  Additionally, fi_tostr() can be used to display the attributes.  fid_attr allows fi_tostr() to route the call to the provider to handle the prov_attr portion (done using new fi_control FI_TOSTR option).
> 
> I'm still debating whether fi_freeinfo() will free fid_attr, or if the app must call fi_close() separately.
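
For the prov_attr part, here is a sketch of how I imagine an app would consume it, assuming the app knows the provider's layout.  The struct my_devinfo_example type and its fields are invented stand-ins (not the real usnic structure), and prov_attr_get() is a hypothetical helper.  It checks the size and copies out of the data[] bytes rather than casting the pointer directly, which sidesteps any alignment question.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the proposed prov_attr. */
struct prov_attr_example {
	size_t size;
	char data[];
};

/* Invented provider-specific structure, for illustration only. */
struct my_devinfo_example {
	unsigned int num_vfs;
	unsigned int max_qps;
};

/* Hypothetical helper: copies the provider data into *out.
 * Returns 0 on success, -1 if the blob is too small to hold it. */
static int prov_attr_get(const struct prov_attr_example *prov,
			 struct my_devinfo_example *out)
{
	if (prov->size < sizeof(*out))
		return -1;
	memcpy(out, prov->data, sizeof(*out));
	return 0;
}
```

The size check is the part I care about: it gives the app a way to notice when the provider's structure does not match what it was compiled against.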
> 
> - Sean
> _______________________________________________
> ofiwg mailing list
> ofiwg at lists.openfabrics.org
> http://lists.openfabrics.org/mailman/listinfo/ofiwg


-- 
Jeff Squyres
jsquyres at cisco.com



