[ofiwg] getting hardware details from libfabric

Hefty, Sean sean.hefty at intel.com
Tue May 29 10:01:47 PDT 2018


> This is a great start.  More below.

If no one has disagreements, I'll start on a patch.  We can work through the fields from there.

> > On May 24, 2018, at 8:49 PM, Hefty, Sean <sean.hefty at intel.com>
> wrote:
> >
> > This is an initial proposal:
> >
> > The device attributes are per fi_info structure (which includes
> the network address, domain, and ep attributes).  The app does not
> need to open any resources to obtain the data.
> >
> >
> > When fi_getinfo() is called with FI_DEVICE_ATTR flag set, each
> fi_info::handle will reference a struct fid_attr, if valid.
> >
> >
> > // Definition of struct fid_attr.  Field types could change.
> > struct fid_attr {
> > 	struct fid;  /* fclass = FI_TYPE_DEVICE_ATTR */
> > 	struct dev_attr {
> > 		char *interface;
> > 		char *device_id;
> > 		char *device_version;
> > 		char *vendor_id;
> > 		char *firmware;
> > 	};
> 
> It would probably be good to have some guidance for what to put in
> these fields.  Nothing too restrictive, but something to help
> vendors/provider authors have some semblance of commonality between
> these fields (I speak as a vendor who got burned on this exact issue
> before ;-) ).
> 
> Here's a first cut:
> 
> Interface: OS device name (if this is a filename, it should be an
> absolute filename)
> Device ID: ...I'm not clear on how this is different than vendor ID?
> Device version: Vendor hardware version
> Vendor ID: Vendor hardware part ID
> Firmware: Vendor firmware version

I should have written down what I thought these could be before starting on a 3-day weekend...  Now I need to remember what these fields were based on.

> Should we add a driver name field?

Probably

> > 	struct bus_attr {
> > 		char *domain_id;
> > 		char *bus_id;
> > 		char *device_id;
> > 		char *function_id;
> > 	};
> 
> Should this be "struct pci_bus_attr", just to future-proof us a bit?
> I.e., more types of busses may be available in the future.

That can work.  These structures will need a little re-work if we want to handle different bus types.
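
A minimal sketch of one way that re-work could look, assuming a tagged union keyed by a bus type.  The names bus_type, BUS_TYPE_PCI, and bus_attr_str() are hypothetical and not part of the proposal, and the PCI fields are shown as integers rather than the char * of the draft purely to keep the example short:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; none of this is existing libfabric API. */
enum bus_type {
	BUS_TYPE_UNKNOWN,
	BUS_TYPE_PCI,
};

/* PCI-specific fields (integers here; the draft uses char *). */
struct pci_bus_attr {
	unsigned int domain_id;
	unsigned int bus_id;
	unsigned int device_id;
	unsigned int function_id;
};

/* Tagged union: bus_type says which member of attr is valid. */
struct bus_attr {
	enum bus_type bus_type;
	union {
		struct pci_bus_attr pci;
		/* other bus types would add members here */
	} attr;
};

/* Format the bus address into buf.  Returns the snprintf() result,
 * or -1 for a bus type this sketch does not know how to print. */
static int bus_attr_str(const struct bus_attr *bus, char *buf, size_t len)
{
	assert(bus && buf);
	switch (bus->bus_type) {
	case BUS_TYPE_PCI:
		return snprintf(buf, len, "%04x:%02x:%02x.%x",
				bus->attr.pci.domain_id, bus->attr.pci.bus_id,
				bus->attr.pci.device_id, bus->attr.pci.function_id);
	default:
		return -1;
	}
}
```

With domain 0, bus 0x3b, device 0, function 1, bus_attr_str() writes 0000:3b:00.1.  A new bus type would only add an enum value and a union member, leaving existing users of the PCI fields untouched.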

> > 	struct link_attr {
> > 		char *address;
> > 		size_t mtu;

I wonder if mtu would be better off as an enum.  ;)

> > 		enum fi_link_state state; /* up, down, unknown */
> > 		char *protocol;
> > 		size_t speed; /* bits per second */
> > 	};
> 
> What do you envision in the protocol field -- is that the native,
> lowest-layer protocol that this link speaks?  Or is it the protocol
> that the provider speaks?  Or ...?

Since this is the link_attr, the link level protocol.  :)

> Should we indicate the transport layer here?  (Ethernet, IB,
> Omnipath, GNI, ...etc.)

I don't even know what transport protocol means anymore, and I'm not entirely joking.  RoCE is IB transport over UDP, which is also a transport, over which we could layer the utility provider ofi_rxm, so the transport is... ?

libfabric already exposes a higher-level protocol field.
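
To make the link attributes above concrete, here is a standalone sketch of the proposed struct link_attr with a small formatting helper.  The enumerator names, the helper functions, and the sample values are assumptions for illustration only; the draft says just "up, down, unknown", and the field types could still change:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Enumerator names are assumed; the draft only lists the three states. */
enum fi_link_state {
	FI_LINK_UNKNOWN,
	FI_LINK_DOWN,
	FI_LINK_UP,
};

/* Mirrors the fields of the proposed link_attr. */
struct link_attr {
	char *address;
	size_t mtu;
	enum fi_link_state state;
	char *protocol;		/* link-level protocol */
	size_t speed;		/* bits per second */
};

/* Hypothetical helper: map a link state to a printable string. */
static const char *link_state_str(enum fi_link_state state)
{
	switch (state) {
	case FI_LINK_UP:
		return "up";
	case FI_LINK_DOWN:
		return "down";
	default:
		return "unknown";
	}
}

/* Hypothetical helper: one-line summary of a link.
 * Returns the snprintf() result. */
static int link_attr_str(const struct link_attr *link, char *buf, size_t len)
{
	assert(link && buf);
	return snprintf(buf, len, "%s %s %s mtu %zu speed %zu",
			link->address, link->protocol,
			link_state_str(link->state), link->mtu, link->speed);
}
```

One thing the sketch makes visible: with speed as a size_t in bits per second, a 32-bit build cannot represent links faster than about 4.29 Gb/s, so a fixed 64-bit type may be worth considering when the field types are settled.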





