[ofiwg] getting hardware details from libfabric

Weiny, Ira ira.weiny at intel.com
Tue May 29 09:48:33 PDT 2018


> > On May 24, 2018, at 8:49 PM, Hefty, Sean <sean.hefty at intel.com> wrote:
> >
> > This is an initial proposal:
> >
> > The device attributes are per fi_info structure (which includes the network
> > address, domain, and ep attributes).  The app does not need to open any
> > resources to obtain the data.
> >
> >
> > When fi_getinfo() is called with FI_DEVICE_ATTR flag set, each
> > fi_info::handle will reference a struct fid_attr, if valid.
> >
> >
> > // Definition of struct fid_attr.  Field types could change.
> > struct fid_attr {
> > 	struct fid;  /* fclass = FI_TYPE_DEVICE_ATTR */
> > 	struct dev_attr {
> > 		char *interface;
> > 		char *device_id;
> > 		char *device_version;
> > 		char *vendor_id;
> > 		char *firmware;
> > 	};
> 
> It would probably be good to have some guidance for what to put in these
> fields.  Nothing too restrictive, but something to help vendors/provider
> authors have some semblance of commonality between these fields (I speak
> as a vendor who got burned on this exact issue before ;-) ).
> 
> Here's a first cut:
> 
> Interface: OS device name (if this is a filename, it should be an absolute
> filename)
> Device ID: ...I'm not clear on how this is different than vendor ID?
> Device version: Vendor hardware version
> Vendor ID: Vendor hardware part ID
> Firmware: Vendor firmware version
> 
> Should we add a driver name field?
> 
> > 	struct bus_attr {
> > 		char *domain_id;
> > 		char *bus_id;
> > 		char *device_id;
> > 		char *function_id;
> > 	};
> 
> Should this be "struct pci_bus_attr", just to future-proof us a bit?  I.e., more
> types of busses may be available in the future.

Should this be a "void *" to a generic device structure which could then be "struct pci_bus_attr" or some other future bus structure?
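
To make the "void *" idea concrete, here is a rough sketch of a tagged, generic bus attribute.  Every name in it (bus_type, BUS_TYPE_PCI, pci_bus_attr, bus_attr_tostr) is hypothetical and not part of libfabric, and I have assumed integer PCI fields rather than the char * fields in the proposal, purely to keep the example short:

```c
#include <stdio.h>

/* Hypothetical names throughout; none of these exist in libfabric. */
enum bus_type {
	BUS_TYPE_UNKNOWN = 0,
	BUS_TYPE_PCI,
};

/* Assumption: integer IDs instead of the char * fields in the proposal. */
struct pci_bus_attr {
	unsigned int domain_id;
	unsigned int bus_id;
	unsigned int device_id;
	unsigned int function_id;
};

struct bus_attr {
	enum bus_type type;	/* tells the app how to interpret 'attr' */
	void *attr;		/* struct pci_bus_attr * when type == BUS_TYPE_PCI */
};

/*
 * Format the bus address into buf.
 * Returns 0 on success, -1 if the bus type is not understood.
 */
static int bus_attr_tostr(const struct bus_attr *bus, char *buf, size_t len)
{
	const struct pci_bus_attr *pci;

	if (!bus || !bus->attr)
		return -1;

	switch (bus->type) {
	case BUS_TYPE_PCI:
		pci = bus->attr;
		snprintf(buf, len, "%04x:%02x:%02x.%x", pci->domain_id,
			 pci->bus_id, pci->device_id, pci->function_id);
		return 0;
	default:
		return -1;	/* a future bus type this code does not know */
	}
}
```

The type tag is what lets an app skip a bus structure it does not recognize; a bare "void *" with no tag would not allow that.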

> 
> > 	struct link_attr {
> > 		char *address;
> > 		size_t mtu;
> > 		enum fi_link_state state; /* up, down, unknown */
> > 		char *protocol;
> > 		size_t speed; /* bits per second */
> > 	};
> 
> What do you envision in the protocol field -- is that the native, lowest-layer
> protocol that this link speaks?  Or is it the protocol that the provider speaks?
> Or ...?

This caught my eye on Friday but it was Beer-thirty.  ;-)

I'm also not sure "protocol" is sufficient.  Many devices support multiple protocols: OmniPath == verbs _and_ PSM, RoCE == sockets (TCP, UDP, HTTP, SSL?) _and_ RoCE, usNIC == sockets _and_ libfabric... ?  (What is the underlying "protocol" here?  Does it matter?)

I just end up asking myself "what is a protocol"?  There are so many definitions, and depending on the level you are discussing, it gets pretty confusing.

Should we have a list here?  Or do you get multiple fid_attr for each "protocol"?

What about listing APIs supported?  Verbs, Sockets, whatever GNI, BlueGene, usNIC are?
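
One possible shape for that, as a sketch only: keep a single transport string and report the supported APIs as a bitmask.  The names below (link_api, LINK_API_*, link_attr_sketch, link_supports) are made up for illustration and are not in the proposal or in libfabric:

```c
#include <stdint.h>

/* Hypothetical flags; not part of libfabric or Sean's proposal. */
enum link_api {
	LINK_API_VERBS   = 1 << 0,
	LINK_API_SOCKETS = 1 << 1,
	LINK_API_PSM     = 1 << 2,
};

struct link_attr_sketch {
	const char *transport;	/* link layer, e.g. "Ethernet", "InfiniBand" */
	uint64_t apis;		/* OR of enum link_api values */
};

/* Returns nonzero only if every API bit in 'api' is set on the link. */
static int link_supports(const struct link_attr_sketch *link, uint64_t api)
{
	return (link->apis & api) == api;
}
```

With this, the OmniPath case above would be a single entry with LINK_API_VERBS | LINK_API_PSM set, rather than one fid_attr per "protocol".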

Jeff, does usNIC use the sockets interface?

> 
> Should we indicate the transport layer here?  (Ethernet, IB, Omnipath, GNI,
> ...etc.)

Probably should.  But perhaps that was more what Sean had in mind for "protocol"?

Ira

> 
> > 	struct prov_attr {
> > 		size_t size;
> > 		char data[];
> > 	};
> > };
> >
> >
> > The prov_attr structure can be cast to a provider specific structure (e.g.
> > usnic_devinfo) if the app knows the structure format.  I would add a new
> > public header for all definitions.  Additionally, fi_tostr() can be used to display
> > the attributes.  fid_attr allows fi_tostr() to route the call to the provider to
> > handle the prov_attr portion (done using new fi_control FI_TOSTR option).
> >
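
A minimal sketch of how an app might consume prov_attr, assuming it checks the size before interpreting the data.  struct example_devinfo and get_example_devinfo() are hypothetical stand-ins (this is not the real usnic_devinfo layout), and copying with memcpy() instead of casting is my own choice, to avoid assuming anything about the alignment of data[]:

```c
#include <stddef.h>
#include <string.h>

/* Mirrors the prov_attr layout from the proposal. */
struct prov_attr {
	size_t size;	/* number of valid bytes in data[] */
	char data[];
};

/* Hypothetical provider-specific layout, for illustration only. */
struct example_devinfo {
	unsigned int version;
	unsigned int num_queues;
};

/*
 * Copy the provider data out if there is enough of it.
 * Returns 0 on success, -1 if attr is NULL or too small.
 */
static int get_example_devinfo(const struct prov_attr *attr,
			       struct example_devinfo *out)
{
	if (!attr || attr->size < sizeof(*out))
		return -1;

	/* memcpy avoids alignment assumptions about data[] */
	memcpy(out, attr->data, sizeof(*out));
	return 0;
}
```

The size check is the only protection the app has here, since nothing in prov_attr itself identifies which provider structure it carries.
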
> > I'm still debating whether fi_freeinfo() will free fid_attr, or if the app must
> > call fi_close() separately.
> >
> > - Sean
> > _______________________________________________
> > ofiwg mailing list
> > ofiwg at lists.openfabrics.org
> > http://lists.openfabrics.org/mailman/listinfo/ofiwg
> 
> 
> --
> Jeff Squyres
> jsquyres at cisco.com
> 

