[ofiwg] the send credit thread that will not die

Mon Oct 20 13:43:34 PDT 2014

> - sockets: # of bytes in sockbuf
> - Infiniband: # of pending operations == length of queue
> - usNIC: # of pending SGEs == length of queue
> - meta-endpoint, constructed from multiple tx/rx contexts: ??
> 
> Others with same or different characteristics?

Conceptually, the length of the queue may also be # of pending operations + # of pending SGEs.  From examining the code, I think this actually applies to usNIC.

> For usNIC, Infiniband, and possibly others a fair question is "why would
> you ever want to allocate anything less than maximum queue size?"  Host
> memory constraints is all I can really think of, so from app writer
> perspective maybe total host memory consumed is all you really care about.
> (which is perhaps what you were saying when you started talking about
> "bytes" rather than "credits"?).

Performance can suffer if the maximum size is allocated but goes unused.  The size can affect, for example, the number of operations that fit into a single cache line or page.  I've seen this on existing HCAs, where increasing the inline data size greatly reduced the bandwidth.
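
To make the cache line / page point concrete, here is a rough sketch.  The header size and alignment below are made-up numbers for a hypothetical work queue entry, not taken from any real HCA or provider; the only point is that growing the inline area shrinks the number of entries per page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical work queue entry layout (assumed values, for illustration only). */
#define WQE_HDR_BYTES 32u   /* assumed fixed header per entry */
#define WQE_ALIGN     64u   /* assumed entry alignment, one cache line */

/* Bytes consumed by one entry: header + inline area, rounded up to WQE_ALIGN. */
size_t wqe_bytes(size_t inline_bytes)
{
    size_t raw = WQE_HDR_BYTES + inline_bytes;
    return (raw + WQE_ALIGN - 1) / WQE_ALIGN * WQE_ALIGN;
}

/* Number of whole entries that fit in one page of page_bytes. */
size_t wqes_per_page(size_t page_bytes, size_t inline_bytes)
{
    assert(page_bytes >= WQE_ALIGN);
    return page_bytes / wqe_bytes(inline_bytes);
}
```

Under these assumptions, a 4096-byte page holds 64 entries with 32 bytes of inline data, but only 8 entries with 480 bytes of inline data, whether or not the application ever uses the larger inline area.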

> I could see "size" being the total number of size-dependent bytes needed
> to support a given set of EP characteristics (e.g. for usnic it's queue len +
> some ancillary structures per queue entry, but not the one-time QP struct
> needed by every endpoint, regardless of size).  The provider would report
> this "size" for its default (likely maximal, but not necessarily) EP
> configuration, which the app could then adjust if memory pressure becomes
> an issue.  Given a different requested size, the provider would make
> adjustments to queue configuration to optimally make use of the available
> memory (by adjusting queue depth or per-entry copy buffer size or whatever
> is best for that provider)
> 
> In the default case, the app writer should really not have to care about EP
> size in terms of queue entries or SGEs.  By rolling these into a total
> "size" in bytes, sockbuf sizes and QP lengths can be made to smell about
> the same to the user.

This is what the current API attempts.  Size is just some amount of resources that a transmit or receive context consumes.  Unfortunately, the size _may_ eventually have to be divided among competing variables: many outstanding operations, a lot of SGEs, or a larger inject/immediate size.
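
A minimal sketch of that trade-off, assuming a provider that carves a single byte budget into fixed-size entries.  The struct, function names, and per-entry costs are hypothetical and are not part of the API or of any provider:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-entry costs, for illustration only. */
#define OP_HDR_BYTES 16u   /* assumed bookkeeping per operation */
#define SGE_BYTES    16u   /* assumed cost of one SGE slot */

/* Hypothetical description of what each queue entry must hold. */
struct tx_shape {
    size_t sge_per_op;     /* SGE slots reserved per operation */
    size_t inject_bytes;   /* inject/immediate area reserved per operation */
};

/* Bytes one entry consumes under the assumed costs above. */
size_t tx_entry_bytes(struct tx_shape s)
{
    return OP_HDR_BYTES + s.sge_per_op * SGE_BYTES + s.inject_bytes;
}

/* Outstanding operations that fit in a context of size_bytes. */
size_t tx_depth_for_size(size_t size_bytes, struct tx_shape s)
{
    size_t per_entry = tx_entry_bytes(s);
    assert(per_entry > 0);
    return size_bytes / per_entry;
}
```

With a 4096-byte size, these assumed costs give 128 outstanding operations at 1 SGE and no inject area, 51 at 4 SGEs, and 32 at 1 SGE plus a 96-byte inject area.  The same "size" buys very different queue depths depending on which variable the app favors.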

For the CQ, I went with a simple count, since from the perspective of the API, the size of each entry is known.