[Ofvwg] OFVWG meeting notes (11/10/2015)

Liran Liss liranl at mellanox.com
Tue Nov 10 12:41:52 PST 2015


Matan Barak presented an update on the latest work on time stamps, RoCE address management, and RoCEv2 support.

The time stamp patches allow the HCA to report to the consumer a high-precision time stamp for any data path operation, such as sends and receives, RDMA reads and writes, atomics, and Raw Ethernet packet delivery.

For reliable transports, where a message may span multiple packets, the time stamp is taken as follows:
On the sender's side, the time stamp is generated upon receiving the acknowledgement that the whole message was delivered, possibly after retransmissions.
On the receiver's side, the time stamp is generated when the whole message has been received.
For unreliable, single-packet messages, such as in UD or Raw Ethernet, the time stamp is taken upon sending or receiving the packet.

The kernel patches were accepted. User-space support was recently submitted for review.
The user-space API allows time stamping to be enabled on a given CQ.

To allow polling different kinds of time stamps while keeping the CQE small, a new framework was introduced that lets the application determine which information is returned by completions.
This is a generic mechanism, which may apply to any piece of information returned in the CQE and not just to timestamps.
This way, the application may get only the fields that it is interested in while maintaining a minimal memory footprint.
The CQE data is ordered according to field sizes. Multiple extended CQEs can be polled at once; each CQE is 64-bit aligned.
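
The field-selection idea can be illustrated with a small sketch. The flag names, field sizes, and the choice to place the largest fields first are assumptions made for illustration only; they are not the proposed verbs API. The sketch just shows how a mask of requested fields could map to a compact layout whose total size is rounded up to 64 bits.

```python
# Hypothetical completion-field flags and sizes (in bytes); not the real API.
WC_TIMESTAMP = 1 << 0
WC_BYTE_LEN = 1 << 1
WC_QP_NUM = 1 << 2
WC_SLID = 1 << 3
WC_SL = 1 << 4

FIELD_SIZES = {
    WC_TIMESTAMP: 8,
    WC_BYTE_LEN: 4,
    WC_QP_NUM: 4,
    WC_SLID: 2,
    WC_SL: 1,
}


def cqe_layout(mask):
    """Return ({flag: byte offset}, total size) for the requested fields.

    Assumption: fields are laid out largest first, which keeps every
    power-of-two-sized field naturally aligned without internal padding.
    """
    selected = [flag for flag in FIELD_SIZES if mask & flag]
    # The sort is stable, so equally sized fields keep their declaration order.
    selected.sort(key=FIELD_SIZES.get, reverse=True)

    offsets = {}
    offset = 0
    for flag in selected:
        offsets[flag] = offset
        offset += FIELD_SIZES[flag]

    # Round the entry up to a multiple of 8 bytes (64-bit alignment).
    total = (offset + 7) & ~7
    return offsets, total
```

With this sketch, an application asking only for the time stamp, QP number, and SL would get a 16-byte entry, while one asking for all five fields would get 24 bytes.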

Currently, time stamps are returned as a sample of a free-running counter in the HCA.
The current value of the HCA counter is returned by a new Verb, which may be used in the future to return other values as well, such as temperature readings.
Adding a conversion Verb to correlate a counter reading with system time is also being considered.
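
As a rough illustration of what such a conversion involves, the sketch below maps a counter sample to system time. It assumes the counter ticks at a fixed, known frequency given in kHz and wraps at a known bit width (48 bits here is an arbitrary choice), and that the application has one reference pair of counter value and system time taken at the same moment. All function and parameter names are hypothetical.

```python
def cycles_to_ns(cycles, clock_khz):
    """Convert a cycle count to nanoseconds for a clock running at clock_khz."""
    # cycles / (clock_khz * 1000 Hz) seconds == cycles * 1e6 / clock_khz ns
    return cycles * 1_000_000 // clock_khz


def elapsed_cycles(start, end, counter_bits=48):
    """Cycles elapsed from start to end, tolerating one counter wrap-around."""
    mask = (1 << counter_bits) - 1
    return (end - start) & mask


def to_system_time_ns(sample, ref_counter, ref_system_ns, clock_khz,
                      counter_bits=48):
    """Map a counter sample to system time using one reference pair.

    ref_counter and ref_system_ns are assumed to have been read together,
    before the sample was taken.
    """
    delta = elapsed_cycles(ref_counter, sample, counter_bits)
    return ref_system_ns + cycles_to_ns(delta, clock_khz)
```

A single reference pair ignores clock drift between the HCA and the host; a real conversion would likely refresh the reference periodically.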

Time stamping looks like a useful feature. One use case is financial applications, which need to keep track of network transactions.
Another use case is HPC. Here, time stamps may be used by the communication libraries to evaluate the conditions of different network paths by tracking their latency.
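
A minimal sketch of the HPC use case follows. It assumes a library records, per path, a counter reading taken when a message is posted and the completion time stamp reported for it, both from the same HCA counter, and smooths the difference with an exponentially weighted moving average. The class, its methods, and the smoothing factor are hypothetical and only illustrate the idea.

```python
class PathLatencyTracker:
    """Keep a smoothed latency estimate per network path (hypothetical helper)."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha    # weight given to the newest sample
        self.estimate = {}    # path id -> smoothed latency in counter ticks

    def record(self, path, post_ts, completion_ts):
        """Fold one (post, completion) time stamp pair into the path estimate."""
        sample = completion_ts - post_ts
        prev = self.estimate.get(path)
        if prev is None:
            self.estimate[path] = sample
        else:
            self.estimate[path] = prev + self.alpha * (sample - prev)

    def best_path(self):
        """Return the path with the lowest smoothed latency seen so far."""
        return min(self.estimate, key=self.estimate.get)
```

Since sender-side time stamps on reliable transports are taken when the acknowledgement arrives, the samples here would reflect round-trip rather than one-way latency.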


The RoCEv2 support is provided by 3 patch-sets.
The first patch-set moves RoCE address management to the IB core (provider-independent) GID cache and adds a reference to the network interface in each GID entry.
The second patch-set makes use of the new GID cache in the IB stack. The interface's L2 attributes can be derived directly from the GID instead of being passed again in the QP attributes.

These 2 patch-sets were accepted upstream.
In addition to simplifying RoCE address management, these patches make the task of extending the recent work on RDMA container support to RoCE relatively straightforward.
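
The relationship between a GID entry and its network interface can be sketched as follows. This is a simplified model, not the kernel data structures: the class names, the fields, and the choice of source MAC and VLAN ID as the derived L2 attributes are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NetDev:
    """Stand-in for a network interface (hypothetical fields)."""
    name: str
    mac: str
    vlan_id: Optional[int]


@dataclass(frozen=True)
class GidEntry:
    """A GID together with a reference to the interface it belongs to."""
    gid: str
    netdev: NetDev


class GidCache:
    """Provider-independent table of GID entries, keyed by (port, index)."""

    def __init__(self):
        self._table = {}

    def add(self, port, index, gid, netdev):
        self._table[(port, index)] = GidEntry(gid, netdev)

    def l2_attrs(self, port, index):
        """Derive L2 attributes from the GID entry's interface reference."""
        entry = self._table[(port, index)]
        return {"smac": entry.netdev.mac, "vlan_id": entry.netdev.vlan_id}
```

Because each entry carries its interface reference, a caller that names a GID index does not need to supply the L2 attributes separately.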

The third patch-set adds a GID type attribute to each GID entry, and is still under review.

--Liran
