[ofiwg] DS/DA Agenda for Tuesday, 6/21/16

Dave Goodell (dgoodell) dgoodell at cisco.com
Tue Jun 21 08:36:48 PDT 2016


On Jun 21, 2016, at 2:57 AM, Paul Grun <grun at cray.com> wrote:
> 
> Continue review of Intel’s proposed extensions to libfabric for RDMA.
> Attached is an update of the document published by Chet Douglas incorporating comments from our last meeting.

The specific libfabric ordering details we were discussing on today's call are documented in the fi_endpoint(3) man page: https://ofiwg.github.io/libfabric/v1.3.0/man/fi_endpoint.3.html

Specifically, here are some relevant excerpts:

----8<----
Max RMA Ordered Size

The maximum ordered size specifies the delivery order of transport data into target memory for RMA and atomic operations. Data ordering is separate from, but dependent on, message ordering (defined below). Data ordering is unspecified where message order is not defined.

Data ordering refers to the access of target memory by subsequent operations. When back to back RMA read or write operations access the same registered memory location, data ordering indicates whether the second operation reads or writes the target memory after the first operation has completed. Because RMA ordering applies between two operations, and not within a single data transfer, ordering is defined per byte-addressable memory location. I.e. ordering specifies whether location X is accessed by the second operation after the first operation. Nothing is implied about the completion of the first operation before the second operation is initiated.

In order to support large data transfers being broken into multiple packets and sent using multiple paths through the fabric, data ordering may be limited to transfers of a specific size or less. Providers specify when data ordering is maintained through the following values. Note that even if data ordering is not maintained, message ordering may be.

max_order_raw_size
Read after write size. If set, an RMA or atomic read operation issued after an RMA or atomic write operation, both of which are smaller than the size, will be ordered. Where the target memory locations overlap, the RMA or atomic read operation will see the results of the previous RMA or atomic write.
max_order_war_size
Write after read size. If set, an RMA or atomic write operation issued after an RMA or atomic read operation, both of which are smaller than the size, will be ordered. The RMA or atomic read operation will see the initial value of the target memory location before a subsequent RMA or atomic write updates the value.
max_order_waw_size
Write after write size. If set, an RMA or atomic write operation issued after an RMA or atomic write operation, both of which are smaller than the size, will be ordered. The target memory location will reflect the results of the second RMA or atomic write.

An order size value of 0 indicates that ordering is not guaranteed. A value of -1 guarantees ordering for any data size.

[...]

msg_order - Message Ordering

Message ordering refers to the order in which transport layer headers (as viewed by the application) are processed. Relaxed message order enables data transfers to be sent and received out of order, which may improve performance by utilizing multiple paths through the fabric from the initiating endpoint to a target endpoint. Message order applies only between a single source and destination endpoint pair. Ordering between different target endpoints is not defined.

Message order is determined using a set of ordering bits. Each set bit indicates that ordering is maintained between data transfers of the specified type. Message order is defined for [read | write | send] operations submitted by an application after [read | write | send] operations.

Message ordering only applies to the end to end transmission of transport headers. Message ordering is necessary for, but does not guarantee, the order in which message data is sent or received by the transport layer. Message ordering requires matching ordering semantics on the receiving side of a data transfer operation in order to guarantee that ordering is met.

FI_ORDER_NONE
No ordering is specified. This value may be used as input in order to obtain the default message order supported by the provider.
FI_ORDER_RAR
Read after read. If set, RMA and atomic read operations are transmitted in the order submitted relative to other RMA and atomic read operations. If not set, RMA and atomic reads may be transmitted out of order from their submission.
FI_ORDER_RAW
Read after write. If set, RMA and atomic read operations are transmitted in the order submitted relative to RMA and atomic write operations. If not set, RMA and atomic reads may be transmitted ahead of RMA and atomic writes.
FI_ORDER_RAS
Read after send. If set, RMA and atomic read operations are transmitted in the order submitted relative to message send operations, including tagged sends. If not set, RMA and atomic reads may be transmitted ahead of sends.
FI_ORDER_WAR
Write after read. If set, RMA and atomic write operations are transmitted in the order submitted relative to RMA and atomic read operations. If not set, RMA and atomic writes may be transmitted ahead of RMA and atomic reads.
FI_ORDER_WAW
Write after write. If set, RMA and atomic write operations are transmitted in the order submitted relative to other RMA and atomic write operations. If not set, RMA and atomic writes may be transmitted out of order from their submission.
FI_ORDER_WAS
Write after send. If set, RMA and atomic write operations are transmitted in the order submitted relative to message send operations, including tagged sends. If not set, RMA and atomic writes may be transmitted ahead of sends.
FI_ORDER_SAR
Send after read. If set, message send operations, including tagged sends, are transmitted in the order submitted relative to RMA and atomic read operations. If not set, message sends may be transmitted ahead of RMA and atomic reads.
FI_ORDER_SAW
Send after write. If set, message send operations, including tagged sends, are transmitted in the order submitted relative to RMA and atomic write operations. If not set, message sends may be transmitted ahead of RMA and atomic writes.
FI_ORDER_SAS
Send after send. If set, message send operations, including tagged sends, are transmitted in the order submitted relative to other message sends. If not set, message sends may be transmitted out of order from their submission.
----8<----
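
As a rough illustration of the msg_order bits quoted above (this is not from the man page), here is a small Python model of how a pair of operations maps to one ordering bit. The Order class, its bit values, and transmitted_in_order() are hypothetical names made up for this sketch. They only mirror the FI_ORDER_* naming and are not the libfabric API or its real constant values.

```python
from enum import IntFlag


class Order(IntFlag):
    """Illustrative stand-ins for the FI_ORDER_* bits (values are NOT libfabric's)."""
    NONE = 0
    RAR = 1 << 0  # read after read
    RAW = 1 << 1  # read after write
    RAS = 1 << 2  # read after send
    WAR = 1 << 3  # write after read
    WAW = 1 << 4  # write after write
    WAS = 1 << 5  # write after send
    SAR = 1 << 6  # send after read
    SAW = 1 << 7  # send after write
    SAS = 1 << 8  # send after send


# Keyed as (later operation, earlier operation), matching the "X after Y" names.
_PAIR_BIT = {
    ("read", "read"): Order.RAR,
    ("read", "write"): Order.RAW,
    ("read", "send"): Order.RAS,
    ("write", "read"): Order.WAR,
    ("write", "write"): Order.WAW,
    ("write", "send"): Order.WAS,
    ("send", "read"): Order.SAR,
    ("send", "write"): Order.SAW,
    ("send", "send"): Order.SAS,
}


def transmitted_in_order(msg_order, earlier, later):
    """Return True if `later` is transmitted in submission order relative to `earlier`.

    This models only the per-pair bit lookup described in the excerpt. It says
    nothing about data placement in target memory.
    """
    return bool(msg_order & _PAIR_BIT[(later, earlier)])


# An endpoint that orders writes after writes and sends after writes, nothing else.
msg_order = Order.WAW | Order.SAW
print(transmitted_in_order(msg_order, earlier="write", later="send"))
print(transmitted_in_order(msg_order, earlier="send", later="write"))
```

With only the WAW and SAW bits set, a send submitted after a write keeps its place behind that write, but a write submitted after a send may still be transmitted ahead of it, since that direction is governed by the separate WAS bit.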

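To make the max_order_*_size rule from the excerpt concrete (again, not from the man page), here is a minimal Python sketch of the size check. The function name data_ordered() and the msg_ordered parameter are hypothetical. The sketch also assumes the size limit is inclusive, following the "a specific size or less" wording, even though the per-field text says "smaller than the size".

```python
ANY_SIZE = -1  # per the excerpt: -1 guarantees ordering for any data size
NO_ORDER = 0   # per the excerpt: 0 means ordering is not guaranteed


def data_ordered(max_order_size, first_len, second_len, msg_ordered=True):
    """Model whether two back-to-back operations get data ordering.

    max_order_size stands in for one of max_order_raw_size,
    max_order_war_size, or max_order_waw_size. msg_ordered says whether the
    matching message-order bit is set for this pair of operations.
    """
    if not msg_ordered:
        # Data ordering is unspecified where message order is not defined.
        return False
    if max_order_size == NO_ORDER:
        return False
    if max_order_size == ANY_SIZE:
        return True
    # Assumption: the limit is inclusive ("a specific size or less").
    return first_len <= max_order_size and second_len <= max_order_size


print(data_ordered(4096, first_len=4096, second_len=64))
print(data_ordered(4096, first_len=8192, second_len=64))
print(data_ordered(ANY_SIZE, 8, 8, msg_ordered=False))
```

The last call shows the dependency between the two kinds of ordering: even with an unlimited order size, this model reports no data ordering when the corresponding message order is not defined.
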
-Dave


