[ofiwg] DS/DA Agenda for Tuesday, 6/21/16
Douglas, Chet R
chet.r.douglas at intel.com
Tue Jun 21 09:29:58 PDT 2016
Thanks! I will make sure to review this and update the proposal to utilize these mechanisms. The intent was that the proposed extensions would have as little impact on the current libfabric API as possible. I will send out a new document AFTER our next review in two weeks. For now, the version that Paul sent out earlier today is the current working version.
I encourage everyone to read section 5 ahead of the next meeting. This section covers the most important open issues and I want to make sure we talk about them in detail.
-----Original Message-----
From: ofiwg [mailto:ofiwg-bounces at lists.openfabrics.org] On Behalf Of Dave Goodell (dgoodell)
Sent: Tuesday, June 21, 2016 8:37 AM
To: Paul Grun <grun at cray.com>
Cc: ofiwg at lists.openfabrics.org
Subject: Re: [ofiwg] DS/DA Agenda for Tuesday, 6/21/16
On Jun 21, 2016, at 2:57 AM, Paul Grun <grun at cray.com> wrote:
>
> Continue review of Intel’s proposed extensions to libfabric for RDMA.
> Attached is an update of the document published by Chet Douglas incorporating comments from our last meeting.
On today's call, the specific libfabric ordering details we were discussing are documented in the fi_endpoint(3) man page: https://ofiwg.github.io/libfabric/v1.3.0/man/fi_endpoint.3.html
Specifically, here are some relevant excerpts:
----8<----
Max RMA Ordered Size
The maximum ordered size specifies the delivery order of transport data into target memory for RMA and atomic operations. Data ordering is separate, but dependent on message ordering (defined below). Data ordering is unspecified where message order is not defined.
Data ordering refers to the access of target memory by subsequent operations. When back to back RMA read or write operations access the same registered memory location, data ordering indicates whether the second operation reads or writes the target memory after the first operation has completed. Because RMA ordering applies between two operations, and not within a single data transfer, ordering is defined per byte-addressable memory location. I.e. ordering specifies whether location X is accessed by the second operation after the first operation. Nothing is implied about the completion of the first operation before the second operation is initiated.
In order to support large data transfers being broken into multiple packets and sent using multiple paths through the fabric, data ordering may be limited to transfers of a specific size or less. Providers specify when data ordering is maintained through the following values. Note that even if data ordering is not maintained, message ordering may be.
max_order_raw_size
Read after write size. If set, an RMA or atomic read operation issued after an RMA or atomic write operation, both of which are smaller than the size, will be ordered. Where the target memory locations overlap, the RMA or atomic read operation will see the results of the previous RMA or atomic write.
max_order_war_size
Write after read size. If set, an RMA or atomic write operation issued after an RMA or atomic read operation, both of which are smaller than the size, will be ordered. The RMA or atomic read operation will see the initial value of the target memory location before a subsequent RMA or atomic write updates the value.
max_order_waw_size
Write after write size. If set, an RMA or atomic write operation issued after an RMA or atomic write operation, both of which are smaller than the size, will be ordered. The target memory location will reflect the results of the second RMA or atomic write.
An order size value of 0 indicates that ordering is not guaranteed. A value of -1 guarantees ordering for any data size.
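One way an application might interpret these three limits is sketched below. This is only an illustration under stated assumptions: `struct order_limits` is a hypothetical stand-in for wherever the provider reports the values, the helper names are invented here, and "smaller than the size" is read as an inclusive bound (matching "a specific size or less" above). It also covers data ordering only; per the text above, the result is meaningful only where the corresponding message ordering is defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the three limits described above.
 * It is not a libfabric type; it only mirrors the field names. */
struct order_limits {
    size_t max_order_raw_size; /* read after write  */
    size_t max_order_war_size; /* write after read  */
    size_t max_order_waw_size; /* write after write */
};

/* True when two back-to-back operations of the given lengths both fall
 * within the limit. 0 means ordering is not guaranteed; (size_t)-1 means
 * ordering is guaranteed for any size.
 * Assumption: the bound is inclusive ("a specific size or less"). */
static bool within_order_limit(size_t limit, size_t first_len, size_t second_len)
{
    if (limit == 0)
        return false;
    if (limit == (size_t)-1)
        return true;
    return first_len <= limit && second_len <= limit;
}

/* Illustrative helper: may a read issued after a write to overlapping
 * target memory expect to observe the written data? */
static bool read_sees_prior_write(const struct order_limits *lim,
                                  size_t write_len, size_t read_len)
{
    assert(lim != NULL);
    return within_order_limit(lim->max_order_raw_size, write_len, read_len);
}
```

With a read-after-write limit of 256, for example, a 64-byte write followed by a 256-byte read falls within the limit, while a 257-byte read does not.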
[...]
msg_order - Message Ordering
Message ordering refers to the order in which transport layer headers (as viewed by the application) are processed. Relaxed message order enables data transfers to be sent and received out of order, which may improve performance by utilizing multiple paths through the fabric from the initiating endpoint to a target endpoint. Message order applies only between a single source and destination endpoint pair. Ordering between different target endpoints is not defined.
Message order is determined using a set of ordering bits. Each set bit indicates that ordering is maintained between data transfers of the specified type. Message order is defined for [read | write | send] operations submitted by an application after [read | write | send] operations.
Message ordering only applies to the end to end transmission of transport headers. Message ordering is necessary for, but does not guarantee, the order in which message data is sent or received by the transport layer. Message ordering requires matching ordering semantics on the receiving side of a data transfer operation in order to guarantee that ordering is met.
FI_ORDER_NONE
No ordering is specified. This value may be used as input in order to obtain the default message order supported by the provider.
FI_ORDER_RAR
Read after read. If set, RMA and atomic read operations are transmitted in the order submitted relative to other RMA and atomic read operations. If not set, RMA and atomic reads may be transmitted out of order from their submission.
FI_ORDER_RAW
Read after write. If set, RMA and atomic read operations are transmitted in the order submitted relative to RMA and atomic write operations. If not set, RMA and atomic reads may be transmitted ahead of RMA and atomic writes.
FI_ORDER_RAS
Read after send. If set, RMA and atomic read operations are transmitted in the order submitted relative to message send operations, including tagged sends. If not set, RMA and atomic reads may be transmitted ahead of sends.
FI_ORDER_WAR
Write after read. If set, RMA and atomic write operations are transmitted in the order submitted relative to RMA and atomic read operations. If not set, RMA and atomic writes may be transmitted ahead of RMA and atomic reads.
FI_ORDER_WAW
Write after write. If set, RMA and atomic write operations are transmitted in the order submitted relative to other RMA and atomic write operations. If not set, RMA and atomic writes may be transmitted out of order from their submission.
FI_ORDER_WAS
Write after send. If set, RMA and atomic write operations are transmitted in the order submitted relative to message send operations, including tagged sends. If not set, RMA and atomic writes may be transmitted ahead of sends.
FI_ORDER_SAR
Send after read. If set, message send operations, including tagged sends, are transmitted in order submitted relative to RMA and atomic read operations. If not set, message sends may be transmitted ahead of RMA and atomic reads.
FI_ORDER_SAW
Send after write. If set, message send operations, including tagged sends, are transmitted in order submitted relative to RMA and atomic write operations. If not set, message sends may be transmitted ahead of RMA and atomic writes.
FI_ORDER_SAS
Send after send. If set, message send operations, including tagged sends, are transmitted in the order submitted relative to other message sends. If not set, message sends may be transmitted out of order from their submission.
----8<----
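To make the "[read | write | send] after [read | write | send]" structure of the ordering bits concrete, here is a minimal sketch of how an application might check that a provider offers every ordering it relies on. The names `op_kind`, `order_bit`, and `order_satisfied` are invented for illustration, and the bit layout is an assumption of this sketch; real code would use the FI_ORDER_* constants from the libfabric headers rather than computing bits itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative operation kinds; not libfabric definitions. */
enum op_kind { OP_READ = 0, OP_WRITE = 1, OP_SEND = 2 };

/* Stand-in bit for "<later> after <earlier>". For example,
 * order_bit(OP_READ, OP_WRITE) models "read after write".
 * The layout is an assumption made only for this sketch. */
static uint64_t order_bit(enum op_kind later, enum op_kind earlier)
{
    assert(later <= OP_SEND && earlier <= OP_SEND);
    return UINT64_C(1) << ((unsigned)later * 3u + (unsigned)earlier);
}

/* True when every ordering the application requires is among the
 * orderings the provider reports. An empty requirement is always met. */
static bool order_satisfied(uint64_t provided, uint64_t required)
{
    return (provided & required) == required;
}
```

An application that needs read-after-write and write-after-write ordering would OR the two bits together as its requirement and compare that mask against what the provider reports; a provider offering only write-after-write would fail the check.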
-Dave