<html>

<body>

<font size=3>At 10:14 AM 8/23/2006, Ralph Campbell wrote:<br>

<blockquote type=cite class=cite cite="">On Wed, 2006-08-23 at 09:47

-0700, Caitlin Bestler wrote:<br>

> openib-general-bounces@openib.org wrote:<br>

> > Quoting r. john t &lt;johnt1johnt2@gmail.com&gt;:<br>

> >> Subject: basic IB doubt<br>

> >> <br>

> >> Hi<br>

> >> <br>

> >> I have a very basic doubt. Suppose Host A is doing RDMA

write (say 8<br>

> >> MB) to Host B. When data is copied into Host B's local<br>

> > buffer, is it guaranteed that data will be copied starting<br>

> > from the first location (first buffer address) to the last<br>

> > location (last buffer address)? or it could be in any

order?<br>

> > <br>

> > Once B gets a completion (e.g. of a subsequent send), data

in<br>

> > its buffer matches that of A, byte for byte.<br>

> <br>

> An excellent and concise answer. That is exactly what the

application<br>

> should rely upon, and nothing else. With iWARP this is very

explicit,<br>

> because portions of the message not only MAY be placed out of <br>

> order, they SHOULD be when packets have been re-ordered by the<br>

> network. But for *any* RDMA adapter there is no guarantee on<br>

> what order the adapter flushes things to host memory or

particularly<br>

> when old contents that may be cached are invalidated or

updated.<br>

> The role of the completion is to limit the frequency with which<br>

> the RDMA adapter MUST guarantee coherency with application

visible<br>

> buffers. The completion not only indicates that the entire

message<br>

> was received, but that it has been entirely delivered to host

memory.<br><br>

Actually, A knows the data is in B's memory when A gets the

completion<br>

notice. </blockquote><br>

This is incorrect for both iWARP and IB.  A completion at A only
means that the receiving HCA / RNIC has the data and has generated an
acknowledgement.  It does not indicate that B's adapter has flushed the
data to host memory.  Hence, the fault zone remains the HCA / RNIC:
while A may free the associated buffer for other use, it should not
rely upon the data having been delivered to host memory on B.  This is
one of the fault scenarios I raised during the initial RDS transparent
recovery assertions.  If A issues an RDMA Read to B targeting the
memory location of the RDMA Write, then once that Read completes A can
know the data has been placed in B's memory.<br><br>

<blockquote type=cite class=cite cite=""> B can't rely on anything

unless A uses the RDMA write with<br>

immediate which puts a completion event in B's CQ.<br>

Most applications on B ignore this requirement and test for the last<br>

memory location being modified which usually works but doesn't<br>

guarantee that all the data is in memory.</blockquote><br>

B cannot rely on anything until it sees a completion, generated either
by an RDMA Write with immediate or by a subsequent Send.  It is not wise
to rely upon IHV-specific behaviors when designing an application.  An
IHV can change its implementation over time or to meet interoperability
requirements, and then things may not work as desired, which is exactly
the kind of customer complaint that many would like to avoid.<br><br>
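To illustrate why testing the last memory location is not sufficient,
here is a toy model in Python (not real verbs code).  The function
place_out_of_order, the chunk size, and the placement order are all
assumptions made for illustration; they only mimic an adapter that
places parts of a message in host memory in an arbitrary order.<br><br>
<pre>
```python
def place_out_of_order(source, chunk_size, order):
    """Yield snapshots of the target buffer as chunks land in the given order."""
    target = bytearray(len(source))  # old contents: all zero bytes
    for index in order:
        start = index * chunk_size
        target[start:start + chunk_size] = source[start:start + chunk_size]
        yield bytes(target)


source = bytes(range(1, 9))  # 8 non-zero payload bytes
# Hypothetical placement order: the second half lands before the first half.
snapshots = list(place_out_of_order(source, chunk_size=4, order=[1, 0]))

# After the first placement the last byte already looks updated...
last_byte_modified = snapshots[0][-1] == source[-1]
# ...but the buffer as a whole does not yet match the source.
complete_early = snapshots[0] == source
# The final snapshot stands in for the state B may rely on after a completion.
complete_at_completion = snapshots[-1] == source

print(last_byte_modified, complete_early, complete_at_completion)
```
</pre>
With this placement order the script prints True False True: the last
byte changes before the whole buffer matches, and only the final state,
which stands in for what B sees once the completion arrives, is
byte-for-byte identical to the source.<br><br>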

BTW, immediate data is 4 bytes in length because that is what was
defined in VIA.  Many within the IBTA wanted to get rid of immediate
data, but due to the requirement to support legacy VIA applications it
was left in place.  The need to support a larger value was not
apparent.  One needs to keep in mind where the immediate resides within
the wire protocol and its usage model.  The past usage was to signal a
PID or some other unique identifier that could be used to determine
which thread of execution should be informed of a particular completion
event.  Four bytes is sufficient to communicate such information
without significantly complicating the wire protocol or making it too
inefficient.<br><br>
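As a small sketch of the usage model described above, the following
Python snippet packs a hypothetical identifier into the 4 bytes of an
immediate value.  The helper names pack_immediate and unpack_immediate
and the sample PID 4242 are assumptions for illustration, and network
byte order is assumed for the wire format; this is not code from any
verbs library.<br><br>
<pre>
```python
import struct


def pack_immediate(identifier):
    """Encode an unsigned 32-bit identifier as 4 bytes, network byte order."""
    return struct.pack("!I", identifier)


def unpack_immediate(wire_bytes):
    """Decode the 4-byte immediate back into an integer identifier."""
    return struct.unpack("!I", wire_bytes)[0]


pid = 4242  # hypothetical identifier of the thread or process to notify
wire = pack_immediate(pid)

print(len(wire), wire.hex(), unpack_immediate(wire))
```
</pre>
The receiver would use the decoded value to pick which thread of
execution to inform; anything that does not fit in 32 bits would need a
lookup table keyed by the immediate rather than a wider field.<br><br>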

Mike </font></body>

</html>