<html>

<body>

<font size=3>At 09:50 AM 8/25/2006, Caitlin Bestler wrote:<br>

<blockquote type=cite class=cite cite="">

openib-general-bounces@openib.org wrote:<br>

>>    Thomas> How does an adapter guarantee that

no bridges or other<br>

>>    Thomas> intervening devices reorder their

writes, or for that<br>

>>    Thomas> matter flush them to memory at

all!?<br>

>> <br>

>> That's a good point.  The HCA would have to do a read to

flush the<br>

>> posted writes, and I'm sure it's not doing that (since it would

add<br>

>> horrible latency for no good reason).<br>

>> <br>

>> I guess it's not safe to rely on ordering of RDMA writes after

all.<br>

> <br>

> Couldn't the same point then be made that a CQ entry may come<br>

> before the data has been posted?<br>

> <br><br>

That's why both specs (IBTA and RDMAC) are very explicit that all<br>

prior messages are complete before the CQE is given to the user.<br><br>

It is up to the RDMA Device and/or its driver to guarantee this<br>

by whatever means are appropriate. An implementation that allows<br>

a CQE post to pass the data placement that it is reporting on the<br>

PCI bus is in error.<br><br>

The critical concept of the Work Completion is that it consolidates<br>

guarantees and notifications. The implementation can do all sorts

of strange things that it thinks optimize *before* the work

completion,<br>

but at the time the work completion is delivered to the user

everything<br>

is supposed to be as expected.<br>

</blockquote><br>

Caitlin's logic is correct, and it is the reason both specifications

call out this issue.  And yes, Roland, one cannot rely on RDMA

Write ordering, whether for IB or iWARP. iWARP specifically allows

out-of-order delivery.  IB, while providing in-order delivery through its

strongly ordered protocol, still offers no guarantees once the

memory controller and the I/O technology in use are taken into account.

Given that not everything was expected to operate over PCI, we made sure

the specifications pointed out these issues so that software would be

designed to accommodate all interconnect attach types and usage

models.  We wanted to maximize the underlying implementation options

while giving software a consistent operating model, which in turn

keeps the software simpler.<br><br>
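To make the software-side consequence concrete, here is a minimal sketch in C. It is a toy single-threaded model, not code from either specification or from any verbs library: rx_model, place_chunk, last_byte_arrived, completion_delivered and payload_intact are hypothetical names invented for illustration. It assumes a device that may place the chunks of a message in any order and reports the completion only after all of them are placed. Under that assumption, a consumer that infers completion from the last byte of the buffer can see partial data, while one that waits for the completion cannot.<br><br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CHUNKS       4
#define CHUNK_SIZE   4
#define PAYLOAD_SIZE (CHUNKS * CHUNK_SIZE)

/* Hypothetical receive-side model: a target buffer plus a one-entry
 * "completion queue". Not a real verbs structure. */
struct rx_model {
    char   buf[PAYLOAD_SIZE];
    size_t placed;     /* number of chunks placed in memory so far */
    bool   cqe_ready;  /* true once the completion is visible to the user */
};

static void rx_init(struct rx_model *m)
{
    memset(m, 0, sizeof *m);
}

/* Place one chunk of the payload. The caller chooses idx, so chunks may be
 * placed in any order, as a reordering interconnect might do. */
static void place_chunk(struct rx_model *m, const char *src, size_t idx)
{
    assert(idx < CHUNKS);
    memcpy(m->buf + idx * CHUNK_SIZE, src + idx * CHUNK_SIZE, CHUNK_SIZE);
    m->placed++;
    /* The completion is reported only after every chunk has been placed. */
    if (m->placed == CHUNKS)
        m->cqe_ready = true;
}

/* Unsafe check: treats a non-zero last byte as "the whole message is here". */
static bool last_byte_arrived(const struct rx_model *m)
{
    return m->buf[PAYLOAD_SIZE - 1] != 0;
}

/* Safe check: consume the buffer only once the completion was delivered. */
static bool completion_delivered(const struct rx_model *m)
{
    return m->cqe_ready;
}

/* True if the buffer holds the complete payload. */
static bool payload_intact(const struct rx_model *m, const char *src)
{
    return memcmp(m->buf, src, PAYLOAD_SIZE) == 0;
}
```

For example, with a 16-byte payload, placing chunk 3 first makes last_byte_arrived return true while payload_intact is still false. Only after all four chunks have been placed does completion_delivered return true, and at that point payload_intact holds.<br><br>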

Mike</font></body>

</html>