Going below the specmanship, the source of ambiguity comes<br>

down to whether the RDMA device checks the consume pointer<br>

before writing a CQE.<br>

<br>

Not checking it means that overflow is either undetectable, or<br>

only detected after arbitrary unknown CQEs have been erased.<br>

In the case where an unknown CQE was erased, every QP<br>

that feeds the CQ is at risk.<br>

<br>

But if the RDMA device checks the consume pointer before<br>

writing, then the only CQE that can be lost is the one that<br>

is being generated. That QP is known. It is known that no<br>

other QPs have been damaged.<br>

<br>
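To make the difference concrete, here is a minimal toy model of the two<br>
behaviors. It is only a sketch: the class, the method names and the<br>
check_consumer flag are made up for illustration and do not come from<br>
either verbs specification or from any real device.<br>

```python
class ToyCQ:
    """Toy ring-buffer CQ; every name here is hypothetical."""

    def __init__(self, depth, check_consumer):
        self.depth = depth
        self.check_consumer = check_consumer
        self.ring = [None] * depth
        self.produce = 0         # total CQEs written by the device
        self.consume = 0         # total CQEs reaped by the Consumer
        self.overflow_qp = None  # QP identified as the overflow victim, if any

    def device_write(self, qp, wr_id):
        """Device side: post a CQE for a completed work request."""
        full = self.produce - self.consume >= self.depth
        if full and self.check_consumer:
            # Checked design: refuse the write. Only the CQE being
            # generated is lost, and its QP is known.
            self.overflow_qp = qp
            return False
        # Unchecked design (or room available): write the slot. If the CQ
        # was full, this silently erases an unreaped CQE of some other QP.
        self.ring[self.produce % self.depth] = (qp, wr_id)
        self.produce += 1
        return True

    def poll(self):
        """Consumer side: reap the oldest CQE, or None if the CQ is empty."""
        if self.consume == self.produce:
            return None
        cqe = self.ring[self.consume % self.depth]
        self.consume += 1
        return cqe


def demo(check_consumer):
    """Overflow a depth-2 CQ with three completions and no polling."""
    cq = ToyCQ(depth=2, check_consumer=check_consumer)
    cq.device_write("QP1", 1)
    cq.device_write("QP2", 2)
    cq.device_write("QP3", 3)  # third completion overflows the CQ
    return cq.ring, cq.overflow_qp
```

With demo(True) the ring still holds the CQEs for QP1 and QP2, and QP3 is<br>
identified as the one QP that lost a completion. With demo(False) the CQE<br>
for QP1 has been overwritten by the one for QP3 and nothing records that<br>
it happened.<br>

<br>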

The two designs reflect different approaches to fault tolerance.<br>

One states a constraint on the application, which if followed<br>

can prevent CQ overflows. Since any CQ overflow represents<br>

a failure of the Consumer to comply with the contract, the<br>

RDMA device is under no obligation to waste a single<br>

flip-flop or line of code to try to minimize the damage,<br>

except for damage to third parties (hence the RDMAC<br>

constraint that QPs using different CQs are not damaged).<br>

<br>

The second views a CQ overflow on the same terms<br>

as a divide by zero or many other errors that should<br>

not happen -- you confine the damage and leave as<br>

much of the system running as possible.<br>

<br>

Given that both design approaches are valid, it is not<br>

surprising that both IB and iWARP verb specifications<br>

can be construed to be compatible with either design.<br>

<br>

<br>

<br>