<FONT face="Default Sans Serif,Verdana,Arial,Helvetica,sans-serif" size=2><x>Hi,<br><br>I am trying to figure out how efficient MR registration followed by an RDMA write is.<br>To measure this, I am running the following loop:<br><br>// create MR of size 64KB<br><br>for (i = 0; i &lt; max_writes; i++) {<br><br> // destroy old MR<br><br> // create MR of size 64KB<br><br> // RDMA write from new MR to some remote buffer<br><br>}<br><br><br>At some point (it varies from run to run) I get the following error:<br><br>iwch_ev_dispatch - CQE Err qpid 0x3d00 opcode 0 status 0x1 type 1 wrid.hi 0xb3 wrid.lo 0x0 <br>post_qp_event - AE qpid 0x3d00 opcode 0 status 0x1 type 1 wrid.hi 0xb3 wrid.lo 0x0 <br><br>...which basically tells me that the egress (type 1) RDMA write (opcode 0) has failed due to an invalid STag<br>(status 0x1 = STAG invalid: either the STAG is off limits, is 0, or there is a STAG_key mismatch).<br><br>The error is reported at ibv_post_send().<br><br>Here is a trace of the WRs posted shortly before the 'crash':<br><br>wr_id=178<br>loc_addr=0x2aaaab64f010<br>loc_len=65536<br>lkey=4552191<br>num_sge=1<br>rem_addr=0x2aaaab5d0010<br>rkey=1459967<br><br>wr_id=179<br>loc_addr=0x2aaaab65f010<br>loc_len=65536<br>lkey=4555263<br>num_sge=1<br>rem_addr=0x2aaaab5e0010<br>rkey=1459967<br><br>ASYNC_EVENT: [QP] Local access violation error<br>wr_id=180<br>loc_addr=0x2aaaab66f010<br>loc_len=65536<br>lkey=4555519<br>num_sge=1<br>rem_addr=0x2aaaab5f0010<br>rkey=1459967<br>ERROR: [rdma_write] failed to post rdma write wr<br>ERROR: rdma write (180/1000) failed<br><br><br>Do you have any idea what could be happening here? I noticed that if I do signaled writes and wait for each<br>individual completion, this does not happen. Nor is it an issue when posting RDMA writes of size </x>32KB.<br>With 64KB or larger it does happen, but why? I assume that as soon as ibv_reg_mr() returns I am free<br>to use the MR, right?<br><br>Many thanks for your advice,<br> Phil<br></FONT>
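One reading that seems consistent with the trace above (wrid.hi 0xb3 is 179 in decimal, and the async event shows up while wr_id=180 is being posted): ibv_post_send() only queues the write, so an unsignaled write may still be in flight when the next loop iteration destroys the MR it reads from. The toy model below is a sketch under that assumption, not a confirmed diagnosis. It does not call libibverbs at all; `reg_mr`, `dereg_mr`, `post_write`, `adapter_drain` and `run_loop` are hypothetical stand-ins, and the model deliberately takes the worst case in which the adapter always lags one iteration behind.

```c
#include <stdbool.h>

#define MAX_KEYS 1024

/* Toy model only: none of these names are libibverbs APIs. */
static bool key_valid[MAX_KEYS];   /* true while the "MR" with this key is registered */
static int  pending_lkey = -1;     /* lkey of a posted but not yet processed write, or -1 */

static int  reg_mr(int key)      { key_valid[key] = true;  return key; }
static void dereg_mr(int key)    { key_valid[key] = false; }

/* Posting is asynchronous in this model: the write is only queued. */
static void post_write(int lkey) { pending_lkey = lkey; }

/* The modelled adapter processes the queued write, if any.
 * Returns 1 if the write referenced a key that is no longer valid, else 0. */
static int adapter_drain(void)
{
    if (pending_lkey < 0)
        return 0;
    int failed = !key_valid[pending_lkey];
    pending_lkey = -1;
    return failed;
}

/* Runs the loop from the post and returns the number of writes that failed
 * with an invalid key, or -1 if max_writes is outside the model's range. */
int run_loop(int max_writes, bool wait_for_completion)
{
    if (max_writes < 0 || max_writes >= MAX_KEYS)
        return -1;

    for (int k = 0; k < MAX_KEYS; k++)
        key_valid[k] = false;
    pending_lkey = -1;

    int failures = 0;
    int mr = reg_mr(0);                    /* create MR */

    for (int i = 0; i < max_writes; i++) {
        dereg_mr(mr);                      /* destroy old MR */
        failures += adapter_drain();       /* worst case: adapter gets to the previous write only now */
        mr = reg_mr(i + 1);                /* create MR (fresh key each time) */
        post_write(mr);                    /* RDMA write from new MR */
        if (wait_for_completion)
            failures += adapter_drain();   /* write finishes before the next dereg */
    }

    failures += adapter_drain();           /* final write: its MR is still registered */
    dereg_mr(mr);
    return failures;
}
```

In this model `run_loop(n, true)` never fails, while `run_loop(n, false)` fails for every write except the last one. On real hardware the outcome would depend on timing, which would fit both the varying failure point and the fact that smaller writes, which drain faster, do not trigger it.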