<br><br>
<div>
<blockquote class="gmail_quote" style="PADDING-LEFT: 1ex; MARGIN: 0px 0px 0px 0.8ex; BORDER-LEFT: #ccc 1px solid">One more thing: Did you use the default message size of the tests?<br> In ibv_ud_pingpong the default message size is 2K
<br> In ibv_rc_pingpong the default message size is 4K<br><br>so, 2 posts and 2 completions were handled in UD for every 1 post and 1<br>completion in RC ...</blockquote>
<div> </div>
<div> </div>
<div>I used the default message size. With the message size set to 2K, ibv_rc_pingpong gives the same reading as ibv_ud_pingpong, and its results improve as the message size increases.</div>
<div> </div>
<div>So does this mean that a larger number of posts and completions hurts performance? Is there a way to minimize the number of posts/completions in UD?</div>
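<div> </div>
<div>A minimal sketch in C of the arithmetic behind this question. It assumes a 2K limit per UD send (the figure quoted above; the real limit depends on the path MTU) and one signaled completion per post. The names posts_needed and fragment_len are hypothetical helpers for illustration, not part of libibverbs.</div>

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-send payload limit for UD, taken from the 2K figure in this
 * thread. This is an illustration value, not a constant from libibverbs. */
#define UD_CHUNK_BYTES 2048u

/* Hypothetical helper: number of sends needed to move total_bytes when each
 * send carries at most chunk_bytes. If every send is signaled, this is also
 * the number of completions the sender has to poll. */
static size_t posts_needed(size_t total_bytes, size_t chunk_bytes)
{
    assert(chunk_bytes > 0);
    /* Ceiling division without floating point. */
    return (total_bytes + chunk_bytes - 1) / chunk_bytes;
}

/* Hypothetical helper: length of fragment number idx (0-based) when a buffer
 * of total_bytes is cut into chunk_bytes pieces. Returns 0 if idx is past
 * the end of the buffer. */
static size_t fragment_len(size_t total_bytes, size_t chunk_bytes, size_t idx)
{
    assert(chunk_bytes > 0);
    size_t offset = idx * chunk_bytes;
    if (offset >= total_bytes)
        return 0;
    size_t remaining = total_bytes - offset;
    return remaining < chunk_bytes ? remaining : chunk_bytes;
}
```

<div>For example, posts_needed(4096, UD_CHUNK_BYTES) is 2 while posts_needed(4096, 4096) is 1, which matches the 2-to-1 ratio of posts and completions between the UD and RC defaults described above.</div>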
<div> </div>
<div>send_bw shows that UD and RC give almost the same performance at n = 1000 iterations, but for only a few iterations (say 2) UD does better. </div>
<div> </div>
<div>Basically, I am doing some experiments with broadcast, and my readings show that performance is poor for large data sizes. Given that the switch can route at a high rate, I think the low performance boils down to UD not being able to handle messages larger than 2K. Is it possible to have something like RDMA for UD, and hence for broadcast (although the IB spec does not support it), or to have the hardware do the fragmentation for UD when the specified size is more than 2K?
</div><br>Regards,</div>
<div>John T.</div>