[ofa-general] SDP - How to?

Amir Vadai amirv at mellanox.co.il
Wed Jun 10 23:22:14 PDT 2009


Hi Zafar,

I am the maintainer of SDP in OFED.

The current SDP implementation does not have ZCopy. What we have is
BCopy and something that we call BZcopy.
Small packets (<= 32K) are sent using BCopy: the SEND verb is used and
the data is not zero copied.
Packets bigger than 32K are sent using BZcopy: they are also sent using
the SEND verb, but they are zero copied on the sender side. On the RX
side the packets are memcpy'ed.
When we implement ZCopy it will be automatic too: when sending packets
bigger than a threshold, the driver will automatically use the RDMA verb
instead of SEND.
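
To make the size-based selection above concrete, here is a rough sketch
of the decision only. It is not the driver code: the names
BZCOPY_THRESHOLD and pick_send_mode are made up for illustration, and
the real choice happens inside the kernel module, not in the
application.

```python
# Illustrative sketch only: hypothetical names, not taken from the OFED SDP driver.
BZCOPY_THRESHOLD = 32 * 1024  # bytes; the 32K cut-over described above


def pick_send_mode(nbytes: int) -> str:
    """Return which send path the description above implies for nbytes."""
    if nbytes < 0:
        raise ValueError("message size cannot be negative")
    if nbytes <= BZCOPY_THRESHOLD:
        # SEND verb, data is copied on both the sender and the receiver.
        return "BCopy"
    # SEND verb, zero copy on the sender, memcpy on the receiver.
    return "BZcopy"


if __name__ == "__main__":
    for size in (1024, 32 * 1024, 32 * 1024 + 1, 1024 * 1024):
        print(size, pick_send_mode(size))
```

The application does not make this choice itself; it just calls send()
and the driver picks the path based on the message size.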

BTW, your numbers are different from what we measured. See attached; it
was measured on a high-end host using QDR IB, PCI gen 2 and Mellanox
OFED 1.4.

- Amir

On 06/10/2009 05:47 PM, Zafar Gilani wrote:
> Yes, of course, good question. Well, the reason is that SDP allows two
> approaches: BCopy (Buffered Copy) and ZCopy (Zero Copy or RDMA). BCopy,
> as I described, is simple to use (change AF_INET to AF_INET_SDP in the
> socket() call and you are done, no other code changes), whereas ZCopy
> exploits the RDMA capability of InfiniBand. In contrast, IPoIB just
> provides IP protocol encapsulation over InfiniBand.
>
> You can look at the benchmark graphs I generated at
> [http://hpc.niit.edu.pk/~zafar/work/results_rocks/index.html] by
> executing MPJ Express over IPoIB and SDP/BCopy. Now I want to implement
> SDP/ZCopy for MPJ Express and I do not know how to do this. This is the
> primary reason for my questions :). You can say this is sort of enabling
> MPJ Express to work over a wide range of IB protocols.
>
> More information on MPJ Express is available at: http://mpj-express.org/
>
> Thank you,
> Zafar
>
> On Wed, Jun 10, 2009 at 4:07 PM, Zafar Gilani <zafargilani at gmail.com> wrote:
>  > I read a paper on "Zero Copy Sockets Direct Protocol over InfiniBand".
>  > I had a few questions, if anybody could answer I will be thankful.
>  >
>  > 1. SDP BCopy approach can be used without the change of the code via
>  > the use of AF_INET_SDP parameter when calling socket() system call.
>  > But how can I implement SDP ZCopy approach if I want to implement
>  > ZCopy_Read and ZCopy_Write methods?
>  >
>  > 2. Is there any documentation regarding how to implement SDP ZCopy? Any
>  > Hello World sort of example code that I can use to learn how to work with
>  > this?
>  >
>  > 3. SDP BCopy is better for smaller messages, but how small? Is 64K a good
>  > threshold to change from BCopy to ZCopy at 128K?
>
> May I ask you why you are looking at SDP ? The performance of IPoIB in
> connected mode and with large MTU is close to that of SDP, while the
> former is a lot easier to use than the latter.
>
> Bart.

-- 
Amir Vadai
Software Eng.
Mellanox Technologies
mailto: amirv at mellanox.co.il
Tel +972-3-6259539
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sdp_vs_ip_vs_10ge.png
Type: image/png
Size: 29636 bytes
Desc: not available
URL: <http://lists.openfabrics.org/pipermail/general/attachments/20090611/10061f1b/attachment.png>