[ofa-general] Performance of UDAPL RDMA vs IB verbs

Arlin Davis ardavis at ichips.intel.com
Thu Feb 14 16:45:55 PST 2008


Arlin Davis wrote:
> Chuck Hartley wrote:
>> We are doing performance measurements on an application that is using 
>> uDAPL RDMA reads for some large transfers and the BW is less than we 
>> expected.  The transfers are 4MB and we are seeing BW of 930MiB/sec 
>> (DDR).  When we do the same transfer size using ib_read_bw we get 1475 
>> MB/sec.  On a pair of machines with SDR interfaces, we get 697MiB/sec 
>> and 918MB/sec respectively.

Here is a quick comparison of verbs, rdma_cm, and uDAPL using
ib_read_bw, ib_write_bw, ib_rdma_bw -c, and dapltest. On my
systems the three give very similar bandwidth at the default
message size of 65536 bytes (the dapltest runs use 65535).

1.) IB verbs:  ib_read_bw, ib_write_bw
------------------------------------------------------------------
                     RDMA_Read BW Test
------------------------------------------------------------------
  #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec]
   65536        10000            1331.10               1329.85
------------------------------------------------------------------
------------------------------------------------------------------
                     RDMA_Write BW Test
------------------------------------------------------------------
  #bytes #iterations    BW peak[MB/sec]    BW average[MB/sec]
   65536        10000            1423.43               1422.64
------------------------------------------------------------------

2.) RDMA_CM + verbs: ib_rdma_bw -c -n100000

10711: Bandwidth peak (#0 to #7603): 1428.73 MB/sec
10711: Bandwidth average: 1428.13 MB/sec

3.) DAPL + RDMA_CM + verbs:

dapltest -T P -m p -s cst-50-ib0 -i 10000 RW 65535

RDMA_WRITES:
     Total Time           : 4.55 sec
     Total Data Exchanged : 6249.90 MB
     CPU Utilization      : 25.30
     Operation Throughput : 21952.66 ops/sec
     Bandwidth            : 1372.2 MB/sec

dapltest -T P -m p -s cst-50-ib0 -i 10000 RR 65535

RDMA_READS
     Total Time           : 4.67 sec
     Total Data Exchanged : 6249.90 MB
     CPU Utilization      : 25.26
     Operation Throughput : 21384.77 ops/sec
     Bandwidth            : 1336.52 MB/sec
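
As a rough cross-check of the numbers above, the sketch below recomputes
the dapltest bandwidth from its reported ops/sec and the 65535-byte
transfer size, then expresses the uDAPL figures relative to the plain
verbs averages. It assumes that "MB" in the tool output means 2**20
bytes; that is my reading of the figures, not something the tools state
here. The function names are made up for illustration.

```python
MIB = 1 << 20  # assumption: "MB" in the tool output means 2**20 bytes


def bandwidth_mib(ops_per_sec, msg_bytes):
    """Bandwidth implied by an operation rate and a fixed message size."""
    return ops_per_sec * msg_bytes / MIB


def percent_diff(value, baseline):
    """Signed difference of value from baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


# Figures copied from the output above.
verbs_avg = {"read": 1329.85, "write": 1422.64}      # ib_read_bw / ib_write_bw
udapl_bw = {"read": 1336.52, "write": 1372.2}        # dapltest Bandwidth
udapl_ops = {"read": 21384.77, "write": 21952.66}    # dapltest ops/sec

for op in ("read", "write"):
    implied = bandwidth_mib(udapl_ops[op], 65535)
    delta = percent_diff(udapl_bw[op], verbs_avg[op])
    print(f"{op}: implied {implied:.2f}, reported {udapl_bw[op]}, "
          f"uDAPL vs verbs {delta:+.2f}%")
```

Under that assumption the implied bandwidth lands within about 0.2 MB/sec
of what dapltest reports. uDAPL reads come out roughly 0.5% above the
verbs average and uDAPL writes roughly 3.5% below it.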


