[ofa-general] why is CPU util/service demand so much higher with SDP than TCP?

Rick Jones rick.jones2 at hp.com
Thu Apr 26 18:21:54 PDT 2007


So, while playing around with my new netperf SDP_RR test, I've noticed that a
single-byte _RR test over SDP achieves a much higher transaction rate (i.e.
lower latency) than over TCP on the same HCA, but the CPU utilization is
_very_ much higher, and so is the service demand (CPU per transaction).  Higher
CPU utilization makes sense with a higher transaction rate, but the increased
service demand does not - at least not in my experience thus far.

[root@hpcpc106 ~]# for i in SDP_RR TCP_RR; do netperf -t $i -l 60 -c -C -H 192.168.0.107;done
SDP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.107 (192.168.0.107) port 0 AF_INET : first burst 0
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr

126976 126976 1       1      60.00   37868.61  28.02  27.65  29.598  29.210
126976 126976
TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.0.107 (192.168.0.107) port 0 AF_INET : first burst 0
Local /Remote
Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
Send   Recv   Size    Size   Time    Rate     local  remote local   remote
bytes  bytes  bytes   bytes  secs.   per sec  % S    % S    us/Tr   us/Tr

87380  87380  1       1      60.00   19281.49  3.40   3.90   7.049   8.089
87380  87380
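
As a rough sanity check on the "lower latency" reading of these numbers: with a
single transaction outstanding at a time (which is how I read "first burst 0"
in the banner), the average round-trip time is approximately the reciprocal of
the transaction rate. A minimal Python sketch under that assumption; the
function name is purely illustrative and not part of netperf:

```python
def avg_round_trip_us(trans_per_sec):
    """Approximate mean round-trip time in microseconds.

    Assumes exactly one transaction is in flight at any moment, so each
    transaction takes 1 / rate seconds on average.
    """
    return 1e6 / trans_per_sec


# Transaction rates copied from the two netperf runs above.
sdp_rtt = avg_round_trip_us(37868.61)
tcp_rtt = avg_round_trip_us(19281.49)

print(f"SDP: {sdp_rtt:.2f} us per transaction")
print(f"TCP: {tcp_rtt:.2f} us per transaction")
```

On these figures an SDP round trip completes in roughly half the time a TCP
round trip takes.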

The systems here are running RHEL5:
[root@hpcpc106 ~]# uname -a
Linux hpcpc106.cup.hp.com 2.6.18-8.el5 #1 SMP Fri Jan 26 14:16:09 EST 2007 ia64 ia64 ia64 GNU/Linux

and whatever bits come with that.  (These are not the OFED 1.2 rc bits - I still
don't know how to remove enough of what ships with RHEL5 to put all of OFED 1.2,
or at least the modules I want, on there without conflict.)  I'm not sure how to
check the versions - normally I'd use ethtool, but that doesn't work against an
ibN device.  Someone elsewhere suggested that the bits in RHEL5 might be OFED 1.1.

These systems have four real cores, and no HW threads enabled, so 25% CPU util 
means that the equivalent of an entire CPU core is being consumed.
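
The relationship between the CPU utilization and service demand columns can be
sketched as follows, assuming service demand is the CPU time consumed across
all cores divided by the number of transactions. That assumption reproduces the
reported local values to within rounding, but the function names here are
illustrative, not netperf's:

```python
def cores_busy(cpu_util_pct, n_cores):
    """Convert a system-wide CPU utilization percentage into busy cores."""
    return cpu_util_pct / 100.0 * n_cores


def service_demand_us(cpu_util_pct, n_cores, trans_per_sec):
    """CPU microseconds consumed per transaction, summed over all cores."""
    return cores_busy(cpu_util_pct, n_cores) * 1e6 / trans_per_sec


N_CORES = 4  # four real cores, no HW threads

# Local CPU utilization and transaction rates from the netperf output.
sdp_cores = cores_busy(28.02, N_CORES)
sdp_sd = service_demand_us(28.02, N_CORES, 37868.61)
tcp_sd = service_demand_us(3.40, N_CORES, 19281.49)

print(f"SDP: {sdp_cores:.2f} cores busy, {sdp_sd:.2f} us/Tr")
print(f"TCP: {tcp_sd:.2f} us/Tr")
print(f"SDP/TCP service demand ratio: {sdp_sd / tcp_sd:.1f}x")
```

By this arithmetic the SDP run keeps a little more than one full core busy on
the local side and costs a little over four times the CPU per transaction of
the TCP run.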

Before I start trying to hit the system with a profiler, I thought I would ask
whether this is expected with SDP.  Normally a single-instance, single-byte _RR
test between otherwise identical systems consumes at most 50% of a core (a bit
of handwaving, but that has been my experience thus far).

rick jones


