[openib-general] ip over ib throughput
Grant Grundler
iod00d at hp.com
Wed Jan 12 17:34:06 PST 2005
On Tue, Jan 04, 2005 at 01:10:15PM -0800, Roland Dreier wrote:
> Josh> I'm seeing about 364 MB/s between 2 PCIe Xeon 3.2GHz boxes
> Josh> using netperf-2.3pl1.
>
> Are you using MSI-X? To use it, set CONFIG_PCI_MSI=y when you build
> your kernel and either "modprobe ib_mthca msi_x=1"...
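For reference, a minimal sketch of verifying that MSI-X really is in effect
(the kernel config path and the interrupt line names in /proc/interrupts are
assumptions; adjust for your kernel/distro):

  # confirm the running kernel was built with MSI support
  grep CONFIG_PCI_MSI /boot/config-$(uname -r)

  # load the HCA driver with MSI-X enabled, per the note above
  modprobe ib_mthca msi_x=1

  # the MSI-X vectors should then show up in /proc/interrupts;
  # the exact line names are driver dependent
  grep -i mthca /proc/interrupts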
Good news: Topspin firmware 3.3.2 can run netperf w/MSI-X on ia64 too.
Bad news: I'm getting weak performance numbers on the ZX1 boxes
(~1580 Mbps == ~200 MB/s). This is with MSI-X enabled on both systems.
The RX2600 is sending TCP_STREAM packets to the RX4640 via a Topspin
12-port switch. The RX2600 has a "Low Profile" (Cougarcub) HCA and the
RX4640 has a "Cougar", installed in "dual rope" slots.
/opt/netperf/netperf -l 60 -H 10.0.1.81 -t TCP_STREAM -i 5,2 -I 99,5 -- -m 8192 -s 262144 -S 262144
TCP STREAM TEST to 10.0.1.81 : +/-2.5% @ 99% conf.
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

262142 262142   8192    60.00    1588.33
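For reference, the other end is just running a stock netserver, and my reading
of the options above is sketched below (netserver's default port is 12865; the
per-flag comments are my interpretation of the netperf 2.3 option set):

  # on 10.0.1.81: start the netperf server side (listens on 12865 by default)
  netserver

  # client invocation, annotated:
  #   -l 60          60 second test
  #   -H 10.0.1.81   target host
  #   -t TCP_STREAM  bulk-transfer test
  #   -i 5,2         at most 5, at least 2 iterations for the confidence interval
  #   -I 99,5        99% confidence, 5% interval width (hence the +/-2.5% banner)
  #   -m 8192        8KB send message size
  #   -s/-S 262144   256KB local/remote socket buffer requests
  /opt/netperf/netperf -l 60 -H 10.0.1.81 -t TCP_STREAM -i 5,2 -I 99,5 \
      -- -m 8192 -s 262144 -S 262144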
q-syscollect on netperf client (RX2600, dual 1.5GHz):
ionize:~/.q# q-view kernel-cpu1.info#0 | less
Flat profile of CPU_CYCLES in kernel-cpu1.hist#0:
Each histogram sample counts as 1.00034m seconds
% time self cumul calls self/call tot/call name
25.09 14.98 14.98 80.7k 186u 186u default_idle
9.73 5.81 20.79 35.9M 162n 162n _spin_unlock_irqrestore
5.63 3.36 24.15 27.8M 121n 136n ipt_do_table
4.27 2.55 26.70 15.0M 170n 170n do_csum
3.49 2.08 28.78 6.95M 300n 300n __copy_user
2.66 1.59 30.37 14.3M 111n 673n nf_iterate
2.63 1.57 31.94 5.82M 270n 729n tcp_transmit_skb
2.59 1.54 33.49 68.5M 22.5n 33.2n local_bh_enable
2.33 1.39 34.88 6.79M 205n - tcp_packet
1.83 1.09 35.97 355k 3.08u 32.4u tcp_sendmsg
1.57 0.94 36.91 2.32M 405n 2.11u ipoib_ib_completion
1.48 0.88 37.79 5.92M 149n 162n ip_queue_xmit
1.46 0.87 38.67 2.46M 354n 2.41u mthca_eq_int
1.20 0.72 39.39 6.93M 104n 376n ip_conntrack_in
1.17 0.70 40.08 7.52M 92.6n 92.6n time_interpolator_get_offset
...
And on the "netserver" (RX4640, 4x 1.3GHz) side:
Flat profile of CPU_CYCLES in kernel-cpu3.hist#0:
Each histogram sample counts as 551.305u seconds
% time self cumul calls self/call tot/call name
34.69 18.97 18.97 16.6M 1.15u 1.15u do_csum
7.58 4.15 23.12 19.4M 213n 213n _spin_unlock_irqrestore
6.67 3.65 26.76 61.4k 59.4u 59.4u default_idle
5.33 2.91 29.68 22.3M 131n 149n ipt_do_table
3.02 1.65 31.33 1.93M 856n 8.35u ipoib_ib_completion
2.73 1.49 32.82 6.45M 231n 231n __copy_user
2.61 1.43 34.25 11.2M 128n 1.32u nf_iterate
2.30 1.26 35.51 5.55M 227n - tcp_packet
2.06 1.12 36.63 51.3M 21.9n 25.4n local_bh_enable
1.97 1.08 37.71 5.51M 195n 273n tcp_v4_rcv
1.43 0.78 38.49 1.77M 443n 9.63u mthca_eq_int
1.35 0.74 39.23 5.28M 139n 1.93u netif_receive_skb
1.19 0.65 39.88 5.60M 116n 1.59u ip_conntrack_in
1.14 0.62 40.50 5.53M 113n 2.92u tcp_rcv_established
1.03 0.56 41.06 5.31M 106n 135n ip_route_input
1.02 0.56 41.62 5.24M 107n 1.80u ip_rcv
0.91 0.50 42.12 5.43M 91.6n 369n ip_local_deliver_finish
0.90 0.49 42.61 5.51M 89.7n 89.7n netif_rx
0.89 0.49 43.10 1.93M 253n 9.13u handle_IRQ_event
0.85 0.46 43.56 33.7M 13.8n 13.8n _read_lock_bh
...
_spin_unlock_irqrestore is a clue that we are spending time in interrupt
handlers and that time isn't getting measured.
top was reporting "netserver" consuming ~80% of one CPU
and netperf consuming ~60% of one CPU. The other CPUs were idle
on both boxes. Something else is slowing things down... I know
these boxes are capable of 800-900 MB/s on the PCI bus.
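One cheap way to size that un-measured interrupt-handler time would be to watch
the interrupt rate during a run. A rough sketch (the "mthca" match in
/proc/interrupts is a guess at the MSI-X line names; vmstat only reports an
aggregate per-second count):

  # snapshot the HCA interrupt counters before and after a 60 second run
  grep -i mthca /proc/interrupts > irq.before
  sleep 60
  grep -i mthca /proc/interrupts > irq.after
  diff irq.before irq.after

  # or just watch the system-wide interrupt ("in") and context-switch ("cs")
  # columns while netperf runs
  vmstat 1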
hth,
grant