[ofa-general] Chelsio T3: Aggregate Throughput

Steve Wise swise at opengridcomputing.com
Thu Feb 5 13:05:49 PST 2009


Philip Frey1 wrote:
>
> Hello,
>
> we are currently looking into the scalability of the T3 in terms of
> connections. We are using a 1-to-n scenario where one server
> has a chunk of data and n clients fetch this chunk over and over
> again using RDMA reads (each 1MB in size).
>
> The clients do that such that they get an average data rate of about
> 9Mbps each. Every second we connect a new client to the server
> and see how far it goes.
>
> What puzzles us now is that after about 800 clients, they no longer
> seem to receive much data.
>
> The first interesting thing is that the aggregate throughput actually
> drops (we expected it to stall). And the second interesting thing is
> that it does so already at about 6.3Gbps, which is just a bit more than
> half of what the card can do. We do not experience this kind of
> situation when using far fewer clients that RDMA read the data at a
> much higher data rate.
>
> Is there any limitation on the RNIC that would give an explanation for
> this?
>

Are the RNICs experiencing lots of pause frames during the test? 

ethtool -S ethX|grep Pause

Also, are the iWARP stacks retransmitting a lot during the test? 

cat /sys/class/infiniband/cxgb3_0/proto_stats/tcpRetransSegs
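
A single reading of either counter only gives a running total, so it helps to see how much it grows while the test is running. Below is a minimal sketch of that, assuming the counter file contains one plain integer. The function name counter_delta and the 10 second interval are made up for illustration and are not part of the driver or of ethtool.

```sh
#!/bin/sh
# Sketch: print how much an integer counter file grows over an interval.
# Usage: counter_delta FILE SECONDS
# Assumes FILE holds a single integer (hypothetical helper, not a driver tool).
counter_delta() {
    file=$1
    interval=$2
    before=$(cat "$file") || return 1   # first reading
    sleep "$interval"
    after=$(cat "$file") || return 1    # second reading
    echo $((after - before))            # growth during the interval
}

# Example invocation while the clients are connecting
# (path from the command above, interval chosen arbitrarily):
# counter_delta /sys/class/infiniband/cxgb3_0/proto_stats/tcpRetransSegs 10
```

A delta that climbs as more clients connect would point at retransmissions. The pause counters from ethtool can be compared the same way by running the command twice and comparing the values by hand.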


Steve.


