[ofa-general] SRP/mlx4 interrupts throttling performance
Vladislav Bolkhovitin
vst at vlnb.net
Thu Nov 20 07:24:18 PST 2008
Cameron Harr wrote:
> New results, with markers.
> ----
> type=randwrite bs=512 drives=1 scst_threads=1 srptthread=1 iops=65612.40
> type=randwrite bs=4k drives=1 scst_threads=1 srptthread=1 iops=54934.31
> type=randwrite bs=512 drives=2 scst_threads=1 srptthread=1 iops=82514.57
> type=randwrite bs=4k drives=2 scst_threads=1 srptthread=1 iops=79680.42
> type=randwrite bs=512 drives=1 scst_threads=2 srptthread=1 iops=60439.73
> type=randwrite bs=4k drives=1 scst_threads=2 srptthread=1 iops=51510.68
> type=randwrite bs=512 drives=2 scst_threads=2 srptthread=1 iops=102735.07
> type=randwrite bs=4k drives=2 scst_threads=2 srptthread=1 iops=78558.77
> type=randwrite bs=512 drives=1 scst_threads=3 srptthread=1 iops=62941.35
> type=randwrite bs=4k drives=1 scst_threads=3 srptthread=1 iops=51924.17
> type=randwrite bs=512 drives=2 scst_threads=3 srptthread=1 iops=120961.39
> type=randwrite bs=4k drives=2 scst_threads=3 srptthread=1 iops=75411.52
> type=randwrite bs=512 drives=1 scst_threads=1 srptthread=0 iops=50891.13
> type=randwrite bs=4k drives=1 scst_threads=1 srptthread=0 iops=50199.90
> type=randwrite bs=512 drives=2 scst_threads=1 srptthread=0 iops=58711.87
> type=randwrite bs=4k drives=2 scst_threads=1 srptthread=0 iops=74504.65
> type=randwrite bs=512 drives=1 scst_threads=2 srptthread=0 iops=61043.73
> type=randwrite bs=4k drives=1 scst_threads=2 srptthread=0 iops=49951.89
> type=randwrite bs=512 drives=2 scst_threads=2 srptthread=0 iops=83195.60
> type=randwrite bs=4k drives=2 scst_threads=2 srptthread=0 iops=75224.25
> type=randwrite bs=512 drives=1 scst_threads=3 srptthread=0 iops=60277.98
> type=randwrite bs=4k drives=1 scst_threads=3 srptthread=0 iops=49874.57
> type=randwrite bs=512 drives=2 scst_threads=3 srptthread=0 iops=84851.43
> type=randwrite bs=4k drives=2 scst_threads=3 srptthread=0 iops=73238.46
I think srptthread=0 performs worse in this case because part of the
processing is done in SIRQ context, but the scheduler seems to run it on
the same CPU as fct0-worker, which does the job of transferring the data
to your SSD device. That thread constantly consumes about 100% CPU, so
there is less CPU time to go around, hence lower overall performance.
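For example (just a sketch, not from your setup), you can check which CPU is
servicing the mlx4 interrupts by parsing /proc/interrupts; the "mlx4"
substring match below is an assumption, adjust it to whatever names the HCA
IRQ lines actually have on your system:

    # Minimal Python sketch: print per-CPU interrupt counts for mlx4 IRQs.
    def mlx4_irq_counts(path="/proc/interrupts"):
        with open(path) as f:
            cpus = f.readline().split()          # header: CPU0 CPU1 ...
            for line in f:
                if "mlx4" in line:               # assumed IRQ line name
                    fields = line.split()
                    irq = fields[0].rstrip(":")  # IRQ number
                    counts = fields[1:1 + len(cpus)]
                    yield irq, dict(zip(cpus, counts))

    if __name__ == "__main__":
        for irq, counts in mlx4_irq_counts():
            print(irq, counts)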
So, try to bind fctX-worker, the SCST threads and the SIRQ processing to
different CPUs and check again. You can set thread affinity using the
utility from http://www.kernel.org/pub/linux/kernel/people/rml/cpu-affinity/;
for how to set IRQ affinity, see Documentation/IRQ-affinity.txt in your
kernel tree.
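For illustration, here is a minimal Python sketch of both steps (requires
Linux and a Python with os.sched_setaffinity), assuming you already know the
mlx4 IRQ number from /proc/interrupts and the PIDs of fct0-worker and the
SCST threads from ps; the IRQ and PID numbers below are placeholders, not
values from this thread:

    import os

    def set_irq_affinity(irq, cpu_mask):
        # /proc/irq/<N>/smp_affinity takes a hexadecimal CPU bitmask
        # (see Documentation/IRQ-affinity.txt). Needs root.
        with open("/proc/irq/%d/smp_affinity" % irq, "w") as f:
            f.write("%x\n" % cpu_mask)

    def set_task_affinity(pid, cpus):
        # Bind a process or kernel thread to the given set of CPUs
        # via sched_setaffinity(2).
        os.sched_setaffinity(pid, cpus)

    # Hypothetical example: keep the mlx4 IRQ on CPU 0, move
    # fct0-worker (PID 1234 here) to CPU 1 and an SCST thread
    # (PID 1235 here) to CPU 2.
    set_irq_affinity(59, 0x1)
    set_task_affinity(1234, {1})
    set_task_affinity(1235, {2})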
Vlad