[nvmewin] update on strange perf issue

Luse, Paul E paul.e.luse at intel.com
Tue Jan 24 15:23:46 PST 2012


Here's the update:


- If we use the StorPort DPC optimizations, our performance becomes erratic (there's a sketch of how those optimizations get requested after the tables below)

- If we don't use them, we end up in the same failing scenario we had with ISR completions:

  o  We get 'stuck' in a mode where we only receive IOs on one core and therefore only complete IOs on that core; that isn't sufficient for operation, and we end up with a DPC watchdog timeout

- I noticed that the core that ends up 'stuck' is the same every time: core 0.

- If I change our core mapping so that we never complete an IO on the same core it came in on, we have no issues, and peak IOPS in the 4K read case does not change; however, our CPU utilization goes from 20% to 60%, which says a lot for our core/vector matching! (A rough sketch of that remapping idea follows the tables below.)

- If I try full shared mode (already coded; I just hacked it to pretend that we only got one message from the OS), we still fail under heavy IO

- So, taking note that we share the MsgId/completion core between the admin queue and queue pair 1, I isolated the admin queue by moving incoming core 0 IO requests to queue pair 2, so the mapping looks like the second table below; with that change things work fine and our CPU utilization stays at 20%.  It's not clear what is causing this.  I've been working with our HW/FW folks to determine whether there's some interaction there, and so far no leads from that direction.  It feels like either a bug in our code or some strange behavior with the OS and core 0; the former seems more plausible, but I'm not able to pinpoint anything other than the data below (and the workaround of not sharing completions with the admin queue seems to solve it).  A small sketch of the working mapping also follows the tables.


- DOESN'T WORK:

  Core   SQ/CQ #      Msg #
  0      1 + admin    8
  1      2            1
  2      3            2
  3      4            3
  4      5            4
  5      6            5
  6      7            6
  7      8            7


- WORKS:

  Core   SQ/CQ #      Msg #
  0      2 (+ admin)  1 (8)
  1      2            1
  2      3            2
  3      4            3
  4      5            4
  5      6            5
  6      7            6
  7      8            7

____________________________________
Paul Luse
Sr. Staff Engineer
PCG Server Software Engineering
Desk: 480.554.3688, Mobile: 480.334.4630
