[ofw][patch] [mlx4] Fixing a race during LSO header building (WQE bug)
Alex Naslednikov
xalex at mellanox.co.il
Thu Dec 11 08:02:39 PST 2008
Theoretically, the following bug can occur: LSO header may take less
than 64 Bytes (cacheline size), and function build_lso_seg (qp.c) will
notify that the header is ready. Than prefetch algoritm may take this
cache block BEFORE the rest of the WQE will be completed (S/G element
list)
Explanations:
WQE structure:
1-st 64 bytes (always aligned to cache line of 64)
00 CTRL start (first DWORD)
..
03 CTRL end
04 Dgram start
..
15 Dgram end
----
2-nd 64 bytes (also always aligned to cache line)
LSO Header:
16 MSS | LEN
17 IPOIB HEADER
18 IPOIB HEADER
19 TCP+IP HEADERS
....
When TCP+IP headers exceed 13 DWORDS, we always set OPCODE_INVALID_BIT
to be true, thus preventing WQE bug to occur.
But consider the following situation, when TCP and IP Header each takes
5 DWORDS length (minimal possible length)
Thus, 2nd 64 bytes of WQE will be as following:
16 MSS | LEN
17 IPoIB header
18 IPoIB header
19 IP header start
...
23 IP header end
24 TCP Header start
...
28 TCPHeader end
29 -------
30 S/G element list will start there
31 -------
The function that build LSO segment (build_lso_seg) will overright the
stamping right after memcpy() of the headers, thus PREFETCH mechanism
may theoretically take bytes 16:31 BEFORE bytes 29-31 were actually
filled with valid data.
Signed-off by: Alexander Naslednikov (xalex at mellanox.co.il)
Index: D:/Windows/MLNX_WINOF_2_0_0/hw/mlx4/kernel/bus/ib/qp.c
===================================================================
--- D:/Windows/MLNX_WINOF_2_0_0/hw/mlx4/kernel/bus/ib/qp.c
(revision 3608)
+++ D:/Windows/MLNX_WINOF_2_0_0/hw/mlx4/kernel/bus/ib/qp.c
(revision 3609)
@@ -1507,8 +1507,17 @@
}
#define I64_CACHE_LINE 64
#define OPCODE_INVALID_BIT 6
- // WQE bug treatment for LSO case, when
LSO header is large enough
- if (unlikely (seglen > I64_CACHE_LINE))
{
+ // WQE bug treatment for LSO case
+ // If LSO segment is large enough
(exceeds one cache block in size)
+ // or if it small enough such that S/G
element will be placed within the same cache block,
+ // OPCODE_INVALID_BIT should be on in
order to reread this WQE
+ // More correct solution is
+ // if (unlikely (seglen %
I64_CACHE_LINE < 4) ))
+ // but it will not be used in order to
reduce calculations within Datapath
+ // If LSO segment consists of 15 DWORDS,
S/G elements block will nevertheless start from
+ // the next cache block
+ if (unlikely (seglen < I64_CACHE_LINE-4
|| seglen > I64_CACHE_LINE )) {
+
ctrl->owner_opcode |=
cpu_to_be32 ( 1 << OPCODE_INVALID_BIT);
}
wqe += seglen;
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wqe_bug.diff
Type: application/octet-stream
Size: 1275 bytes
Desc: wqe_bug.diff
URL: <http://lists.openfabrics.org/pipermail/ofw/attachments/20081211/ef08c509/attachment.obj>
More information about the ofw
mailing list