[ofa-general] [PATCH 1 of 2] libmlx4: Handle new FW requirement for send request prefetching, for WQE sg lists
Jack Morgenstein
jackm at dev.mellanox.co.il
Tue Sep 4 00:37:13 PDT 2007
This is an addendum to Roland's commit 561da8d10e419ffb333fe6faf05004d9a3670e7a
(June 13). It extends the prefetch headroom invalidation handling to cover WQE s/g segments as well.
We write the s/g segments into the WQE in reverse order, so that the first
dword of every cacheline containing s/g segments is written last (overwriting
the headroom invalidation pattern). By the time the invalidation pattern is
overwritten, the rest of the cacheline is therefore guaranteed to contain
valid data.
Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>
Index: libmlx4/src/qp.c
===================================================================
--- libmlx4.orig/src/qp.c 2007-09-04 10:03:38.264742000 +0300
+++ libmlx4/src/qp.c 2007-09-04 10:04:35.536784000 +0300
@@ -312,10 +312,19 @@ int mlx4_post_send(struct ibv_qp *ibqp,
 		} else {
 			struct mlx4_wqe_data_seg *seg = wqe;

-			for (i = 0; i < wr->num_sge; ++i) {
-				seg[i].byte_count = htonl(wr->sg_list[i].length);
+			/*
+			 * Write the s/g entries in reverse order, so that the
+			 * first dword of all cachelines is written last.
+			 */
+			for (i = wr->num_sge - 1; i >= 0; --i) {
 				seg[i].lkey = htonl(wr->sg_list[i].lkey);
 				seg[i].addr = htonll(wr->sg_list[i].addr);
+				/*
+				 * This entry may start a new cacheline.
+				 * See barrier comment above.
+				 */
+				wmb();
+				seg[i].byte_count = htonl(wr->sg_list[i].length);
 			}

 			size += wr->num_sge * (sizeof *seg / 16);