[ofa-general] [PATCH] IB/mlx4: Micro-optimize mlx4_ib_poll_one()
Roland Dreier
rdreier at cisco.com
Thu Jan 3 19:51:28 PST 2008
Rather than byte-swapping cqe->g_mlpath_rqpn each time we extract a
field from it, byte-swap it once into a temporary variable. This
results in smaller, better code -- eg, on 32-bit x86:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-5 (-5)
function old new delta
mlx4_ib_poll_cq 1188 1183 -5
Signed-off-by: Roland Dreier <rolandd at cisco.com>
---
I've queued this for 2.6.25...
drivers/infiniband/hw/mlx4/cq.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index 8bf44da..24d9475 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -313,6 +313,7 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq,
struct mlx4_ib_srq *srq;
int is_send;
int is_error;
+ u32 g_mlpath_rqpn;
u16 wqe_ctr;
cqe = next_cqe_sw(cq);
@@ -426,10 +427,10 @@ static int mlx4_ib_poll_one(struct mlx4_ib_cq *cq,
wc->slid = be16_to_cpu(cqe->rlid);
wc->sl = cqe->sl >> 4;
- wc->src_qp = be32_to_cpu(cqe->g_mlpath_rqpn) & 0xffffff;
- wc->dlid_path_bits = (be32_to_cpu(cqe->g_mlpath_rqpn) >> 24) & 0x7f;
- wc->wc_flags |= be32_to_cpu(cqe->g_mlpath_rqpn) & 0x80000000 ?
- IB_WC_GRH : 0;
+ g_mlpath_rqpn = be32_to_cpu(cqe->g_mlpath_rqpn);
+ wc->src_qp = g_mlpath_rqpn & 0xffffff;
+ wc->dlid_path_bits = (g_mlpath_rqpn >> 24) & 0x7f;
+ wc->wc_flags |= g_mlpath_rqpn & 0x80000000 ? IB_WC_GRH : 0;
wc->pkey_index = be32_to_cpu(cqe->immed_rss_invalid) >> 16;
}
--
1.5.3.5
More information about the general
mailing list