[ewg] IB/qib: patches from recent kernel.org submissions

Vladimir Sokolovsky vlad at dev.mellanox.co.il
Sun Sep 25 05:25:26 PDT 2011


On 09/23/2011 09:37 PM, Mike Marciniszyn wrote:
> Vlad,
>
> Please pull the following patches from
> git.openfabrics.org/~mmarciniszyn/scm/linux-2.6.to_ofed
>
> The kernel.org patches are just now hitting the lists.
>
> Thanks!
> Mike


Done,

Regards,
Vladimir

>
> commit a7f7c501f9cf0a1454323c5f6f8668be2ddf2b1d
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Fri Sep 23 12:04:24 2011 -0400
>
>      IB/qib: Add logic for affinity hint
>
>      Call irq_set_affinity_hint() to provide user-mode programs such as
>      irqbalance with the information needed to distribute qib interrupts
>      appropriately.
>
>      The logic allocates all non-receive interrupts to the first
>      CPU local to the HCA.  Receive interrupts are allocated round
>      robin starting with the second CPU local to the HCA with
>      potential wrap back to the second CPU.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
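For readers not following the kernel list, here is a minimal sketch of how such affinity hints can be set up. irq_set_affinity_hint(), cpumask_first()/cpumask_next() and cpumask_of() are the stock kernel APIs; the helper function, its parameters, and the IRQ bookkeeping are illustrative assumptions, not the driver's actual code:

    #include <linux/cpumask.h>
    #include <linux/interrupt.h>

    /* Illustrative only: hint every non-receive vector at the first CPU
     * local to the HCA, and spread receive vectors round-robin over the
     * remaining local CPUs, wrapping back to the second local CPU.
     */
    static void example_set_irq_hints(const struct cpumask *local_mask,
                                      const int *irq, int nirqs, int nrecv)
    {
        int first = cpumask_first(local_mask);      /* first local CPU  */
        int cpu = cpumask_next(first, local_mask);  /* second local CPU */
        int i;

        if (cpu >= nr_cpu_ids)                      /* single local CPU */
            cpu = first;

        for (i = 0; i < nirqs; i++) {
            if (i < nirqs - nrecv) {
                irq_set_affinity_hint(irq[i], cpumask_of(first));
            } else {
                irq_set_affinity_hint(irq[i], cpumask_of(cpu));
                cpu = cpumask_next(cpu, local_mask);
                if (cpu >= nr_cpu_ids)              /* wrap back */
                    cpu = cpumask_next(first, local_mask);
            }
        }
    }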
> commit 96ecdac8d78e2d20b32eb4ec50643bcb1c03f828
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Fri Sep 23 11:10:49 2011 -0400
>
>      IB/qib: Add irq name refinements
>
>      This patch refines the name registered for MSI-X interrupts so that
>      user-level scripts can determine which device an IRQ belongs to when
>      there are multiple HCAs, each with a potentially different set of
>      local CPUs.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
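A minimal sketch of the idea, assuming a "qib<unit>_ctxt<context>" naming scheme; the wrapper function and the exact name format are illustrative, while request_irq() is the standard kernel API (note it keeps the name pointer rather than copying the string):

    #include <linux/kernel.h>
    #include <linux/interrupt.h>

    /* Illustrative only: encode the unit and receive context in the MSI-X
     * vector name so /proc/interrupts and user scripts can tell multiple
     * HCAs apart.  The name buffer must outlive the IRQ registration.
     */
    static int example_request_named_irq(unsigned int irq, irq_handler_t handler,
                                         void *data, char *buf, size_t len,
                                         int unit, int ctxt)
    {
        snprintf(buf, len, "qib%d_ctxt%d", unit, ctxt);
        return request_irq(irq, handler, 0, buf, data);
    }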
> commit d3d7ea0a142f1545f9bfbcd9e3bd47b7b2f9ce79
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Fri Sep 23 10:59:03 2011 -0400
>
>      IB/qib: remove s_lock around header validation
>
>      Observation in qib_ruc_check_hdr() shows that the s_lock is not
>      required in the normal case.  The r_lock is held in all cases and
>      protects the qp fields that are read.
>
>      The s_lock is only needed around the call to qib_migrate_qp() to
>      ensure that the send engine sees a consistent set of fields.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
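In other words, the locking intent is roughly the following (a sketch only; the wrapper function is hypothetical, though s_lock, r_lock and qib_migrate_qp() are the names from the patch):

    #include <linux/spinlock.h>

    /* Caller already holds qp->r_lock, which protects the qp fields read
     * during header validation; s_lock is taken only across the migration
     * so the send engine sees a consistent set of fields.
     */
    static void example_handle_migration(struct qib_qp *qp)
    {
        unsigned long flags;

        spin_lock_irqsave(&qp->s_lock, flags);
        qib_migrate_qp(qp);
        spin_unlock_irqrestore(&qp->s_lock, flags);
    }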
> commit 6aeceea336753dfc6b2987594b228e65b3a46982
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Fri Sep 23 10:46:13 2011 -0400
>
>      IB/qib: memcpy optimizations
>
>      The default memcpy used by qib_copy_sge() ends up being a rep movsb on x86_64,
>      which is pretty slow.
>
>      This fix adds an x86_64-specific routine that 1) probes for X86_FEATURE_REP_GOOD
>      and 2) uses an inline asm routine built on rep movsq, which testing has shown
>      is better than the builtin memcpy for all cases up to 4K.  The probing routine
>      is now called when the qib module is loaded to enable the optimization.  When
>      X86_FEATURE_REP_GOOD is not set, the routine uses the kernel's unrolled __memcpy
>      when the length is more than 64 and the builtin memcpy otherwise.
>
>      This patch also adds the cache bypass copies from older releases.  Testing has
>      shown that AMD cpus benefit with a 40% improvement in netperf/ipoib.
>
>      The cache_bypass_copy module parameter can be used to enable the cache
>      bypass copies on non-AMD CPUs.
>
>      qib_verbs_send_dma() and qib_copy_from_sge() are also changed to use
>      memcpy_string_op() to improve packet delivery performance to the send
>      engine.
>
>      The existing copy as well as a new stub probe routine are maintained as weak
>      symbols for other architectures.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
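For those curious what a rep movsq string-op copy looks like, here is a minimal sketch under the assumption of x86_64; the real routine additionally probes X86_FEATURE_REP_GOOD at module load and falls back to __memcpy()/memcpy(), which is not shown here:

    #include <linux/types.h>

    /* Copy whole quadwords with rep movsq, then the byte tail with rep
     * movsb.  Illustrative only; no feature probing or fallback shown.
     */
    static void *example_memcpy_string_op(void *dst, const void *src, size_t len)
    {
        void *ret = dst;
        size_t quads = len >> 3;    /* 8-byte units    */
        size_t tail = len & 7;      /* remaining bytes */

        asm volatile("rep movsq"
                     : "+D" (dst), "+S" (src), "+c" (quads)
                     : : "memory");
        asm volatile("rep movsb"
                     : "+D" (dst), "+S" (src), "+c" (tail)
                     : : "memory");
        return ret;
    }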
> commit 9ecf3abd1d880255e2f9be0e9714dcd49c97fede
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Fri Sep 23 09:44:07 2011 -0400
>
>      IB/qib: precompute timeout jiffies to optimize latency
>
>      A new field, timeout_jiffies, is added to qib_qp.  It is
>      initialized at QP create and modify time.
>
>      The field is now used instead of recomputing the value from qp->timeout.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
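Since the IBTA local ACK timeout is 4.096 usec * 2^timeout, the precomputed value can be derived once at create/modify time along these lines (a sketch, not necessarily the exact driver expression):

    #include <linux/jiffies.h>

    /* qp->timeout is the 5-bit IB local ACK timeout exponent (0..31).
     * Convert 4.096 usec * 2^timeout to jiffies once, so the hot path
     * reads qp->timeout_jiffies instead of recomputing it per packet.
     */
    qp->timeout_jiffies =
            usecs_to_jiffies((4096UL * (1UL << qp->timeout)) / 1000UL);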
> commit b718e7f17cc9fc9d31148ec046bf071bdd2c105d
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Thu Sep 22 14:43:48 2011 -0400
>
>      IB/qib: qpn lookup optimizations
>
>      The heavyweight spinlock in qib_lookup_qpn() is replaced with the
>      RCU locking mechanism.  The hash list itself is now accessed via
>      jhash functions instead of a modulo operation.
>
>      The changes should benefit multiple receive contexts on different
>      processors by not contending for the lock just to read the hash
>      structures.
>
>      The patch also adds a lookaside_qp (pointer) and a lookaside_qpn
>      in the context.  The interrupt handler will test the current
>      packet's qpn against lookaside_qpn if the lookaside_qp pointer
>      is non-null.  The pointer is NULL'ed when the interrupt
>      handler exits.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
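A minimal sketch of an RCU + jhash lookup of this shape is below; the table layout and most field names are assumptions for illustration, while rcu_read_lock(), rcu_dereference() and jhash_1word() are the stock kernel primitives:

    #include <linux/jhash.h>
    #include <linux/rcupdate.h>
    #include <linux/atomic.h>

    /* Readers take only rcu_read_lock(), so receive contexts on different
     * CPUs no longer contend on a spinlock just to walk the hash chains.
     * qp_table_size is assumed to be a power of 2.
     */
    static struct qib_qp *example_lookup_qpn(struct qib_ibdev *dev, u32 qpn)
    {
        unsigned int n = jhash_1word(qpn, dev->qp_rnd) & (dev->qp_table_size - 1);
        struct qib_qp *qp;

        rcu_read_lock();
        for (qp = rcu_dereference(dev->qp_table[n]); qp;
             qp = rcu_dereference(qp->next))
            if (qp->ibqp.qp_num == qpn) {
                atomic_inc(&qp->refcount);
                break;
            }
        rcu_read_unlock();
        return qp;
    }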
> commit dd2e48e7c02c0925a5e603d1a9752d20b7e15ed3
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Thu Sep 22 11:50:04 2011 -0400
>
>      IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
>
>      The context init now saves a shift derived from rcvegrbufs_perchunk
>      in rcvegrbufs_perchunk_shift using ilog2.  A BUG_ON protects the
>      power-of-2 assumption.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
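The resulting hot-path arithmetic is just a shift and a mask, roughly as follows (field names follow the commit text; the snippet is illustrative):

    #include <linux/log2.h>
    #include <linux/bug.h>

    /* Context init: require a power-of-2 chunk size and cache its log2. */
    BUG_ON(!is_power_of_2(rcd->rcvegrbufs_perchunk));
    rcd->rcvegrbufs_perchunk_shift = ilog2(rcd->rcvegrbufs_perchunk);

    /* Hot path: replace divide/mod with shift/mask. */
    chunk  = idx >> rcd->rcvegrbufs_perchunk_shift;
    offset = idx & (rcd->rcvegrbufs_perchunk - 1);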
> commit 3eef5499db4d04aae5d10580b3735696957cd99d
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Thu Sep 22 11:40:13 2011 -0400
>
>      IB/qib: decode path mtu optimization
>
>      Store both the encoded and decoded MTU in the qp structure as a minor
>      optimization for UC/RC receive routines.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
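That is, at modify-QP time both forms are kept, roughly like this (ib_mtu_enum_to_int() is the standard verbs helper; the qp field names are taken from the description and may not match the driver exactly):

    #include <rdma/ib_verbs.h>

    /* Keep the IB-encoded enum for attribute/wire use and the decoded
     * byte count for the UC/RC receive fast path.
     */
    qp->path_mtu = attr->path_mtu;                     /* encoded enum  */
    qp->pmtu = ib_mtu_enum_to_int(attr->path_mtu);     /* decoded bytes */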
> commit 2871fc8a45df95c52cac9707e886383ec712a20e
> Author: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
> Date:   Thu Sep 22 11:24:39 2011 -0400
>
>      IB/qib: Optimize RC/UC code by IB operation
>
>      The memset for zeroing the work completion had been unconditional.
>
>      This patch removes the memset and replaces it with an explicit
>      field-by-field initialization of the work completion.  With this
>      patch, non-ONLY/non-LAST packets avoid the overhead, since they do
>      not generate a completion.
>
>      Signed-off-by: Mike Marciniszyn<mike.marciniszyn at qlogic.com>
>
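The shape of the change is roughly the following (a sketch only; the field choices and the completion hand-off are illustrative, not the actual driver code):

    #include <rdma/ib_verbs.h>

    /* Only ONLY/LAST packets complete a message, so only they build a
     * work completion, and each field is set explicitly instead of
     * memset()ing the whole struct ib_wc first.
     */
    if (last_or_only_packet) {
        struct ib_wc wc;

        wc.wr_id = qp->r_wr_id;
        wc.status = IB_WC_SUCCESS;
        wc.opcode = IB_WC_RECV;
        wc.qp = &qp->ibqp;
        wc.src_qp = qp->remote_qpn;
        wc.byte_len = byte_len;
        wc.wc_flags = 0;
        /* ...remaining fields as needed; hand &wc to the CQ code... */
    }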