[ofa-general] [PATCH 2/3] RDMA/cxgb3: fix page shift calculation in build_phys_page_list()

Steve Wise swise at opengridcomputing.com
Mon Jan 28 08:55:00 PST 2008


Bryan S Rosenburg wrote:
> 
> Roland Dreier <rdreier at cisco.com> wrote on 01/27/2008 04:33:33 PM:
>  > got it... I was tricking myself that the check for alignment at the
>  > start of the function was sufficient, but it's not once we start using
>  > a bigger value of shift.
>  >
>  > I think the patch below should be a fix for the problem, although I've
>  > only compile tested it.  The idea is to stop increasing shift once it
>  > reaches a bit position where the first buffer and the iova differ.
>  > What do you think?  If this works for you, I will merge it for 2.6.25.
> 
> I've tested your fix with a two-element buffer list and various 
> combinations of virtual and physical alignments.  It works as intended, 
> allowing a larger page size (and corresponding expansion of the physical 
> region) if and only if the virtual and physical addresses are 
> sufficiently compatible.
> 
> I think you can simplify the code somewhat if you incorporate the 
> (virtual ^ physical) alignment characterization into the mask.  Here's 
> an alternative patch.  The initial alignment check gets subsumed into 
> the later mask alignment check.
> 
> ================================================================================ 
> 
> 
> diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c 
> b/drivers/infiniband/hw/mthca/mthca_provider.c
> index 6bcde1c..4e04b4f 100644
> --- a/drivers/infiniband/hw/mthca/mthca_provider.c
> +++ b/drivers/infiniband/hw/mthca/mthca_provider.c
> @@ -929,11 +929,7 @@ static struct ib_mr *mthca_reg_phys_mr(struct ib_pd 
>       *pd,
>         int err;
>         int i, j, n;
> 
> -        /* First check that we have enough alignment */
> -        if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & 
> ~PAGE_MASK))
> -                return ERR_PTR(-EINVAL);
> -
> -        mask = 0;
> +        mask = buffer_list[0].addr ^ *iova_start;
>         total_size = 0;
>         for (i = 0; i < num_phys_buf; ++i) {
>                 if (i != 0)
> @@ -948,16 +944,16 @@ static struct ib_mr *mthca_reg_phys_mr(struct 
> ib_pd       *pd,
>                 return ERR_PTR(-EINVAL);
> 
>         /* Find largest page shift we can use to cover buffers */
> -        for (shift = PAGE_SHIFT; shift < 31; ++shift)
> -                if (num_phys_buf > 1) {
> -                        if ((1ULL << shift) & mask)
> -                                break;
> -                } else {
> +        for (shift = PAGE_SHIFT; shift < 31; ++shift) {
> +                if ((1ULL << shift) & mask)
> +                        break;
> +                if (num_phys_buf == 1) {
>                         if (1ULL << shift >=
>                             buffer_list[0].size +
>                             (buffer_list[0].addr & ((1ULL << shift) - 1)))
>                                 break;
>                 }
> +        }
> 
>         buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1);
>         buffer_list[0].addr &= ~0ull << shift;
> ================================================================================ 
> 
> 
> I'm still annoyed by the (num_phys_buf == 1) special case.  I'm 
> wondering if it's still needed.  If you leave out that if-statement 
> entirely, you may end up using a page size that is larger (maybe much 
> larger) than necessary, but I think things will still work, given that 
> the virtual-to-physical alignment constraints are respected.  If you 
> remove the special case, you can replace the whole loop with an ffs() call.
> 
> Anyway, your patch works fine.  Use my suggestion only if you like it.
> 
> - Bryan
> 

So is cxgb3 still busted?  IE I still need that additional patch you 
sent?  Perhaps I can align cxgb3 and mthca to the same logic.  Maybe 
create a core helper function...

Steve.




More information about the general mailing list