[ofa-general] Re: [PATCH/RFC] IPoIB: Allocate priv->tx_ring with vmalloc()

Eli Cohen eli at dev.mellanox.co.il
Wed Mar 12 06:41:15 PDT 2008


On Tue, 2008-03-11 at 20:20 -0700, Roland Dreier wrote:
> Commit 7143740d ("IPoIB: Add send gather support") made struct
> ipoib_tx_buf significantly larger, since the mapping member changed
> from a single u64 to an array with MAX_SKB_FRAGS + 1 entries.  This
> means that allocating tx_rings with kzalloc() may fail because there
> is not enough contiguous memory for the new, much bigger size.  Fix
> this regression by allocating the rings with vmalloc() instead.
> 
> Signed-off-by: Roland Dreier <rolandd at cisco.com>
> ---
> I've also tentatively queued this up for 2.6.25, since it fixes a
> regression introduced by making the tx_ring much bigger.
> 
> While writing this patch, I noticed that we seem to waste a lot of
> memory for connected mode tx_rings, since there's no way we would ever
> use gather sends with the current code.  Does it make sense to use a
> different tx_buf structure for 2.6.26 to shrink the memory use back down?
I noticed that too, and I am not sure what the solution should be. On
the one hand, in CM mode we clear NETIF_F_SG; on the other hand, we use
an MTU of 65520 bytes, and it might not be trivial for the networking
stack to provide an SKB with such a large linear data area. Maybe we
could use a struct defined like this:

struct ipoib_tx_buf {
	struct sk_buff *skb;
	u64		mapping[0];
};

and, when allocating, allocate as much memory as needed to cover the
required size of the "mapping" array. We'll have to call kmalloc() n
times, once for each entry in the tx ring. This could also be good for
32-bit systems, where the vmalloc area is small.
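
To make the per-entry allocation idea concrete, here is a minimal
userspace sketch, not driver code. It assumes calloc()/free() as
stand-ins for kzalloc()/kfree(), uint64_t in place of u64, and a C99
flexible array member in place of mapping[0]. The names tx_buf,
tx_buf_size, tx_ring_alloc and tx_ring_free are hypothetical and do not
appear in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Opaque stand-in; in the driver this is the real struct sk_buff. */
struct sk_buff;

/* Hypothetical userspace analogue of the proposed ipoib_tx_buf. */
struct tx_buf {
	struct sk_buff *skb;
	uint64_t mapping[];	/* flexible array member, sized at allocation */
};

/* Bytes needed for one entry that holds nmaps DMA addresses. */
static size_t tx_buf_size(size_t nmaps)
{
	return sizeof(struct tx_buf) + nmaps * sizeof(uint64_t);
}

/* Free the first n entries and then the pointer array itself. */
static void tx_ring_free(struct tx_buf **ring, size_t n)
{
	size_t i;

	if (!ring)
		return;
	for (i = 0; i < n; i++)
		free(ring[i]);
	free(ring);
}

/*
 * Allocate a ring of n entries with one small zeroed allocation per
 * entry, so no single request grows with n * entry size.
 * Returns NULL on failure after releasing whatever was allocated.
 */
static struct tx_buf **tx_ring_alloc(size_t n, size_t nmaps)
{
	struct tx_buf **ring;
	size_t i;

	assert(nmaps > 0);

	ring = calloc(n, sizeof(*ring));
	if (!ring)
		return NULL;

	for (i = 0; i < n; i++) {
		ring[i] = calloc(1, tx_buf_size(nmaps));
		if (!ring[i]) {
			tx_ring_free(ring, i);
			return NULL;
		}
	}
	return ring;
}
```

Under these assumptions, a datagram-mode ring would pass
MAX_SKB_FRAGS + 1 as nmaps, while a connected-mode ring that never uses
gather sends could pass 1. The trade-off is n small allocations plus
one array of n pointers instead of a single large block, and an extra
pointer dereference on each ring access.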


> 
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> index 4e8d028..0fd9f0a 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
> @@ -38,6 +38,7 @@
>  #include <net/icmp.h>
>  #include <linux/icmpv6.h>
>  #include <linux/delay.h>
> +#include <linux/vmalloc.h>
>  
>  #include "ipoib.h"
>  
> @@ -1031,13 +1032,13 @@ static int ipoib_cm_tx_init(struct ipoib_cm_tx *p, u32 qpn,
>  	struct ipoib_dev_priv *priv = netdev_priv(p->dev);
>  	int ret;
>  
> -	p->tx_ring = kzalloc(ipoib_sendq_size * sizeof *p->tx_ring,
> -				GFP_KERNEL);
> +	p->tx_ring = vmalloc(ipoib_sendq_size * sizeof *p->tx_ring);
>  	if (!p->tx_ring) {
>  		ipoib_warn(priv, "failed to allocate tx ring\n");
>  		ret = -ENOMEM;
>  		goto err_tx;
>  	}
> +	memset(p->tx_ring, 0, ipoib_sendq_size * sizeof *p->tx_ring);
>  
>  	p->qp = ipoib_cm_create_tx_qp(p->dev, p);
>  	if (IS_ERR(p->qp)) {
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index f96477a..0658f0b 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -41,6 +41,7 @@
>  #include <linux/init.h>
>  #include <linux/slab.h>
>  #include <linux/kernel.h>
> +#include <linux/vmalloc.h>
>  
>  #include <linux/if_arp.h>	/* For ARPHRD_xxx */
>  
> @@ -887,13 +888,13 @@ int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port)
>  		goto out;
>  	}
>  
> -	priv->tx_ring = kzalloc(ipoib_sendq_size * sizeof *priv->tx_ring,
> -				GFP_KERNEL);
> +	priv->tx_ring = vmalloc(ipoib_sendq_size * sizeof *priv->tx_ring);
>  	if (!priv->tx_ring) {
>  		printk(KERN_WARNING "%s: failed to allocate TX ring (%d entries)\n",
>  		       ca->name, ipoib_sendq_size);
>  		goto out_rx_ring_cleanup;
>  	}
> +	memset(priv->tx_ring, 0, ipoib_sendq_size * sizeof *priv->tx_ring);
>  
>  	/* priv->tx_head, tx_tail & tx_outstanding are already 0 */
>  



