[ofa-general] [RFC] ipoib: avoid using stale ipoib_neigh* in ipoib_neigh_cleanup()

akepner at sgi.com
Tue May 19 14:55:05 PDT 2009


We've seen a few instances of a crash in ipoib_neigh_cleanup() caused by
use of a stale pointer: neigh is read without any locking, so another
CPU can free it before priv->lock is taken:


	neigh = *to_ipoib_neigh(n);		/* <-- read neigh (no locking) */
	.....
	spin_lock_irqsave(&priv->lock, flags);

	if (neigh->ah)				/* <-- at this point neigh may be stale */
		ah = neigh->ah;
	list_del(&neigh->list);
	ipoib_neigh_free(n->dev, neigh);

	spin_unlock_irqrestore(&priv->lock, flags);


(Mentioned this here:
http://lists.openfabrics.org/pipermail/ewg/2008-April/006459.html)

We've been using a patch that re-reads neigh after the spinlock is 
taken (roughly the sketch below). It's been effective in practice, but 
there's still a window where the stale pointer can be used.
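
A minimal sketch of that workaround (not the exact patch we carry):

	spin_lock_irqsave(&priv->lock, flags);

	neigh = *to_ipoib_neigh(n);	/* re-read, now under priv->lock */
	if (!neigh) {
		spin_unlock_irqrestore(&priv->lock, flags);
		return;
	}

	if (neigh->ah)
		ah = neigh->ah;
	list_del(&neigh->list);
	ipoib_neigh_free(n->dev, neigh);

	spin_unlock_irqrestore(&priv->lock, flags);

The window that remains: priv itself comes from the first, unlocked 
read of neigh, so the pointer can still go stale between that read 
and taking the lock.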

I've been looking into a proper fix for this, and I'd like to solicit 
any ideas. 

First thought was to use RCU, e.g., instead of to_ipoib_neigh(), use:

/* The ipoib_neigh pointer lives just past the hardware address in
 * struct neighbour, same layout as to_ipoib_neigh() uses today. */
static inline struct ipoib_neigh *ipoib_neigh_retrieve(struct neighbour *n)
{
	struct ipoib_neigh **np;

	np = (void *) n + ALIGN(offsetof(struct neighbour, ha) +
				INFINIBAND_ALEN, sizeof(void *));
	return rcu_dereference(*np);
}

static inline void ipoib_neigh_assign(struct neighbour *n,
				      struct ipoib_neigh *in)
{
	struct ipoib_neigh **np;

	np = (void *) n + ALIGN(offsetof(struct neighbour, ha) +
				INFINIBAND_ALEN, sizeof(void *));
	rcu_assign_pointer(*np, in);
}

where ipoib_neigh_retrieve() would be called under rcu_read_lock() and 
ipoib_neigh_assign() under some spinlock (ipoib_dev_priv's lock 
might be repurposed for that use).
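
A rough sketch of both sides. This assumes a struct rcu_head is added 
to struct ipoib_neigh and that ipoib_neigh_free() is converted to 
defer the actual kfree() via call_rcu(); ipoib_neigh_rcu_free() below 
is that hypothetical callback:

	/* reader */
	rcu_read_lock();
	neigh = ipoib_neigh_retrieve(n);
	if (neigh) {
		/* safe to use neigh until rcu_read_unlock(), since
		 * the free side waits out a grace period */
	}
	rcu_read_unlock();

	/* writer, e.g. on the ipoib_neigh_free() path */
	spin_lock_irqsave(&priv->lock, flags);
	ipoib_neigh_assign(n, NULL);
	spin_unlock_irqrestore(&priv->lock, flags);

	call_rcu(&neigh->rcu, ipoib_neigh_rcu_free);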

But that approach gets more complicated than seems warranted (partly 
because there's a need to promote readers to writers in a few places; 
e.g., the cleanup path looks the pointer up and then must unlink and 
free it).
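
A sketch of the recheck dance that promotion seems to require (my 
reading; the details here are guesses, not worked-out code):

	rcu_read_lock();
	neigh = ipoib_neigh_retrieve(n);
	rcu_read_unlock();
	if (!neigh)
		return;

	spin_lock_irqsave(&priv->lock, flags);
	/* revalidate under the writer lock; it may have changed */
	if (neigh != ipoib_neigh_retrieve(n)) {
		spin_unlock_irqrestore(&priv->lock, flags);
		return;
	}
	ipoib_neigh_assign(n, NULL);
	.....
	spin_unlock_irqrestore(&priv->lock, flags);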



Second thought was to use new locks to serialize access to the 
ipoib_neigh pointer stashed away in struct neighbour. Something 
like:

struct ipoib_neigh_lock {
	spinlock_t sl;
} __attribute__((__aligned__(SMP_CACHE_BYTES)));	/* one lock per cacheline */

#define IPOIB_LOCK_SHIFT	6
#define IPOIB_LOCK_SIZE		(1 << IPOIB_LOCK_SHIFT)
#define IPOIB_LOCK_MASK		(IPOIB_LOCK_SIZE - 1)

/* each sl still needs spin_lock_init() at module init time */
static struct ipoib_neigh_lock
ipoib_neigh_locks[IPOIB_LOCK_SIZE] __cacheline_aligned;

static inline void lock_ipoib_neigh(unsigned int hval)
{
	spin_lock(&ipoib_neigh_locks[hval & IPOIB_LOCK_MASK].sl);
}

static inline void unlock_ipoib_neigh(unsigned int hval)
{
	spin_unlock(&ipoib_neigh_locks[hval & IPOIB_LOCK_MASK].sl);
}

unsigned int ipoib_neigh_hval(struct neighbour *n);
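
One possibility for ipoib_neigh_hval() would be to just hash the 
neighbour pointer with hash_ptr() from linux/hash.h (a sketch; the 
exact hash doesn't much matter as long as it's stable for a given n):

unsigned int ipoib_neigh_hval(struct neighbour *n)
{
	return hash_ptr(n, IPOIB_LOCK_SHIFT);
}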

....

static void ipoib_neigh_cleanup(struct neighbour *n)
{
	.....
	unsigned int hval = ipoib_neigh_hval(n);

	lock_ipoib_neigh(hval);

	neigh = *to_ipoib_neigh(n);
	if (!neigh) {
		/* must drop the hashed lock on the early return */
		unlock_ipoib_neigh(hval);
		return;
	}
	priv = netdev_priv(neigh->dev);
	....
	spin_lock_irqsave(&priv->lock, flags);

	if (neigh->ah)
		ah = neigh->ah;
	list_del(&neigh->list);
	ipoib_neigh_free(n->dev, neigh);

	spin_unlock_irqrestore(&priv->lock, flags);

	unlock_ipoib_neigh(hval);
....
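
The assign side would take the same hashed lock before publishing the 
pointer, e.g. around where ipoib_neigh_alloc() is called (a sketch, 
glossing over where exactly this lands):

	unsigned int hval = ipoib_neigh_hval(n);

	lock_ipoib_neigh(hval);
	*to_ipoib_neigh(n) = neigh;	/* publish under the hashed lock */
	unlock_ipoib_neigh(hval);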


This seems much simpler, but maybe there are better approaches?

-- 
Arthur



