[openib-general] [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour

Thu Feb 8 07:00:07 PST 2007

Michael S. Tsirkin wrote:
>> Quoting Moni Shoua <monis at voltaire.com>:
>> Subject: Re: [PATCH] IB/ipoib get net_device from ipoib_neigh instead of linux neighbour
>>
>>
>>> Another concern: assume that one device goes away (e.g. hotplug).
>>> It seems that neighbours whose dev field point to another device, will not be destroyed.
>>> Correct?
>> I agree.
>>
>>> Therefore in your design, it seems that to_ipoib_neigh()->dev
>>> will get us a pointer to device that has been removed already.
>>>
>> I agree that this is a problem.
> 
> I think we can solve this if we track all ipoib neighbours, like we do for old kernels,
> and then flush ipoib neighbours on any hotplug event.
> Roland, does this sound too awful?
> 
>> It think it would be best to prevent an IPoIB device
>> from disappearing or from ib_ipoib from being unloaded as long as IPoIB
>> device is a slave. Unfortunately, I don't see how this can be done just
>> by fixing something in bonding or IPoIB. 
> 
> So hotplug is blocked potentially forever?
> This does not sound good.
OK, so I'm dropping this thought.
> 
>> However, any slave knows he has a master (dev->master). 
>> What do you think about a solution where IPoIB first tries to clean up the
>> neighbours that belong to it's master before deleting the IPoIB device?
> 
> How?
Let me know what do you think about that. I hope this makes sense.
in IPoIB, before calling unregister_netdev do
	for each kernel neighbour n
		if  n->dev == ib_dev->master
			delete n

Michael, as I see it we have to deal with 2 cases.
1. IPoIB device is deleted (unregister_netdev) - IPoIB netdev in not in the kernel's address space.
	we have to make sure that no one holds a pointer to it after it is deleted.
2 ib_ipoib module is unloaded (modprobe -r) - the ipoib_neigh_destructor is not in the kernel's address space.
	we have to make sure no one calls to it after the module is unloaded.
I think that if nothing prevents the execution of the "code" above it serves both cases.
Do you see any problem with that?
Do I have to maintain my own list of neighbours or use the kernel's arp table for that?

I am trying to study the neighbour cleanup function and do something like that but
I would be happy to learn from others as well.

>>>> Furthermore, bond_setup_by_slave is called only for non
>>>> Ethernet devices (we consider to change the logic to "called only for
>>>> IPoIB devices just for safety).
>>> Why is this necessary, BTW?
>>>
>> If we don't do that, we get a memory leak because the neigh destructor will
>> never be called for non IPoIB devices although they carry ipoib_neigh
>> with them.
> 
> How can this happen? If it does, I think we are back to where we started:
> to_ipoib_neigh is broken for non-IPoIB device.
> I thought you said only devices of the same type can be paired?
> 
> 
The scenario is:
1. kernel allocates a neighbour structure for bond0, puts it on a skb and passed it to bond xmit function.
2. bond0 passes the skb to ipoib
3. ipoib allocates ipoib_neigh and hangs it on linux neighbour. 
4. a while after that, the kernel wants to destroy the neighbour (cleanup) but 
doesn't call ipoib_neigh_destructor because it the neigh setup registered the destructor for ibX device.