[ofa-general] Re: [PATCH V3 7/7] net/bonding: Delay sending of gratuitous ARP to avoid failure

Moni Shoua monisonlists at gmail.com
Tue Jul 31 06:33:20 PDT 2007


Jay Vosburgh wrote:
> Moni Shoua <monis at voltaire.com> wrote:
> 
>> Delay sending a gratuitous_arp when LINK_STATE_LINKWATCH_PENDING bit
>> in dev->state field is on. This improves the chances for the arp packet to
>> be transmitted.
> 
> 	Under what circumstances were you seeing problems that delaying
> the gratuitous ARP until linkwatch is done improves things?  Is this
> really an IB thing, or did you experience problems here over regular
> ethernet?
> 

I tried to figure out what is the difference in the state/flags of the device when 
grat. ARP send succeeds and when it fails. I found exact correlation with the 
LINK_STATE_LINKWATCH_PENDING bit on.
I don't think that this is an IB issue but I can't be sure. I didn't run tests
for Ethernet.

>> Signed-off-by: Moni Shoua <monis at voltaire.com>
>> ---
>> drivers/net/bonding/bond_main.c |   25 +++++++++++++++++++++----
>> drivers/net/bonding/bonding.h   |    1 +
>> 2 files changed, 22 insertions(+), 4 deletions(-)
>>
>> Index: net-2.6/drivers/net/bonding/bond_main.c
>> ===================================================================
>> --- net-2.6.orig/drivers/net/bonding/bond_main.c	2007-07-25 15:33:25.000000000 +0300
>> +++ net-2.6/drivers/net/bonding/bond_main.c	2007-07-26 18:42:59.296296622 +0300
>> @@ -1134,8 +1134,13 @@ void bond_change_active_slave(struct bon
>> 		if (new_active && !bond->do_set_mac_addr)
>> 			memcpy(bond->dev->dev_addr,  new_active->dev->dev_addr,
>> 				new_active->dev->addr_len);
>> -
>> -		bond_send_gratuitous_arp(bond);
>> +		if (bond->curr_active_slave &&
>> +			test_bit(__LINK_STATE_LINKWATCH_PENDING, &bond->curr_active_slave->dev->state)){
>> +			dprintk("delaying gratuitous arp on %s\n",bond->curr_active_slave->dev->name);
>> +			bond->send_grat_arp=1;
>> +		}else{
>> +			bond_send_gratuitous_arp(bond);
>> +		}
> 
> 	Style issues throughout the patch series: many lines are too
> long, many things are all smashed together, e.g., "}else{" instead of
> "} else {", "send_grat_arp=1" instead of "send_grat_arp = 1", and so on.
> 
OK thanks. I'll fix and repost.
>> 	}
>> }
>>
>> @@ -2120,6 +2125,15 @@ void bond_mii_monitor(struct net_device 
>> 	 * program could monitor the link itself if needed.
>> 	 */
>>
>> +	if (bond->send_grat_arp) {
>> +		if (bond->curr_active_slave && test_bit(__LINK_STATE_LINKWATCH_PENDING, &bond->curr_active_slave->dev->state))
>> +			dprintk("Needs to send gratuitous arp but not yet\n",__FUNCTION__);
>> +		else {
>> +			dprintk("sending delayed gratuitous arp on ond->curr_active_slave->dev->name\n");
>> +			bond_send_gratuitous_arp(bond);
>> +			bond->send_grat_arp=0;
>> +		}
>> +	}
> 
> 
>> 	read_lock(&bond->curr_slave_lock);
>> 	oldcurrent = bond->curr_active_slave;
>> 	read_unlock(&bond->curr_slave_lock);
>> @@ -2513,6 +2527,7 @@ static void bond_send_gratuitous_arp(str
>> 	struct slave *slave = bond->curr_active_slave;
>> 	struct vlan_entry *vlan;
>> 	struct net_device *vlan_dev;
>> +	int i;
>>
>> 	dprintk("bond_send_grat_arp: bond %s slave %s\n", bond->dev->name,
>> 				slave ? slave->dev->name : "NULL");
>> @@ -2520,8 +2535,9 @@ static void bond_send_gratuitous_arp(str
>> 		return;
>>
>> 	if (bond->master_ip) {
>> -		bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip,
>> -				  bond->master_ip, 0);
>> +		for (i=0;i<3;i++)
>> +			bond_arp_send(slave->dev, ARPOP_REPLY, bond->master_ip,
>> +					  bond->master_ip, 0);
>> 	}
> 
> 	If you delay the grat ARP until linkwatch is done, why is it
> also necessary to shotgun several ARPs instead of one?  Why are the ARPs
> sent for VLANs not also shotgunned in a similar fashion?
Besides the linkwatch issue I also noticed that on rare occasions, grat. ARPs
that found their way to the slave's xmit function were not xmitted.
The 3 times send is just an another attempt to improve chances.

I'd like to emphasize here that with IB slaves, grat. ARP is much more crucial to 
a successful change of slaves and that was my focus.

> 	If shotgunning like this really is useful, would it not make
> more sense to space them out a bit, e.g., over successive monitor
> passes?
> 
I guess you are right about that.
>> 	list_for_each_entry(vlan, &bond->vlan_list, vlan_list) {
>> @@ -4331,6 +4347,7 @@ static int bond_init(struct net_device *
>> 	bond->current_arp_slave = NULL;
>> 	bond->primary_slave = NULL;
>> 	bond->dev = bond_dev;
>> +	bond->send_grat_arp=0;
>> 	INIT_LIST_HEAD(&bond->vlan_list);
>>
>> 	/* Initialize the device entry points */
>> Index: net-2.6/drivers/net/bonding/bonding.h
>> ===================================================================
>> --- net-2.6.orig/drivers/net/bonding/bonding.h	2007-07-25 15:20:10.000000000 +0300
>> +++ net-2.6/drivers/net/bonding/bonding.h	2007-07-26 18:42:43.652087660 +0300
>> @@ -203,6 +203,7 @@ struct bonding {
>> 	struct   vlan_group *vlgrp;
>> 	struct   packet_type arp_mon_pt;
>> 	s8       do_set_mac_addr;
>> +	int	 send_grat_arp;
> 
> 	This need not be a full int, and (this applies to
> do_set_mac_addr, also) could probably be squeezed into gaps already
> existing within the struct bonding somewhere.
Thanks. Will be fixed.
> 
> 	-J
> 
> ---
> 	-Jay Vosburgh, IBM Linux Technology Center, fubar at us.ibm.com
> _______________________________________________
> general mailing list
> general at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
> 
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
> 





More information about the general mailing list