[openib-general] Re: [PATCH] sdp: replace ip_dev_find with dev_base scan (was Re: ip_dev_find resolution?)

Michael S. Tsirkin mst at mellanox.co.il
Mon Dec 12 21:06:57 PST 2005


Quoting r. Sean Hefty <mshefty at ichips.intel.com>:
> Subject: Re: [openib-general] Re: [PATCH] sdp: replace ip_dev_find with dev_base scan (was Re: ip_dev_find resolution?)
> 
> Michael S. Tsirkin wrote:
> >>>But now we only do the list walk if the ip route resolution returns
> >>>a loopback device, so we are ok, right?
> >>
> >>That's the way it seems to me (at least for SDP but not CMA (addr)).
> > 
> > OK, Sean, do you want to cook up a patch for CMA based on this code of mine?
> > I could do it too but I dont have a way to test cma yet.
> 
> I need to read back over this thread; I didn't follow it the first time.  (At 
> this point, I think that it makes more sense to convert SDP to use the CMA or 
> ib_addr, rather than duplicating their functionality.)

Sure, this was just a test to show how to get rid of ip_dev_find.
I didnt follow CMA recently - does CMA already support SDP private format?
If yes, I need to work on moving SDP to use it.

> There are both kernel and userspace test programs (cmatose) for the CMA checked 
> into the tree.
> 
> ip_addr can use a different approach than ip_dev_find, but I think it makes 
> sense to use existing kernel functionality wherever possible.  If ip_dev_find 
> cannot be modified to support IPv6 addresses,

That's my assumption.

> Then how about adding a API that 
> is similar, but takes a pointer to an address, along with an indication of the 
> address family that's used?
> 
> I can work on a patch for this.
> 
> - Sean

ip_dev_find isnt ideal for our purposes either.

As a reminder, what we are trying to do is handle the loopback case,
where IP route resolution simply gives us back the loopback device, but
we want to use the IB loopback: either external if source/destination
are specified, or external if not.

Take a look at tryaddrmatch in the SDP patch I've sent (reposting here
for convenience), I think it might be a good start functionality-wise.

Here's what it does:

+static int tryaddrmatch(struct net_device *dev, u32 s_addr, u32 d_addr)
+{
+	struct in_ifaddr **ifap;
+	struct in_ifaddr *ifa;
+	struct in_device *in_dev;
+	int rc = -ENETUNREACH;
+	__be32 addr;
+
+	if (dev->type != ARPHRD_INFINIBAND)
+		return rc;

Look for net device of a given hardware type only.

+
+	in_dev = in_dev_get(dev);
+	if (!in_dev)
+		return rc;

And that supports the given address family.

+
+	addr = (ZERONET(s_addr) || LOOPBACK(s_addr)) ? d_addr : s_addr;

If source address is for a loopback device, select by destination address.

+
+	/* Hack to enable using SDP on addresses such as 127.0.0.1 */
+	if (ZERONET(addr) || LOOPBACK(addr)) {

If destination address is for loopback as well, select any device
of appropriate type that is up.

+		rc = (dev->flags & IFF_UP) ? 0 : -ENETUNREACH;
+		goto done;
+	}
+
+	for (ifap = &in_dev->ifa_list; (ifa = *ifap); ifap = &ifa->ifa_next) {
+		if (s_addr == ifa->ifa_address) {

Otherwise, look for device with the appropriate IP address.

+			rc = 0;
+			break; /* found */
+		}
+	}
+
+done:
+	in_dev_put(in_dev);
+	return rc;
+}
+
 /*
  * do_link_path_lookup - resolve an ip address to a path record
  */
@@ -406,17 +441,9 @@ static void do_link_path_lookup(struct s
 		     rt->u.dst.neighbour->dev->name,
 		     rt->rt_src, rt->rt_dst, rt->rt_gateway,
 		     rt->u.dst.neighbour->nud_state);
-	/*
-	 * device needs to be a valid IB device. Check for loopback.
-	 * In case of loopback find a valid IB device on which to
-	 * direct the loopback traffic.
-	 */
-	if (rt->u.dst.neighbour->dev->flags & IFF_LOOPBACK)
-		dev = ip_dev_find(rt->rt_src);
-	else {
-		dev = rt->u.dst.neighbour->dev;
-		dev_hold(dev);
-	}
+
+	dev = rt->u.dst.neighbour->dev;
+	dev_hold(dev);
 
 	/*
 	 * check for IB device or loopback, the later requires extra
@@ -433,13 +460,11 @@ static void do_link_path_lookup(struct s
 	if (dev->flags & IFF_LOOPBACK) {
 		dev_put(dev);
 		read_lock(&dev_base_lock);
-		for (dev = dev_base; dev; dev = dev->next) {
-			if (dev->type == ARPHRD_INFINIBAND &&
-			    (dev->flags & IFF_UP)) {
+		for (dev = dev_base; dev; dev = dev->next)
+			if (!tryaddrmatch(dev, rt->rt_src, rt->rt_dst)) {
 				dev_hold(dev);
 				break;
 			}
-		}
 		read_unlock(&dev_base_lock);
 	}
 


-- 
MST



More information about the general mailing list