[openib-general] [PATCH] fmr support in mthca

Michael S. Tsirkin mst at mellanox.co.il
Fri Mar 18 00:44:55 PST 2005


Quoting r. Roland Dreier <roland at topspin.com>:
> Subject: Re: [openib-general] [PATCH] fmr support in mthca
> 
> Thanks for implementing this.  You saved me a lot of work and Libor
> will be very happy when he gets back next week.

Good, glad to help. I will try to address your comments next week
(it's already the weekend here).

> Some comments from my first read through:
> 
>  > This patch implements FMR support. I also rolled into it two fixes
>  > for regular mrs that I posted previously, let me know if it's a problem.
> 
> No problem although I'll apply them separately.
> 
>  > This seems to be working fine for me, although I only did relatively basic
>  > tests. Both Tavor and Arbel Native modes are supported. I made some tradeoffs
>  > for simplicity, let me know what you think:
>  > - for tavor, I keep two pages mapped for each fmr: one for mpt and one for
>  >   mtt access. This spends more kernel virtual memory than necessary,
>  >   since many mpts could share one page. Alternatives are:
>  >   map/unmap io memory on each fmr map/unmap request, or
>  >   keep an intermediate table to map each page only once.
> 
> I don't think this is acceptable.  Each ioremap has to map at least
> one page plus a guard page.  With two ioremaps per FMR, every FMR is
> using 16K (or more) of vmalloc space.  On 64 bit archs, this doesn't
> matter, but on a large memory i386 machine, there's less than 128 MB
> of vmalloc space available (possibly a lot less if someone is using a
> video card with a big frame buffer or something).  That means we're
> limited to a few thousand FMRs, which isn't enough.
> 
> What if we just reserve something like 64K MPTs and MTTs for FMRs and
> ioremap everything at driver startup?  That would only use a few MB of
> vmalloc space and probably simplify the code too.

I don't like these pre-allocations: if someone is only using SDP and IP
over IB, it seems he will need almost no regular regions.
And 64K MTTs with a 4K page size cover at most 256 MByte of memory.
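
As a rough check of the numbers in this exchange, here is a small user-space
C sketch (not driver code). The 4K page size, the one-guard-page cost per
ioremap and the 128 MB vmalloc ceiling are assumptions taken from the
discussion above; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions from this thread, not values read from the driver. */
#define PAGE_SIZE_BYTES 4096ULL          /* 4K pages */
#define VMALLOC_BYTES   (128ULL << 20)   /* ceiling quoted for large-memory i386 */

/* vmalloc space one ioremap consumes: the mapped pages plus one guard page. */
static uint64_t ioremap_cost(uint64_t pages)
{
	return (pages + 1) * PAGE_SIZE_BYTES;
}

/* Upper bound on FMRs when each FMR holds maps_per_fmr one-page ioremaps. */
static uint64_t max_fmrs(uint64_t vmalloc_bytes, uint64_t maps_per_fmr)
{
	uint64_t per_fmr = maps_per_fmr * ioremap_cost(1);

	assert(per_fmr > 0);
	return vmalloc_bytes / per_fmr;
}

/* Memory covered by a number of MTT entries, one page per entry. */
static uint64_t mtt_coverage(uint64_t entries)
{
	return entries * PAGE_SIZE_BYTES;
}
```

Under these assumptions, two one-page ioremaps per FMR cost 16K of vmalloc
space, so 128 MB allows at most 8192 FMRs even with nothing else using
vmalloc, and 64K MTT entries cover 256 MByte.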

My other problem with this approach was implementation: the existing allocator
and table code can be passed a reserved parameter, but don't have the ability
to allocate out of that reserved pool. So we'd have to allocate out of a
separate allocator, and take care that keys do not conflict. This gets a bit
complicated.

Maybe do something separate for 32 bit kernels (like - disable FMR
support)?

>  > - icm that has the mpts/mtts is linearly scanned and this is repeated
>  >   for each mtt on each fmr map. This may be improved somewhat
>  >   with some kind of an iterator, but to really speed things up
>  >   the icm data structure (list of arrays) would have to
>  >   be replaced by some kind of tree.
> 
> I don't understand this.  I'm probably missing something but the
> addresses don't change after we allocate the FMR, right?  It seems we
> could just store the MPT/MTT address in the FMR structure the same way
> we do for Tavor mode.

Yes, but for mtts the addresses may not be physically contiguous,
unless we want to limit FMRs to PAGE_SIZE/8 MTTs, which means
512 MTTs, that is 2 MByte with a 4K FMR page size.
And it seems possible that even with this limitation the MTTs for a
specific FMR start at a non-page-aligned boundary.

So we'd need an array of pages per FMR, unlike Tavor.
Do you think it's a good idea?
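
To make the page-boundary point concrete, here is a sketch of how many ICM
pages the MTTs of one FMR can touch. It assumes 8-byte MTT entries and 4K
ICM pages, as in the paragraph above; mtt_pages_spanned() is a hypothetical
helper, not an mthca function.

```c
#include <assert.h>
#include <stdint.h>

#define ICM_PAGE_SIZE  4096u                              /* assumed 4K ICM page */
#define MTT_ENTRY_SIZE 8u                                 /* one 64-bit MTT entry */
#define MTTS_PER_PAGE  (ICM_PAGE_SIZE / MTT_ENTRY_SIZE)   /* 512 */

/* Number of ICM pages touched by count MTT entries starting at index first. */
static uint32_t mtt_pages_spanned(uint32_t first, uint32_t count)
{
	uint32_t first_page, last_page;

	assert(count > 0);
	first_page = first / MTTS_PER_PAGE;
	last_page = (first + count - 1) / MTTS_PER_PAGE;
	return last_page - first_page + 1;
}
```

With these numbers a 512-entry FMR starting at MTT index 0 fits in one page,
but the same FMR starting at index 256 touches two, which is why a per-FMR
array of page pointers would be needed.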

> Some more nitpicky comments below...
> 
>  > +/* Nonblocking. Callers must make sure the object exists by serializing against
>  > + * callers of get/put. */
>  > +void *mthca_table_find(struct mthca_dev *dev, struct mthca_icm_table *table,
>  > +		       int obj);
> 
> Can we just make this use the table mutex and only call it when
> allocating an FMR?

See above. But the restriction doesn't matter much for FMRs,
because the icm ref count is incremented when the FMR is created,
so they satisfy this constraint.

Other comments need to be addressed. I'll start working on them
when I am back on Sunday.

-- 
MST - Michael S. Tsirkin

From: Hal Rosenstock <halr at voltaire.com>
To: openib-general at openib.org
Message-Id: <1111152373.4662.6585.camel at localhost.localdomain>
Date: 18 Mar 2005 08:26:14 -0500
Subject: [openib-general] [PATCH] ping Add IB ping server agent

ping: Add IB ping server agent as a separate module
(used with ibping diagnostic tool)

Signed-off-by: Shahar Frank <shaharf at voltaire.com>
Signed-off-by: Hal Rosenstock <halr at voltaire.com>

Index: ping.h
===================================================================
--- ping.h	(revision 0)
+++ ping.h	(revision 0)
@@ -0,0 +1,49 @@
+/*
+ * Copyright (c) 2004, 2005 Mellanox Technologies Ltd.  All rights reserved.
+ * Copyright (c) 2004, 2005 Infinicon Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Topspin Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Voltaire Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * $Id$
+ */
+
+#ifndef __PING_H_
+#define __PING_H_
+
+extern spinlock_t ib_ping_port_list_lock;
+
+extern int ib_ping_port_open(struct ib_device *device,
+			     int port_num);
+
+extern int ib_ping_port_close(struct ib_device *device, int port_num);
+
+#endif	/* __PING_H_ */
Index: ping_priv.h
===================================================================
--- ping_priv.h	(revision 0)
+++ ping_priv.h	(revision 0)
@@ -0,0 +1,61 @@
+/*
+ * Copyright (c) 2004, 2005 Mellanox Technologies Ltd.  All rights reserved.
+ * Copyright (c) 2004, 2005 Infinicon Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Topspin Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Voltaire Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * $Id$
+ */
+
+#ifndef __IB_PING_PRIV_H__
+#define __IB_PING_PRIV_H__
+
+#include <linux/pci.h>
+
+#define SPFX "ib_ping: "
+
+struct ib_ping_send_wr {
+	struct list_head send_list;
+	struct ib_ah *ah;
+	struct ib_mad_private *mad;
+	DECLARE_PCI_UNMAP_ADDR(mapping)
+};
+
+struct ib_ping_port_private {
+	struct list_head port_list;
+	struct list_head send_posted_list;
+	spinlock_t send_list_lock;
+	int port_num;
+	struct ib_mad_agent *pingd_agent;     /* OpenIB Ping class */
+};
+
+#endif	/* __IB_PING_PRIV_H__ */
Index: ping.c
===================================================================
--- ping.c	(revision 0)
+++ ping.c	(revision 0)
@@ -0,0 +1,425 @@
+/*
+ * Copyright (c) 2004, 2005 Mellanox Technologies Ltd.  All rights reserved.
+ * Copyright (c) 2004, 2005 Infinicon Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Topspin Corporation.  All rights reserved.
+ * Copyright (c) 2004, 2005 Voltaire Corporation.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * $Id$
+ */
+
+#include <linux/dma-mapping.h>
+#include <linux/utsname.h>
+#include <asm/bug.h>
+
+#include "ping_priv.h"
+#include "mad_priv.h"
+#include "ping.h"
+
+spinlock_t ib_ping_port_list_lock;
+static LIST_HEAD(ib_ping_port_list);
+
+/*
+ * Caller must hold ib_ping_port_list_lock
+ */
+static inline struct ib_ping_port_private *
+__ib_get_ping_port(struct ib_device *device, int port_num,
+		   struct ib_mad_agent *mad_agent)
+{
+	struct ib_ping_port_private *entry;
+
+	BUG_ON(!(!!device ^ !!mad_agent));  /* Exactly one MUST be (!NULL) */
+
+	if (device) {
+		list_for_each_entry(entry, &ib_ping_port_list, port_list) {
+			if (entry->pingd_agent->device == device &&
+			    entry->port_num == port_num)
+				return entry;
+		}
+	} else {
+		list_for_each_entry(entry, &ib_ping_port_list, port_list) {
+			if (entry->pingd_agent == mad_agent)
+				return entry;
+		}
+	}
+	return NULL;
+}
+
+static inline struct ib_ping_port_private *
+ib_get_ping_port(struct ib_device *device, int port_num,
+		 struct ib_mad_agent *mad_agent)
+{
+	struct ib_ping_port_private *entry;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ib_ping_port_list_lock, flags);
+	entry = __ib_get_ping_port(device, port_num, mad_agent);
+	spin_unlock_irqrestore(&ib_ping_port_list_lock, flags);
+
+	return entry;
+}
+
+static int ping_mad_send(struct ib_mad_agent *mad_agent,
+			 struct ib_ping_port_private *port_priv,
+			 struct ib_mad_private *mad_priv,
+			 struct ib_grh *grh,
+			 struct ib_wc *wc)
+{
+	struct ib_ping_send_wr *ping_send_wr;
+	struct ib_sge gather_list;
+	struct ib_send_wr send_wr;
+	struct ib_send_wr *bad_send_wr;
+	struct ib_ah_attr ah_attr;
+	unsigned long flags;
+	int ret = 1;
+
+	ping_send_wr = kmalloc(sizeof(*ping_send_wr), GFP_KERNEL);
+	if (!ping_send_wr)
+		goto out;
+	ping_send_wr->mad = mad_priv;
+
+	/* PCI mapping */
+	gather_list.addr = dma_map_single(mad_agent->device->dma_device,
+					  &mad_priv->mad,
+					  sizeof(mad_priv->mad),
+					  DMA_TO_DEVICE);
+	gather_list.length = sizeof(mad_priv->mad);
+	gather_list.lkey = mad_agent->mr->lkey;
+
+	send_wr.next = NULL;
+	send_wr.opcode = IB_WR_SEND;
+	send_wr.sg_list = &gather_list;
+	send_wr.num_sge = 1;
+	send_wr.wr.ud.remote_qpn = wc->src_qp; /* DQPN */
+	send_wr.wr.ud.timeout_ms = 0;
+	send_wr.send_flags = IB_SEND_SIGNALED | IB_SEND_SOLICITED;
+
+	ah_attr.dlid = wc->slid;
+	ah_attr.port_num = mad_agent->port_num;
+	ah_attr.src_path_bits = wc->dlid_path_bits;
+	ah_attr.sl = wc->sl;
+	ah_attr.static_rate = 0;
+	ah_attr.ah_flags = 0; /* No GRH */
+	if (mad_priv->mad.mad.mad_hdr.mgmt_class == IB_MGMT_CLASS_OPENIB_PING) {
+		if (wc->wc_flags & IB_WC_GRH) {
+			ah_attr.ah_flags = IB_AH_GRH;
+			/* Should sgid be looked up ? */
+			ah_attr.grh.sgid_index = 0;
+			ah_attr.grh.hop_limit = grh->hop_limit;
+			ah_attr.grh.flow_label = be32_to_cpup(
+				&grh->version_tclass_flow)  & 0xfffff;
+			ah_attr.grh.traffic_class = (be32_to_cpup(
+				&grh->version_tclass_flow) >> 20) & 0xff;
+			memcpy(ah_attr.grh.dgid.raw,
+			       grh->sgid.raw,
+			       sizeof(ah_attr.grh.dgid));
+		}
+	} else {
+		printk(KERN_ERR SPFX "Not OpenIB ping class 0x%x\n",
+		       mad_priv->mad.mad.mad_hdr.mgmt_class);
+		kfree(ping_send_wr);
+		goto out;
+	}
+
+	ping_send_wr->ah = ib_create_ah(mad_agent->qp->pd, &ah_attr);
+	if (IS_ERR(ping_send_wr->ah)) {
+		printk(KERN_ERR SPFX "No memory for address handle\n");
+		kfree(ping_send_wr);
+		goto out;
+	}
+
+	send_wr.wr.ud.ah = ping_send_wr->ah;
+	send_wr.wr.ud.pkey_index = wc->pkey_index;
+	send_wr.wr.ud.remote_qkey = IB_QP1_QKEY;
+	send_wr.wr.ud.mad_hdr = &mad_priv->mad.mad.mad_hdr;
+	send_wr.wr_id = (unsigned long)ping_send_wr;
+
+	pci_unmap_addr_set(ping_send_wr, mapping, gather_list.addr);
+
+	/* Send */
+	spin_lock_irqsave(&port_priv->send_list_lock, flags);
+	if (ib_post_send_mad(mad_agent, &send_wr, &bad_send_wr)) {
+		spin_unlock_irqrestore(&port_priv->send_list_lock, flags);
+		dma_unmap_single(mad_agent->device->dma_device,
+				 pci_unmap_addr(ping_send_wr, mapping),
+				 sizeof(mad_priv->mad),
+				 DMA_TO_DEVICE);
+		ib_destroy_ah(ping_send_wr->ah);
+		kfree(ping_send_wr);
+	} else {
+		list_add_tail(&ping_send_wr->send_list,
+			      &port_priv->send_posted_list);
+		spin_unlock_irqrestore(&port_priv->send_list_lock, flags);
+		ret = 0;
+	}
+
+out:
+	return ret;
+}
+
+static void pingd_recv_handler(struct ib_mad_agent *mad_agent,
+			       struct ib_mad_recv_wc *mad_recv_wc)
+{
+	struct ib_ping_port_private	*port_priv;
+	struct ib_vendor_mad	*vend;
+	struct ib_mad_private *recv = container_of(mad_recv_wc,
+					struct ib_mad_private,
+					header.recv_wc);
+
+	/* Find matching MAD agent */
+	port_priv = ib_get_ping_port(NULL, 0, mad_agent);
+	if (!port_priv) {
+		kmem_cache_free(ib_mad_cache, recv);
+		printk(KERN_ERR SPFX "pingd_recv_handler: no matching MAD "
+		       "agent %p\n", mad_agent);
+		return;
+	}
+
+	vend = (struct ib_vendor_mad *)mad_recv_wc->recv_buf.mad;
+
+	vend->mad_hdr.method |= IB_MGMT_METHOD_RESP;
+	vend->mad_hdr.status = 0;
+	if (!system_utsname.domainname[0])
+		strncpy(vend->data, system_utsname.nodename, sizeof vend->data);
+	else
+		snprintf(vend->data, sizeof vend->data, "%s.%s",
+			system_utsname.nodename, system_utsname.domainname);
+
+	/* Send response */
+	if (ping_mad_send(mad_agent, port_priv, recv,
+			  mad_recv_wc->recv_buf.grh, mad_recv_wc->wc)) {
+		kmem_cache_free(ib_mad_cache, recv);
+		printk(KERN_ERR SPFX "pingd_recv_handler: reply failed\n");
+	}
+}
+
+static void pingd_send_handler(struct ib_mad_agent *mad_agent,
+			       struct ib_mad_send_wc *mad_send_wc)
+{
+	struct ib_ping_port_private	*port_priv;
+	struct ib_ping_send_wr		*ping_send_wr;
+	unsigned long			flags;
+
+	/* Find matching MAD agent */
+	port_priv = ib_get_ping_port(NULL, 0, mad_agent);
+	if (!port_priv) {
+		printk(KERN_ERR SPFX "pingd_send_handler: no matching MAD "
+		       "agent %p\n", mad_agent);
+		return;
+	}
+
+	ping_send_wr = (struct ib_ping_send_wr *)(unsigned long)mad_send_wc->wr_id;
+	spin_lock_irqsave(&port_priv->send_list_lock, flags);
+	/* Remove completed send from posted send MAD list */
+	list_del(&ping_send_wr->send_list);
+	spin_unlock_irqrestore(&port_priv->send_list_lock, flags);
+
+	/* Unmap PCI */
+	dma_unmap_single(mad_agent->device->dma_device,
+			 pci_unmap_addr(ping_send_wr, mapping),
+			 sizeof(ping_send_wr->mad->mad),
+			 DMA_TO_DEVICE);
+
+	ib_destroy_ah(ping_send_wr->ah);
+
+	/* Release allocated memory */
+	kmem_cache_free(ib_mad_cache, ping_send_wr->mad);
+	kfree(ping_send_wr);
+}
+
+int ib_ping_port_open(struct ib_device *device, int port_num)
+{
+	int ret;
+	struct ib_ping_port_private *port_priv;
+	struct ib_mad_reg_req pingd_reg_req;
+	unsigned long flags;
+
+	/* First, check if port already open */
+	port_priv = ib_get_ping_port(device, port_num, NULL);
+	if (port_priv) {
+		printk(KERN_DEBUG SPFX "%s port %d already open\n",
+		       device->name, port_num);
+		return 0;
+	}
+
+	/* Create new device info */
+	port_priv = kmalloc(sizeof *port_priv, GFP_KERNEL);
+	if (!port_priv) {
+		printk(KERN_ERR SPFX "No memory for ib_ping_port_private\n");
+		ret = -ENOMEM;
+		goto error1;
+	}
+
+	memset(port_priv, 0, sizeof *port_priv);
+	port_priv->port_num = port_num;
+	spin_lock_init(&port_priv->send_list_lock);
+	INIT_LIST_HEAD(&port_priv->send_posted_list);
+
+	pingd_reg_req.mgmt_class = IB_MGMT_CLASS_OPENIB_PING;
+	pingd_reg_req.mgmt_class_version = 1;
+	pingd_reg_req.oui[0] = (IB_OPENIB_OUI >> 16) & 0xff;
+	pingd_reg_req.oui[1] = (IB_OPENIB_OUI >> 8) & 0xff;
+	pingd_reg_req.oui[2] = IB_OPENIB_OUI & 0xff;
+	set_bit(IB_MGMT_METHOD_GET, pingd_reg_req.method_mask);
+
+	/* Obtain server MAD agent for OpenIB Ping class (GSI QP) */
+	port_priv->pingd_agent = ib_register_mad_agent(device, port_num,
+						       IB_QPT_GSI,
+						      &pingd_reg_req, 0,
+						      &pingd_send_handler,
+						      &pingd_recv_handler,
+						       NULL);
+	if (IS_ERR(port_priv->pingd_agent)) {
+		ret = PTR_ERR(port_priv->pingd_agent);
+		goto error2;
+	}
+
+	spin_lock_irqsave(&ib_ping_port_list_lock, flags);
+	list_add_tail(&port_priv->port_list, &ib_ping_port_list);
+	spin_unlock_irqrestore(&ib_ping_port_list_lock, flags);
+
+	return 0;
+
+error2:
+	kfree(port_priv);
+error1:
+	return ret;
+}
+
+int ib_ping_port_close(struct ib_device *device, int port_num)
+{
+	struct ib_ping_port_private *port_priv;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ib_ping_port_list_lock, flags);
+	port_priv = __ib_get_ping_port(device, port_num, NULL);
+	if (port_priv == NULL) {
+		spin_unlock_irqrestore(&ib_ping_port_list_lock, flags);
+		printk(KERN_ERR SPFX "Port %d not found\n", port_num);
+		return -ENODEV;
+	}
+	list_del(&port_priv->port_list);
+	spin_unlock_irqrestore(&ib_ping_port_list_lock, flags);
+
+	ib_unregister_mad_agent(port_priv->pingd_agent);
+	kfree(port_priv);
+
+	return 0;
+}
+
+static void ib_ping_init_device(struct ib_device *device)
+{
+	int ret, num_ports, cur_port, i, ret2;
+
+	if (device->node_type == IB_NODE_SWITCH) {
+		num_ports = 1;
+		cur_port = 0;
+	} else {
+		num_ports = device->phys_port_cnt;
+		cur_port = 1;
+	}
+
+	for (i = 0; i < num_ports; i++, cur_port++) {
+		ret = ib_ping_port_open(device, cur_port);
+		if (ret) {
+			printk(KERN_ERR SPFX "Couldn't open %s port %d\n",
+			       device->name, cur_port);
+			goto error_device_open;
+		}
+	}
+	goto error_device_query;
+
+error_device_open:
+	while (i > 0) {
+		cur_port--;
+		ret2 = ib_ping_port_close(device, cur_port);
+		if (ret2) {
+			printk(KERN_ERR SPFX "Couldn't close %s port %d "
+			       "for ping agent\n",
+			       device->name, cur_port);
+		}
+		i--;
+	}
+
+error_device_query:
+	return;
+}
+
+static void ib_ping_remove_device(struct ib_device *device)
+{
+	int ret = 0, i, num_ports, cur_port, ret2;
+
+	if (device->node_type == IB_NODE_SWITCH) {
+		num_ports = 1;
+		cur_port = 0;
+	} else {
+		num_ports = device->phys_port_cnt;
+		cur_port = 1;
+	}
+	for (i = 0; i < num_ports; i++, cur_port++) {
+		ret2 = ib_ping_port_close(device, cur_port);
+		if (ret2) {
+			printk(KERN_ERR SPFX "Couldn't close %s port %d "
+			       "for ping agent\n",
+			       device->name, cur_port);
+			if (!ret)
+				ret = ret2;
+		}
+	}
+}
+
+static struct ib_client ping_client = {
+	.name   = "ping",
+	.add    = ib_ping_init_device,
+	.remove = ib_ping_remove_device
+};
+
+static int __init ib_ping_init_module(void)
+{
+	spin_lock_init(&ib_ping_port_list_lock);
+	INIT_LIST_HEAD(&ib_ping_port_list);
+
+	if (ib_register_client(&ping_client)) {
+		printk(KERN_ERR SPFX "Couldn't register ib_ping client\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void __exit ib_ping_cleanup_module(void)
+{
+	ib_unregister_client(&ping_client);
+}
+
+module_init(ib_ping_init_module)
+module_exit(ib_ping_cleanup_module)
+
Index: mad.c
===================================================================
--- mad.c	(revision 2023)
+++ mad.c	(working copy)
@@ -45,6 +45,8 @@
 
 
 kmem_cache_t *ib_mad_cache;
+EXPORT_SYMBOL(ib_mad_cache);
+
 static struct list_head ib_mad_port_list;
 static u32 ib_mad_client_id = 0;
 
Index: Makefile
===================================================================
--- Makefile	(revision 2023)
+++ Makefile	(working copy)
@@ -1,12 +1,15 @@
 EXTRA_CFLAGS += -Idrivers/infiniband/include
 
-obj-$(CONFIG_INFINIBAND) +=	ib_core.o ib_mad.o ib_cm.o ib_sa.o ib_umad.o
+obj-$(CONFIG_INFINIBAND) +=	ib_core.o ib_mad.o ib_ping.o \
+				ib_cm.o ib_sa.o ib_umad.o
 
 ib_core-y :=			packer.o ud_header.o verbs.o sysfs.o \
 				device.o fmr_pool.o cache.o
 
 ib_mad-y :=			mad.o smi.o agent.o
 
+ib_ping-y :=			ping.o
+
 ib_cm-y :=			cm.o
 
 ib_sa-y :=			sa_query.o

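For readers following the GRH handling in ping_mad_send(): the reply copies
the flow label and traffic class out of the 32-bit version_tclass_flow word.
Below is a user-space sketch of that unpacking, assuming the word has already
been converted to host byte order (the patch does this with be32_to_cpup());
struct grh_fields and grh_unpack() are illustrative names only.

```c
#include <stdint.h>

/* Fields packed into the first 32 bits of a GRH (host byte order). */
struct grh_fields {
	uint8_t  version;        /* bits 31..28 */
	uint8_t  traffic_class;  /* bits 27..20 */
	uint32_t flow_label;     /* bits 19..0  */
};

/* Split a host-order version_tclass_flow word into its three fields. */
static struct grh_fields grh_unpack(uint32_t vtf)
{
	struct grh_fields f;

	f.version       = (uint8_t)((vtf >> 28) & 0xf);
	f.traffic_class = (uint8_t)((vtf >> 20) & 0xff);
	f.flow_label    = vtf & 0xfffff;
	return f;
}
```

The masks and shifts match the ones used in the patch for flow_label and
traffic_class; the version field is shown only for completeness.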