[ofa-general] [PATCH] mthca: Do not allow ib userspace open following device internal error

Jack Morgenstein jackm at dev.mellanox.co.il
Wed Aug 12 02:15:46 PDT 2009


Userspace apps are supposed to release all ib device resources if
they receive a fatal async event (IBV_EVENT_DEVICE_FATAL).  However,
the app has no way of knowing when the device has come back up, except
to repeatedly attempt ibv_open_device() until it succeeds.

Currently, though, nothing prevents an open from succeeding while the
device is in the midst of its removal following the fatal event.
In that case the open succeeds, but the device then waits in the middle
of its removal until the new app releases its ib resources. The new app
never does so, because its open succeeded after the fatal event was
generated, so it never receives that event.

This patch adds an "active" flag to the device. The active flag is set to
false (in the fatal event flow) before the "fatal" event is generated,
so any subsequent ibv_open_device() call to the device will fail until the
device comes back up, thus preventing the above deadlock.

Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>

---
Roland,
You are right, mthca also needs such a patch.

This will prevent user-level apps from allocating a device context following
a device internal catastrophic error.

BTW, if the administrator has disabled device reset on fatal (by default, it is
enabled), user-apps will simply need to wait for admin intervention
(rmmod and insmod of the low-level driver).  IMHO, this is OK -- following an internal
error, the device must be reset anyway, so there is no point in allowing new apps
to attempt to run.
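
For anyone who wants to reason about the flag without a HCA at hand, here is
a rough standalone model of the flow described above. It is only a sketch:
struct model_dev and all model_* functions are hypothetical stand-ins for
struct mthca_dev, __mthca_init_one(), handle_catas() and the new check in
mthca_alloc_ucontext(). It models neither locking nor the actual device
reset, and it assumes the app simply retries the open a bounded number of
times.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mthca_dev; only the new flag is modeled. */
struct model_dev {
	int active;
};

/* Models the end of __mthca_init_one(): the device is usable again. */
static void model_init_one(struct model_dev *dev)
{
	dev->active = 1;
}

/* Models handle_catas(): clear the flag before the fatal event goes out. */
static void model_handle_catas(struct model_dev *dev)
{
	dev->active = 0;
	/* The real code calls ib_dispatch_event() at this point. */
}

/* Models the check added to mthca_alloc_ucontext(): 0 or -EAGAIN. */
static int model_alloc_ucontext(const struct model_dev *dev)
{
	if (!dev->active)
		return -EAGAIN;
	return 0;
}

/*
 * Models an app retrying the open up to max_attempts times.  The device
 * is re-initialized just before attempt number reset_at (1-based); pass 0
 * for a device that never comes back.  Returns the attempt that succeeded,
 * or -EAGAIN if none did.
 */
static int model_retry_open(struct model_dev *dev, int max_attempts,
			    int reset_at)
{
	int attempt;

	assert(dev != NULL);

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		if (attempt == reset_at)
			model_init_one(dev);
		if (model_alloc_ucontext(dev) == 0)
			return attempt;
	}

	return -EAGAIN;
}
```

With reset_at == 0 the model corresponds to the reset-on-fatal-disabled case
above: every open attempt fails until someone reloads the driver.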

diff --git a/drivers/infiniband/hw/mthca/mthca_catas.c b/drivers/infiniband/hw/mthca/mthca_catas.c
index 65ad359..ad8b26b 100644
--- a/drivers/infiniband/hw/mthca/mthca_catas.c
+++ b/drivers/infiniband/hw/mthca/mthca_catas.c
@@ -88,6 +88,7 @@ static void handle_catas(struct mthca_dev *dev)
 	event.device = &dev->ib_dev;
 	event.event  = IB_EVENT_DEVICE_FATAL;
 	event.element.port_num = 0;
+	dev->active = 0;
 
 	ib_dispatch_event(&event);
 
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 9ef611f..c1e2bcb 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -357,6 +357,7 @@ struct mthca_dev {
 	struct ib_ah         *sm_ah[MTHCA_MAX_PORTS];
 	spinlock_t            sm_lock;
 	u8                    rate[MTHCA_MAX_PORTS];
+	int		      active;
 };
 
 #ifdef CONFIG_INFINIBAND_MTHCA_DEBUG
diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
index 13da9f1..118a386 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -1116,6 +1116,8 @@ static int __mthca_init_one(struct pci_dev *pdev, int hca_type)
 	pci_set_drvdata(pdev, mdev);
 	mdev->hca_type = hca_type;
 
+	mdev->active = 1;
+
 	return 0;
 
 err_unregister:
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 87ad889..bcf7a40 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -334,6 +334,9 @@ static struct ib_ucontext *mthca_alloc_ucontext(struct ib_device *ibdev,
 	struct mthca_ucontext           *context;
 	int                              err;
 
+	if (!(to_mdev(ibdev)->active))
+		return ERR_PTR(-EAGAIN);
+
 	memset(&uresp, 0, sizeof uresp);
 
 	uresp.qp_tab_size = to_mdev(ibdev)->limits.num_qps;


