[openib-general] [PATCH]proposal for enabling partial ports on HCA

Roland Dreier rolandd at cisco.com
Wed Oct 5 14:24:50 PDT 2005


    Shirley> It's necessary to modify ib_mad, ib_sa, and ib_cm to act
    Shirley> like ib_ipoib and ib_cache and continue initializing when
    Shirley> one port encounters errors, instead of releasing all
    Shirley> resources. If you agree, I will create this as the first
    Shirley> patch for review. How to handle the errors would be the
    Shirley> second patch.

I don't agree that we want to handle "half-usable" devices where some
ports don't work.  The only use for this seems to be working around
some problems with the current Galaxy HCA implementation, and there
must be a better way to handle this.

You're welcome to prove me wrong, but I think that handling ports that
are not usable and then become usable later is just going to be
horrible.  And if we do that, then I think it would make sense to
handle ports starting out usable and then becoming unusable later --
and I think that's going to be even worse still.

I do agree that we want to handle errors in initialization better.
The ib_mad and ib_cm code actually looks OK to me (with a small bug in
ib_mad for which I'll post a patch shortly).  I think something like
the patch below is all that's needed to fix ib_sa:

--- infiniband/core/sa_query.c	(revision 3664)
+++ infiniband/core/sa_query.c	(working copy)
@@ -583,10 +583,16 @@ int ib_sa_path_rec_get(struct ib_device 
 {
 	struct ib_sa_path_query *query;
 	struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
-	struct ib_sa_port   *port   = &sa_dev->port[port_num - sa_dev->start_port];
-	struct ib_mad_agent *agent  = port->agent;
+	struct ib_sa_port   *port;
+	struct ib_mad_agent *agent;
 	int ret;
 
+	if (!sa_dev)
+		return -ENODEV;
+
+	port  = &sa_dev->port[port_num - sa_dev->start_port];
+	agent = port->agent;
+
 	query = kmalloc(sizeof *query, gfp_mask);
 	if (!query)
 		return -ENOMEM;
@@ -685,10 +691,16 @@ int ib_sa_service_rec_query(struct ib_de
 {
 	struct ib_sa_service_query *query;
 	struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
-	struct ib_sa_port   *port   = &sa_dev->port[port_num - sa_dev->start_port];
-	struct ib_mad_agent *agent  = port->agent;
+	struct ib_sa_port   *port;
+	struct ib_mad_agent *agent;
 	int ret;
 
+	if (!sa_dev)
+		return -ENODEV;
+
+	port  = &sa_dev->port[port_num - sa_dev->start_port];
+	agent = port->agent;
+
 	if (method != IB_MGMT_METHOD_GET &&
 	    method != IB_MGMT_METHOD_SET &&
 	    method != IB_SA_METHOD_DELETE)
@@ -768,10 +780,16 @@ int ib_sa_mcmember_rec_query(struct ib_d
 {
 	struct ib_sa_mcmember_query *query;
 	struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client);
-	struct ib_sa_port   *port   = &sa_dev->port[port_num - sa_dev->start_port];
-	struct ib_mad_agent *agent  = port->agent;
+	struct ib_sa_port   *port;
+	struct ib_mad_agent *agent;
 	int ret;
 
+	if (!sa_dev)
+		return -ENODEV;
+
+	port  = &sa_dev->port[port_num - sa_dev->start_port];
+	agent = port->agent;
+
 	query = kmalloc(sizeof *query, gfp_mask);
 	if (!query)
 		return -ENOMEM;
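
The pattern in the patch above is the same in all three functions: the
port lookup is moved out of the declarations so that sa_dev is checked
for NULL before it is dereferenced. A minimal user-space sketch of that
ordering follows. It is not kernel code: struct fake_port, struct
fake_sa_dev and lookup_agent() are hypothetical stand-ins invented for
illustration, and the assumption is that the per-device pointer can be
NULL when the SA client has no data for the device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ib_sa_port. */
struct fake_port {
	int agent_id;
};

/* Hypothetical stand-in for struct ib_sa_device. */
struct fake_sa_dev {
	int start_port;
	struct fake_port port[2];
};

/*
 * Store the agent id for port_num in *agent_id and return 0, or return
 * -ENODEV without touching *agent_id if there is no per-device data.
 * The caller is assumed to pass a valid port_num for the device.
 */
int lookup_agent(const struct fake_sa_dev *sa_dev, int port_num,
		 int *agent_id)
{
	const struct fake_port *port;

	assert(agent_id != NULL);

	/* Check before any dereference, as in the patch. */
	if (!sa_dev)
		return -ENODEV;

	port = &sa_dev->port[port_num - sa_dev->start_port];
	*agent_id = port->agent_id;

	return 0;
}
```

With the lookup left in the initializers, as in the original code, a
NULL sa_dev would be dereferenced before any check could run.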


