[ofa-general] Re: [PATCH 8/8] Convert ibqueryerrors.pl to C and use new ibnetdisc library.

Ira Weiny weiny2 at llnl.gov
Mon Apr 27 14:50:26 PDT 2009


On Sat, 25 Apr 2009 18:54:41 +0300
Sasha Khapyorsky <sashak at voltaire.com> wrote:

> On 13:31 Thu 23 Apr     , Ira Weiny wrote:
> > diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c
> > new file mode 100644
> > index 0000000..9d96190
> > +#include <inttypes.h>

[snip]

> > +
> > +#include <infiniband/complib/cl_nodenamemap.h>
> 
> AFAIR WinOF doesn't like such inclusion and uses
> 
> #include <complib/filename.h>
> 
> instead. I'm changing.

Ok, sorry.

> 
> > +#include <infiniband/ibnetdisc.h>
> > +#include <infiniband/mad.h>
> > +
> > +#include "ibdiag_common.h"
> > +
> > +char *argv0 = "ibqueryerrors";
> 
> argv0 variable is not needed if you are using ibdiag_common stuff.
> Removing.

cool, that was left over sorry.

> 
> > +static FILE *f;
> 
> I don't see where 'f' is used (except 'f = stdout;' below). Removing.
> 
> [snip...]
> 
> > +static int process_opt(void *context, int ch, char *optarg)
> > +{
> > +	switch (ch) {
> > +	case 's':
> > +		calculate_suppressed_fields(optarg);
> > +		break;
> > +	case 'c':
> > +		/* Right now this is the only "common" error */
> > +		add_suppressed(IB_PC_ERR_SWITCH_REL_F);
> > +		break;
> > +	case 1:
> > +		node_name_map_file = strdup(optarg);
> > +		break;
> > +	case 2:
> > +		data_counters++;
> > +		break;
> > +	case 3:
> > +		all_nodes++;
> > +		break;
> > +	case 'S':
> > +		switch_guid_str = strdup(optarg);
> 
> Why should optarg be strdup()ed?

Well it does not have to be strdup'ed however, switch_guid_str needs to be set
for the call to ib_resolve_portid_str_via below.

...
   if (ib_resolve_portid_str_via(&portid, switch_guid_str, IB_DEST_GUID,
...

The removal of this line causes the '-S' option to segfault.  Patch to pq/ibn4
is below.


[snip]

> > +
> > +	if (switch_guid) {
> > +		/* limit the scan the fabric around the target */
> > +		ib_portid_t portid = {0};
> > +
> > +		if (ib_resolve_portid_str_via(&portid, switch_guid_str, IB_DEST_GUID,
> > +					ibd_sm_id, ibmad_port) < 0) {
> > +			fprintf(stderr, "can't resolve destination port %s %p\n",
> > +				switch_guid_str, ibd_sm_id);
> > +			rc = 1;
> > +			goto close_port;
> > +		}
> > +
> > +		if ((fabric = ibnd_discover_fabric(ibmad_port, ibd_timeout, &portid, 1)) == NULL) {
> > +			fprintf(stderr, "discover failed\n");
> > +			rc = 1;
> > +			goto close_port;
> > +		}
> > +	} else {
> > +		if ((fabric = ibnd_discover_fabric(ibmad_port, ibd_timeout, NULL, -1)) == NULL) {
> > +			fprintf(stderr, "discover failed\n");
> > +			rc = 1;
> > +			goto close_port;
> > +		}
> 
> Above you are using IBERROR(), here is fprintf(stderr, ...). Could it be
> consistent? (if yes - it is subsequent patch).

I used fprintf here to allow the goto to close the port, rather than let
IBERROR exit out.  We already discussed this on the list but this gives a
better example to users that they are to close the port.  I can change it if
you like.

> 
> > +	}
> > +
> > +	report_suppressed();
> > +
> > +	if (switch_guid) {
> > +		ibnd_node_t *node = ibnd_find_node_guid(fabric, switch_guid);
> > +		print_node(node, NULL);
> > +	} else if (dr_path) {
> > +		ibnd_node_t *node = ibnd_find_node_dr(fabric, dr_path);
> > +		print_node(node, NULL);
> 
> When GUID or DR Path are specified we don't need to discover whole
> fabric, but can try to resolve LID using SA or querying PortInfo.
> 
> Although when in GUID is specified and SA is not responsive there is
> probably no other choice than discover.
> 

:-( good point.  Discovering only part of the fabric was a huge speed
improvement but if the resolve does not succeed I should do a full discover.

I will work up a separate patch.  Right now you are correct if the SA is
unresponsive the "-S" option will fail.  iblinkinfo does the full scan every
time.  But that slows down the query for a single switch to the same O(n)
query that a full system scan requires.  I would rather have that query be
O(1).  So I implemented ibqueryerrors in this manner with the intent of going
back and "fixing" iblinkinfo.  I think having a fall back on a full system
scan is a good idea.  Patch for both tools will follow...  :-D

Ira

From: Ira Weiny <weiny2 at llnl.gov>
Date: Mon, 27 Apr 2009 14:47:08 -0700
Subject: [PATCH] switch_guid_str is required for the string resolve function.


Signed-off-by: Ira Weiny <weiny2 at llnl.gov>
---
 infiniband-diags/src/ibqueryerrors.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c
index 52bd036..09861be 100644
--- a/infiniband-diags/src/ibqueryerrors.c
+++ b/infiniband-diags/src/ibqueryerrors.c
@@ -364,6 +364,7 @@ static int process_opt(void *context, int ch, char *optarg)
 		all_nodes++;
 		break;
 	case 'S':
+		switch_guid_str = optarg;
 		switch_guid = strtoull(optarg, 0, 0);
 		break;
 	case 'D':
-- 
1.5.4.5




More information about the general mailing list