[ofa-general] Re: [PATCH 8/8] Convert ibqueryerrors.pl to C and use new ibnetdisc library.
Ira Weiny
weiny2 at llnl.gov
Mon Apr 27 14:50:26 PDT 2009
On Sat, 25 Apr 2009 18:54:41 +0300
Sasha Khapyorsky <sashak at voltaire.com> wrote:
> On 13:31 Thu 23 Apr , Ira Weiny wrote:
> > diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c
> > new file mode 100644
> > index 0000000..9d96190
> > +#include <inttypes.h>
[snip]
> > +
> > +#include <infiniband/complib/cl_nodenamemap.h>
>
> AFAIR WinOF doesn't like such inclusion and uses
>
> #include <complib/filename.h>
>
> instead. I'm changing.
Ok, sorry.
>
> > +#include <infiniband/ibnetdisc.h>
> > +#include <infiniband/mad.h>
> > +
> > +#include "ibdiag_common.h"
> > +
> > +char *argv0 = "ibqueryerrors";
>
> argv0 variable is not needed if you are using ibdiag_common stuff.
> Removing.
cool, that was left over sorry.
>
> > +static FILE *f;
>
> I don't see where 'f' is used (except 'f = stdout;' below). Removing.
>
> [snip...]
>
> > +static int process_opt(void *context, int ch, char *optarg)
> > +{
> > + switch (ch) {
> > + case 's':
> > + calculate_suppressed_fields(optarg);
> > + break;
> > + case 'c':
> > + /* Right now this is the only "common" error */
> > + add_suppressed(IB_PC_ERR_SWITCH_REL_F);
> > + break;
> > + case 1:
> > + node_name_map_file = strdup(optarg);
> > + break;
> > + case 2:
> > + data_counters++;
> > + break;
> > + case 3:
> > + all_nodes++;
> > + break;
> > + case 'S':
> > + switch_guid_str = strdup(optarg);
>
> Why should optarg be strdup()ed?
Well it does not have to be strdup'ed however, switch_guid_str needs to be set
for the call to ib_resolve_portid_str_via below.
...
if (ib_resolve_portid_str_via(&portid, switch_guid_str, IB_DEST_GUID,
...
The removal of this line causes the '-S' option to segfault. Patch to pq/ibn4
is below.
[snip]
> > +
> > + if (switch_guid) {
> > + /* limit the scan the fabric around the target */
> > + ib_portid_t portid = {0};
> > +
> > + if (ib_resolve_portid_str_via(&portid, switch_guid_str, IB_DEST_GUID,
> > + ibd_sm_id, ibmad_port) < 0) {
> > + fprintf(stderr, "can't resolve destination port %s %p\n",
> > + switch_guid_str, ibd_sm_id);
> > + rc = 1;
> > + goto close_port;
> > + }
> > +
> > + if ((fabric = ibnd_discover_fabric(ibmad_port, ibd_timeout, &portid, 1)) == NULL) {
> > + fprintf(stderr, "discover failed\n");
> > + rc = 1;
> > + goto close_port;
> > + }
> > + } else {
> > + if ((fabric = ibnd_discover_fabric(ibmad_port, ibd_timeout, NULL, -1)) == NULL) {
> > + fprintf(stderr, "discover failed\n");
> > + rc = 1;
> > + goto close_port;
> > + }
>
> Above you are using IBERROR(), here is fprintf(stderr, ...). Could it be
> consistent? (if yes - it is subsequent patch).
I used fprintf here to allow the goto to close the port, rather than let
IBERROR exit out. We already discussed this on the list but this gives a
better example to users that they are to close the port. I can change it if
you like.
>
> > + }
> > +
> > + report_suppressed();
> > +
> > + if (switch_guid) {
> > + ibnd_node_t *node = ibnd_find_node_guid(fabric, switch_guid);
> > + print_node(node, NULL);
> > + } else if (dr_path) {
> > + ibnd_node_t *node = ibnd_find_node_dr(fabric, dr_path);
> > + print_node(node, NULL);
>
> When GUID or DR Path are specified we don't need to discover whole
> fabric, but can try to resolve LID using SA or querying PortInfo.
>
> Although when in GUID is specified and SA is not responsive there is
> probably no other choice than discover.
>
:-( good point. Discovering only part of the fabric was a huge speed
improvement but if the resolve does not succeed I should do a full discover.
I will work up a separate patch. Right now you are correct if the SA is
unresponsive the "-S" option will fail. iblinkinfo does the full scan every
time. But that slows down the query for a single switch to the same O(n)
query that a full system scan requires. I would rather have that query be
O(1). So I implemented ibqueryerrors in this manner with the intent of going
back and "fixing" iblinkinfo. I think having a fall back on a full system
scan is a good idea. Patch for both tools will follow... :-D
Ira
From: Ira Weiny <weiny2 at llnl.gov>
Date: Mon, 27 Apr 2009 14:47:08 -0700
Subject: [PATCH] switch_guid_str is required for the string resolve function.
Signed-off-by: Ira Weiny <weiny2 at llnl.gov>
---
infiniband-diags/src/ibqueryerrors.c | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/infiniband-diags/src/ibqueryerrors.c b/infiniband-diags/src/ibqueryerrors.c
index 52bd036..09861be 100644
--- a/infiniband-diags/src/ibqueryerrors.c
+++ b/infiniband-diags/src/ibqueryerrors.c
@@ -364,6 +364,7 @@ static int process_opt(void *context, int ch, char *optarg)
all_nodes++;
break;
case 'S':
+ switch_guid_str = optarg;
switch_guid = strtoull(optarg, 0, 0);
break;
case 'D':
--
1.5.4.5
More information about the general
mailing list