[ofa-general] [PATCH] infiniband-diags/src/saquery.c: fix potential core dump (Was: Re: infiniband-diags/saquery.c:print_multicast_member_record question)
Ira Weiny
weiny2 at llnl.gov
Tue Jun 10 13:18:03 PDT 2008
On Tue, 10 Jun 2008 11:35:19 -0700
Hal Rosenstock <hrosenstock at xsigo.com> wrote:
> On Tue, 2008-06-10 at 11:26 -0700, Ira Weiny wrote:
> > On Tue, 10 Jun 2008 11:16:24 -0700
> > Hal Rosenstock <hrosenstock at xsigo.com> wrote:
<snip>
> > >
> > > Looks to me like it relies on some node GUID being same as port GUID.
> > > While that's allowable for one port, it won't always be the case.
> >
> > In my test system the Node GUID and port GUID are different and this works. I
> > am specifically using the port guid out of the NodeInfo struct of the
> > NodeRecord. So I should be using the port guid vs port GID ID _only_.
>
> Not sure what you mean; port GID 0 = subnet prefix + port GUID. Not sure
> what SMs support other port GIDs than this but OpenSM doesn't.
I mean the port gid == subnet prefix + port GUID. Therefore if you mask off
the subnet prefix (ie only use the lower 64bits of the gid) you should end up
with the GUID, right?
>
> > > > Have you found some configuration in which this does not
> > > > work?
> > >
> > > Yes.
> >
> > :-( Sorry, I'm pretty sure this should be ok...
>
> It's possible it's an environment thing but there's a couple of
> suboptimialities in print_multicast_member_record:
That is probably true. I see you have already sent a patch to not do the node
record queries unless necessary.
>
> 1. Does osmv_get_query_node_rec always return a non NULL pointer ?
:-/ Yea that is a bug. Yes using node_record would be bad if for some reason
the node_record did not exists. However, the port should not be a member of
the mcast group if it does not exist in the SA. So... you should not have a
mcast member record.
I think there is a race condition here that I did not check, that is bad.
> 2. If the loop fails to find a match, the last node description is used.
Yep, that is a bug as well. I guess this masks the above situation. Would you
agree this is a corner case? Unless your fabric is changing quickly there
should be a node record for each member record right?
Patch below.
Should there be a warning printed in the case the node_record is not found?
Ira
>From ac10f52fd65b0b34b409ca2aa266a7363ae32e2c Mon Sep 17 00:00:00 2001
From: Ira K. Weiny <weiny2 at llnl.gov>
Date: Tue, 10 Jun 2008 13:07:43 -0700
Subject: [PATCH] infiniband-diags/src/saquery.c: fix potential core dump and/or incorrect node
descriptions from being printed if a corresponding node_record is not found for
a multicast member record.
Signed-off-by: Ira K. Weiny <weiny2 at llnl.gov>
---
infiniband-diags/src/saquery.c | 17 +++++++++++------
1 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/infiniband-diags/src/saquery.c b/infiniband-diags/src/saquery.c
index e727940..89754f6 100644
--- a/infiniband-diags/src/saquery.c
+++ b/infiniband-diags/src/saquery.c
@@ -354,14 +354,19 @@ print_multicast_member_record(ib_member_rec_t *p_mcmr)
uint64_t gid_prefix = cl_ntoh64( p_mcmr->port_gid.unicast.prefix );
uint64_t gid_interface_id = cl_ntoh64( p_mcmr->port_gid.unicast.interface_id );
uint16_t mlid = cl_ntoh16( p_mcmr->mlid );
- ib_node_record_t *node_record = NULL;
int i = 0;
+ char *node_name = "<unknown>";
- /* go through and find the node description for this node GID */
+ /* go through the node records searching for a port guid which matches
+ * this port gid interface id.
+ * This gives us a node name to print, if available.
+ */
for (i = 0; i < result.result_cnt; i++) {
- node_record = osmv_get_query_node_rec(result.p_result_madw, i);
- if (cl_ntoh64(node_record->node_info.port_guid) == gid_interface_id)
+ ib_node_record_t *nr = osmv_get_query_node_rec(result.p_result_madw, i);
+ if (cl_ntoh64(nr->node_info.port_guid) == gid_interface_id) {
+ node_name = clean_nodedesc((char *)nr->node_desc.description);
break;
+ }
}
if (requested_name) {
@@ -370,7 +375,7 @@ print_multicast_member_record(ib_member_rec_t *p_mcmr)
"0x%016" PRIx64 " (%s)\n",
gid_prefix,
gid_interface_id,
- clean_nodedesc((char *)node_record->node_desc.description)
+ node_name
);
}
} else {
@@ -391,7 +396,7 @@ print_multicast_member_record(ib_member_rec_t *p_mcmr)
gid_interface_id,
p_mcmr->scope_state,
p_mcmr->proxy_join,
- clean_nodedesc((char *)node_record->node_desc.description)
+ node_name
);
}
}
--
1.5.1
More information about the general
mailing list