[ewg] ibdiagpath broken with TCL 8.5

Mike Heinz michael.heinz at qlogic.com
Tue Mar 1 13:13:10 PST 2011


YK,

I had a chance to go back and dig further into this. I built the ibis executable from scratch on a RHEL6 system and ran it in interactive mode. What I see is that commands which return arrays are getting garbage prepended to their results. This looks like the root problem that John tried to patch last fall, and it is causing problems for some of my systems here: ibis isn't interfacing with TCL 8.5 correctly:

% puts [smLftBlockMad dump]
-lft 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
% puts [smVlArbTableMad dump]
-vl_entry {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00}

I do not see this behavior on systems running TCL 8.4:

% ibis_init
0
% ibis_set_port 0x00066a00a000707f
0
% puts [smLftBlockMad dump]
0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
% puts [smVlArbTableMad dump]
{0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00} {0x0 0x00}
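
Until the binding itself is fixed, a script-side workaround could normalize the dump result so the same indexing works under both versions. This is only a minimal sketch, assuming the sole difference is a single leading option-style token such as -lft or -vl_entry; stripDumpPrefix is a hypothetical helper, not part of ibis or ibdiagpath:

```tcl
# Hypothetical helper (not part of ibis): drop a leading "-name" token
# such as -lft or -vl_entry if the dump result starts with one.
proc stripDumpPrefix {values} {
    set first [lindex $values 0]
    # Data elements start with a digit or are sub-lists; an option-style
    # name starts with "-" followed by a letter.
    if {[string match {-[a-zA-Z]*} $first]} {
        return [lrange $values 1 end]
    }
    return $values
}

# Sample strings shortened from the output above.
set tcl85 {-vl_entry {0x0 0x00} {0x0 0x00}}
set tcl84 {{0x0 0x00} {0x0 0x00}}
puts [stripDumpPrefix $tcl85]
puts [stripDumpPrefix $tcl84]
```

With these samples both calls should yield the same two-entry list, so callers would not need the index offset on either version.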

> -----Original Message-----
> From: ewg-bounces at lists.openfabrics.org [mailto:ewg-
> bounces at lists.openfabrics.org] On Behalf Of Mike Heinz
> Sent: Monday, February 21, 2011 11:55 AM
> To: kliteyn at dev.mellanox.co.il
> Cc: Linux RDMA; ewg at lists.openfabrics.org
> Subject: Re: [ewg] Patch breaks OFED 1.5.3: [PATCH] ibdiagpath:
> Properly index VlArbTable during QoS test
>
> YK,
>
> I just finished running an RC4 build on Redhat 6. I didn't get the same
> error - but ibdiagpath still failed:
>
> [root at ifs004 1]# ibdiagpath -l 0x1,0x2
> Loading IBDIAGPATH from: /usr/lib64/ibdiagpath1.5.6
> -W- Topology file is not specified.
>     Reports regarding cluster links will use direct routes.
> Loading IBDM from: /usr/lib64/ibdm1.5.6
> -I- Using port 1 as the local port.
>
> -I---------------------------------------------------
> -I- Traversing the path from local to source
> -I---------------------------------------------------
>
> -I---------------------------------------------------
> -I- Traversing the path from source to destination
> -I---------------------------------------------------
> -I- From: lid=0x0001 guid=0x001175000078aca6 dev=29474 ifs004/P1
> -I- To:   lid=0x0003 guid=0x00066a01e5000108 dev=29472 Port=8
>
> -I- From: lid=0x0003 guid=0x00066a01e5000108 dev=29472 Port=8
> -I- To:   lid=0x0001 guid=0x001175000078aca6 dev=29474 ifs004/P1
>
> can't read "PATH(1)": no such element in array
> [root at ifs004 1]#
>
>
> The problem appears to be occurring in this code fragment:
>
>         if {[info exists NODE]} {
>             for {set i 0} {$i < [llength [array names NODE *,PortGUID]]} {incr i} {
>                 set portGuid $NODE($i,PortGUID)
>                 set nodeGuid $G(data:NodeGuid.$portGuid)
>                 if {$i % 2} {
>                     set portNum $NODE($i,EntryPort)
>                 } else {
>                     # Bug here: line 2381, ibdebug_if.tcl
>                     set portNum [lindex [split $PATH([expr $i + 1]) ,] end]
>                 }
>                 lappend CSV_ERRORS $CSV_scope,$nodeGuid,$portGuid,$portNum,$desc,$msgBody,$CSV_severity,$exid,$err_type
>             }
>         } else {
>             lappend CSV_ERRORS $CSV_scope,$nodeGuid,$portGuid,$portNum,$desc,$msgBody,$CSV_severity,$exid,$err_type
>         }
>     }
>
> I don't know if it matters, but I'm testing with a one-port HCA. I
> added a puts in the offending code and got this:
>
> MHEINZ: i = 0. PATH(0) = 1
> can't read "PATH(1)": no such element in array
>
> Please let me know if there are any tests I can run for you.
>
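
The failure above can be reproduced in isolation: with a single-entry PATH array, reading PATH(1) raises exactly this error. A minimal sketch of a guard using info exists follows; the empty-string fallback is an assumption for illustration only, not necessarily what ibdiagpath should report:

```tcl
# Stand-in for the state seen in the debug output: only PATH(0) exists.
array set PATH {0 1}
set i 0
set next [expr {$i + 1}]

if {[info exists PATH($next)]} {
    # Same extraction as the original fragment: last comma-separated field.
    set portNum [lindex [split $PATH($next) ,] end]
} else {
    # Assumed fallback for illustration only.
    set portNum ""
}
puts "portNum = '$portNum'"
```
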
> -----Original Message-----
> From: Mike Heinz
> Sent: Monday, February 21, 2011 10:40 AM
> To: 'kliteyn at dev.mellanox.co.il'; John Jolly
> Cc: ewg at lists.openfabrics.org; Linux RDMA; Todd Rimmer; Eli Dorfman
> (Voltaire)
> Subject: RE: Patch breaks OFED 1.5.3: [ewg] [PATCH] ibdiagpath:
> Properly index VlArbTable during QoS test
>
> Yevgeny,
>
> It did occur to me that this is a version issue; I tested with TCL 8.4,
> which is the version included in RHEL5 and SLES10. The newest version
> appears to be 8.5. Skimming through the release notes, I didn't see
> anything about language changes, but if it's working for you then
> obviously the language has been changed.
>
> The thing is, I also noticed that John's original complaint - about an
> extra item in the array - did not seem to be true on the RHEL 5.x boxes
> I tried, which is why I suggested that the entire change should be
> rolled back.
>
> I'm building RC4 on a Red Hat 6 box now, I'll see if it makes a
> difference.
>
> -----Original Message-----
> From: Yevgeny Kliteynik [mailto:kliteyn at dev.mellanox.co.il]
> Sent: Sunday, February 20, 2011 9:05 AM
> To: Mike Heinz; John Jolly
> Cc: ewg at lists.openfabrics.org; Linux RDMA; Todd Rimmer; Eli Dorfman
> (Voltaire)
> Subject: Re: Patch breaks OFED 1.5.3: [ewg] [PATCH] ibdiagpath:
> Properly index VlArbTable during QoS test
>
> Mike,
>
> This looks like an issue of different tcl versions/implementations.
>
> I certainly can replace "$i+1" with "[expr $i+1]", but I'm not
> sure about reverting the patch.
>
> John,
>
> What tcl version have you used?
>
> -- YK
>
>
>
> On 07-Feb-11 6:44 PM, Mike Heinz wrote:
> > The version of  ibdiagpath included with OFED 1.5.3-rc3 contains
> syntax errors which prevent it from executing on the systems I've
> tested (using TCL 8.4).  Attempts to use ibdiagpath fail with an error
> message:
> >
> >> -I---------------------------------------------------
> >> -I- QoS on Path Check
> >> -I---------------------------------------------------
> >> bad index "0+1": must be integer or end?-integer?
> >
> > After doing some research and debugging, I traced the problem to a
> patch applied back in October:
> >
> > commit f3cf1f7c15ca24598fdf68b9ba71788b386b2f14
> > Author: John Jolly<jjolly at novell.com>
> > Date:   Wed Oct 6 17:29:48 2010 +0200
> >
> >      ibdiagpath: Properly index VlArbTable during QoS test
> >
> >      Description: ibdiagpath: Properly index VlArbTable during QoS
> test
> >      Symptom:     Error 'invalid bareword "vl_entry"' during "QoS on
> >                   Path Check"
> >      Problem:     The 'dump' command within the smVlArbTableMad
> command
> >                   appends '-vl_entry' to the beginning of the array.
> >                   The ibdebug.tcl script does not properly handle
> this
> >                   extra element at the beginning of the array.
> >      Solution:    Offset the index value by one when referencing the
> >                   array.
> >
> >      Signed-off-by: John Jolly<jjolly at novell.com>
> >      Signed-off-by: Yevgeny Kliteynik<kliteyn at dev.mellanox.co.il>
> >
> > Unfortunately, this patch isn't valid TCL code (at least not in TCL
> 8.4) and does not appear to be needed at all.
> >
> > For example:
> >
> >> set entry [lindex $values $i+1]
> >
> > Is not syntactically correct TCL.  In order for it to be correct it
> would have to be
> >
> >> set entry [lindex $values [expr $i+1]]
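
The error text matches how list indices are parsed: Tcl 8.4 accepts only an integer or end?-integer?, while Tcl 8.5 also accepts simple integer+integer arithmetic in the index argument, which would explain why the patched line runs on some systems and not on others. A minimal sketch of a form that works on both, using a made-up values list:

```tcl
# Illustrative data only; the real list comes from smVlArbTableMad dump.
set values {-vl_entry {0x0 0x00} {0x1 0x10} {0x2 0x20}}
set i 0

# Portable: compute the index first; works on Tcl 8.4 and 8.5.
set entry [lindex $values [expr {$i + 1}]]
puts "portable form: $entry"

# Version-dependent: index arithmetic inside lindex is rejected by 8.4.
if {[catch {lindex $values $i+1} result]} {
    puts "index arithmetic rejected: $result"
} else {
    puts "index arithmetic accepted: $result"
}
```

Bracing the expr argument is the usual idiom; it avoids a second round of substitution on $i.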
> >
> > However, the patch does not appear to be needed at all. Reverting the patch allows ibdiagpath to complete successfully:
> >
> >> -I---------------------------------------------------
> >> -I- QoS on Path Check
> >> -I---------------------------------------------------
> >> -W- Blocked VLs:3 4 5 at node:homer lid=0x0002 guid=0x00066a00a000707f dev=25208
> >>      port:1
> >> -W- SLs:3 4 5 6 7 8 9 10 11 12 13 14 15 are blocked due to VLArb node:homer
> >>      lid=0x0002 guid=0x00066a00a000707f dev=25208 in-port:0 out-port:1
> >> -W- Blocked VLs:3 4 5 at node: lid=0x0001 guid=0x00066a00d9000275 dev=47396
> >>      port:21
> >> -W- SLs:3 4 5 6 7 8 9 10 11 12 13 14 15 mapped to VL>  5 at node: lid=0x0001
> >>      guid=0x00066a00d9000275 dev=47396 in-port:14 out-port:21
> >> -I- The following SLs can be used:0 1 2
> >
> > This message and any attached documents contain information from
> QLogic Corporation or its wholly-owned subsidiaries that may be
> confidential. If you are not the intended recipient, you may not read,
> copy, distribute, or use this information. If you have received this
> transmission in error, please notify the sender immediately by reply e-
> mail and then delete this message.
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> in
> > the body of a message to majordomo at vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
>
>
>
> _______________________________________________
> ewg mailing list
> ewg at lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg

