Re: [ofa-general] hca sma non-responsive but link still Active

Hal Rosenstock hal.rosenstock at gmail.com
Thu Dec 4 10:22:16 PST 2008


On Thu, Dec 4, 2008 at 12:06 PM, Chas Williams (CONTRACTOR)
<chas at cmf.nrl.navy.mil> wrote:
> if i load the attached module on my host, the link winds up in a curious
> state.  the intent of the module is to duplicate a particular type of
> kernel hang that blocks all the cpus from handling any work.
>
> what happens is that the sma stops responding:
>
>        # ibportstate  90 1
>        ibportstate: iberror: failed: smp query nodeinfo failed
>
> but the switch port on the other end of the link still reports a valid
> state:
>
>        # ibportstate  70 18
>        PortInfo:
>        # Port info: Lid 70 port 18
>        LinkState:.......................Active
>        PhysLinkState:...................LinkUp
>        LinkWidthSupported:..............1X or 4X
>        LinkWidthEnabled:................1X or 4X
>        LinkWidthActive:.................4X
>        LinkSpeedSupported:..............2.5 Gbps or 5.0 Gbps
>        LinkSpeedEnabled:................2.5 Gbps
>        LinkSpeedActive:.................2.5 Gbps
>        ibwarn: [6758] _do_madrpc: recv failed: Connection timed out
>        ibportstate: iberror: failed: smp query nodeinfo failed
>
> we believe that the link layer is handled entirely in the firmware

Mostly, but some control comes from the host (e.g. setting the port
physical state) and is passed down by the host SMA (at least for the
Linux kernel on Mellanox HCAs).

> which has no idea that the sma part in the kernel has gone to sleep.

Right; the part of the SMA in firmware is mainly passive and requires
host interaction, but it does not detect when the host misbehaves or
stops responding.

> the periodic light sweeps by the opensm don't seem to discover this
> problem either.

Light sweep only polls SwitchInfo looking to see if there is more to
be done. If SwitchInfo doesn't indicate some port state change (which
it doesn't for this case), then it won't see this.
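
To illustrate why the hung host goes unnoticed, here is a minimal sketch
of that check. It is not OpenSM's actual code: struct switch_info and
light_sweep_needs_heavy() are hypothetical names, and the only thing
modelled is the PortStateChange indication in SwitchInfo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of what a light sweep reads per switch. */
struct switch_info {
    unsigned lid;                /* switch LID */
    unsigned port_state_change;  /* SwitchInfo PortStateChange indication */
};

/*
 * Return 1 if any switch reports a port state change (so a heavy sweep
 * would follow), 0 if the fabric looks unchanged.
 */
int light_sweep_needs_heavy(const struct switch_info *sw, size_t n)
{
    size_t i;

    assert(sw != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (sw[i].port_state_change)
            return 1;
    }
    return 0;
}
```

With the link still Active on the switch side, no switch sets the
indication, so a check like this returns 0 and nothing further is
queried on the hung host.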

> this type of failure tends to make the ib utilities that scan the network
> run rather slowly.  ibdiagnet does indeed spot this broken host, but
> perhaps the sm could be extended to attempt to do something about this
> host, like reset the switch port?

IMO the best approach would be for the firmware to drop the link when
the host becomes non responsive (and only allow it to come back when
the host is responsive) rather than putting additional policy
(detection/reaction/etc) into OpenSM.
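
A rough sketch of what such a firmware-side watchdog could look like,
purely as an illustration. Everything here is an assumption: the idea
that the host driver bumps a heartbeat counter, the tick-based timeout,
and all of the names. None of it corresponds to a real HCA firmware
interface.

```c
#include <assert.h>
#include <stddef.h>

enum link_action { LINK_KEEP, LINK_DROP, LINK_RESTORE };

/* Hypothetical watchdog state kept by the firmware. */
struct host_watchdog {
    unsigned long last_heartbeat; /* last counter value seen from host */
    unsigned missed;              /* consecutive ticks without progress */
    unsigned limit;               /* ticks tolerated before dropping link */
    int link_dropped;             /* 1 while firmware holds the link down */
};

void watchdog_init(struct host_watchdog *wd, unsigned limit)
{
    assert(wd != NULL && limit > 0);
    wd->last_heartbeat = 0;
    wd->missed = 0;
    wd->limit = limit;
    wd->link_dropped = 0;
}

/* Called on each firmware timer tick with the host's current counter. */
enum link_action watchdog_tick(struct host_watchdog *wd,
                               unsigned long heartbeat)
{
    assert(wd != NULL);

    if (heartbeat != wd->last_heartbeat) {
        /* Host made progress since the last tick. */
        wd->last_heartbeat = heartbeat;
        wd->missed = 0;
        if (wd->link_dropped) {
            wd->link_dropped = 0;
            return LINK_RESTORE;
        }
        return LINK_KEEP;
    }

    /* No progress: count the miss and drop the link at the limit. */
    if (!wd->link_dropped && ++wd->missed >= wd->limit) {
        wd->link_dropped = 1;
        return LINK_DROP;
    }
    return LINK_KEEP;
}
```

Once the link is dropped, the switch port changes state, so the SM
would find out through its normal sweep without any extra policy.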

>  should it really require manual intervention to clear this error?

Ideally no, but this node is violating its "contract": if the port
state is INIT or beyond, it is required to respond to SMA packets.
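
The contract can be written as a small predicate. This is only a
sketch: it reads "INIT" as the logical PortState from PortInfo
(Down=1, Initialize=2, Armed=3, Active=4), and the function names are
made up for illustration.

```c
#include <assert.h>

/* Logical port states, using the PortInfo PortState encoding. */
enum port_state {
    PORT_DOWN = 1,
    PORT_INIT = 2,
    PORT_ARMED = 3,
    PORT_ACTIVE = 4
};

/* 1 if a port in this state is expected to answer SMA queries. */
int sma_must_respond(enum port_state state)
{
    assert(state >= PORT_DOWN && state <= PORT_ACTIVE);
    return state >= PORT_INIT;
}

/* 1 if the node breaks the contract: state is INIT or beyond, no reply. */
int contract_violated(enum port_state state, int sma_responded)
{
    return sma_must_respond(state) && !sma_responded;
}
```

The wedged host in this thread is the contract_violated(PORT_ACTIVE, 0)
case: the switch reports Active while the NodeInfo query times out.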

-- Hal

> /* doom.c -- reliably wedge an smp kernel
>  *
>  * build:
>  *        echo 'obj-m   += doom.o' > Makefile
>  *        make -C /lib/modules/`uname -r`/build M=`pwd`
>  *
>  * usage:
>  *        insmod doom.ko
>  */
>
> #include <linux/module.h>
> #include <linux/kernel.h>
> #include <linux/init.h>
> #include <linux/spinlock.h>
> #include <linux/smp.h>
>
> static void wedge(void *data)
> {
>        unsigned long flags;
>        spinlock_t lock;
>
>        printk(KERN_ERR "goodbye cruel world...\n");
>
>        spin_lock_init(&lock);
>        spin_lock_irqsave(&lock, flags);
>
>        while (1)
>                /* do nothing */;
> }
>
> static int __init doom_init(void)
> {
>        int i;
>
>        for_each_possible_cpu(i) {
>                if (i != smp_processor_id())
>                        smp_call_function_single(i, wedge, 0, 0, 0);
>        }
>
>        smp_call_function_single(smp_processor_id(), wedge, 0, 0, 0);
>
>        return 0;
> }
>
> module_init(doom_init);
>
> MODULE_AUTHOR("chas williams <chas at cmf.nrl.navy.mil>");
> MODULE_DESCRIPTION("wedge the kernel but good");
> MODULE_LICENSE("GPL");


