[Users] ibacm preventing sysrq poweroff in OFED 3.18

Marcus Epperson marcus.r.epperson at gmail.com
Tue Feb 23 10:10:24 PST 2016


We didn't change anything in librdmacm. We just have some in-house code
that uses rdmacm, and it wasn't handling RDMA_CM_EVENT_DEVICE_REMOVAL
previously. We now handle it by exiting immediately.

-Marcus

On Tue, Feb 23, 2016 at 6:45 AM, Hal Rosenstock <hal.rosenstock at gmail.com>
wrote:

> Hi Marcus,
>
> ACM already has some uverbs local event handling but does not currently
> handle IBV_EVENT_DEVICE_FATAL. I think that event would need to be handled
> in acm_event_handler.
>
> In terms of RDMA_CM_EVENT_DEVICE_REMOVAL, what was changed in librdmacm?
> Was this fed upstream? Within the last week, there has been a relevant
> discussion thread on this event.
>
> -- Hal
>
> On Mon, Feb 22, 2016 at 9:09 PM, Marcus Epperson <
> marcus.r.epperson at gmail.com> wrote:
>
>> It looks like a change to the mlx4 driver between OFED 3.12 and 3.18
>> broke sysrq poweroff for us, but only when ibacm is running. Here's what I
>> see when I issue sysrq 'o':
>>
>> SysRq : Power Off
>> igb 0000:02:00.1: PCI INT B disabled
>> igb 0000:02:00.0: PCI INT A disabled
>> INFO: task events/0:67 blocked for more than 120 seconds.
>>       Tainted: P           ---------------    2.6.32-504.16.2.el6.x86_64 #1
>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> events/0      D 0000000000000000     0    67      2 0x00000000
>>  ffff881029697b40 0000000000000046 0000000000000000 ffff881000000000
>>  ffff881029697a90 ffff881029697a90 0000f1546265231e ffff881029697ae0
>>  ffff88100af7e520 000000010fcaaf92 ffff881029695ad8 ffff881029697fd8
>> Call Trace:
>>  [<ffffffff8152ad25>] schedule_timeout+0x215/0x2e0
>>  [<ffffffff8152a9a3>] wait_for_common+0x123/0x180
>>  [<ffffffff81064bc0>] ? default_wake_function+0x0/0x20
>>  [<ffffffff8152aabd>] wait_for_completion+0x1d/0x20
>>  [<ffffffffa037b0b3>] ib_uverbs_remove_one+0x73/0xa0 [ib_uverbs]
>>  [<ffffffffa0080e5f>] ib_unregister_device+0x4f/0x100 [ib_core]
>>  [<ffffffffa01354e1>] mlx4_ib_remove+0x41/0x270 [mlx4_ib]
>>  [<ffffffffa00ef5d1>] mlx4_remove_device+0x71/0x90 [mlx4_core]
>>  [<ffffffffa00ef633>] mlx4_unregister_device+0x43/0x90 [mlx4_core]
>>  [<ffffffffa00f17ff>] mlx4_unload_one+0xdf/0x390 [mlx4_core]
>>  [<ffffffff810c7e50>] ? do_poweroff+0x0/0x10
>>  [<ffffffff812af4bc>] pci_device_shutdown+0x2c/0x50
>>  [<ffffffff81367dfb>] device_shutdown+0x4b/0x120
>>  [<ffffffff81095eb6>] kernel_shutdown_prepare+0x36/0x40
>>  [<ffffffff81095ed3>] kernel_power_off+0x13/0x50
>>  [<ffffffff810c7e5e>] do_poweroff+0xe/0x10
>>  [<ffffffff81098090>] worker_thread+0x170/0x2a0
>>  [<ffffffff8109ebb0>] ? autoremove_wake_function+0x0/0x40
>>  [<ffffffff81097f20>] ? worker_thread+0x0/0x2a0
>>  [<ffffffff8109e71e>] kthread+0x9e/0xc0
>>  [<ffffffff8100c20a>] child_rip+0xa/0x20
>>  [<ffffffff8109e680>] ? kthread+0x0/0xc0
>>  [<ffffffff8100c200>] ? child_rip+0x0/0x20
>>
>> The node will stay powered on indefinitely at this point, unless I kill
>> ibacm. Once ibacm is gone, I see the expected "Power down." message and the
>> node powers off right away.
>>
>> Some of our own processes were also preventing the poweroff from
>> happening. We had to add handling for RDMA_CM_EVENT_DEVICE_REMOVAL to get
>> around it -- we just exit immediately when we see that. I don't know how to
>> do the equivalent for ibacm though.
>>
>> In general it seems wrong that a "power off right now" mechanism should
>> be defeated by userspace processes like this. I thought about removing the
>> 'shutdown' hook in our mlx4 driver, as shown below, going back to how it was
>> in OFED 3.12. Is there any downside to that, other than having to maintain
>> my own patch forever?
>>
>> Thanks for any input,
>> -Marcus
>>
>>
>> --- compat-rdma-3.18-1/drivers/net/ethernet/mellanox/mlx4/main.c.orig 2015-11-24 01:15:01.000000000 -0800
>> +++ compat-rdma-3.18-1/drivers/net/ethernet/mellanox/mlx4/main.c 2015-11-24 01:15:01.000000000 -0800
>>
>> @@ -2951,7 +2539,6 @@ static struct pci_driver mlx4_driver = {
>>     .name       = DRV_NAME,
>>     .id_table   = mlx4_pci_table,
>>     .probe      = mlx4_init_one,
>> -   .shutdown   = mlx4_unload_one,
>>     .remove     = mlx4_remove_one,
>>     .err_handler    = &mlx4_err_handler,
>>  };