[ofa-general] Re: OFED 1.3 hang in cm_destroy_id()
Or Gerlitz
ogerlitz at voltaire.com
Wed Aug 6 07:01:46 PDT 2008
akepner at sgi.com wrote:
> I've gotten a report of a hang very similar to one reported in:
> http://lists.openfabrics.org/pipermail/general/2008-June/052275.html
>
> Here's the backtrace of the hung ipoib task:
>
> STACK TRACE FOR TASK: 0xe00003600b070000 (ipoib)
>
> 0 schedule+0x26ec [0xa0000001005a12ac]
> 1 wait_for_completion+0x14c [0xa0000001005a198c]
> 2 cm_destroy_id+0x66c [0xa00000021531e72c]
> 3 ib_destroy_cm_id+0x2c [0xa0000002153210cc]
> 4 ipoib_cm_tx_reap+0x17c [0xa000000215719abc]
> 5 run_workqueue+0x1dc [0xa0000001000c7f1c]
> 6 worker_thread+0x1bc [0xa0000001000c963c]
> 7 kthread+0x23c [0xa0000001000d39dc]
> 8 kernel_thread_helper+0xcc [0xa0000001000133ec]
> 9 start_kernel_thread+0x1c [0xa0000001000094bc]
>
>
> The cm_id->state is IB_CM_TIMEWAIT, and the refcount is 1.
>
> This is an ia64 system with an MT23108, running OFED 1.3.
>
> Haven't yet been able to reproduce this, however.
>
Hi Arthur,
Given the large set of patches that found their way into OFED 1.3
without passing through kernel acceptance, and the lack of support for
OFED from the Linux IB maintainer, I truly believe that the most
constructive approach to debugging ipoib issues is to try to reproduce the
bug on the mainline kernel code and work with the mainstream kernel IB
maintainer.
Or.