[ofa-general] 2.6.16.46-0.12-SLERT-10-15: scheduling while atomic ?
Vincent Ficet
jean-vincent.ficet at bull.net
Wed Feb 11 08:21:13 PST 2009
Hello,
On a Suse real time kernel (2.6.16.46-0.12-SLERT-10-15), we get the
following kernel stack trace while running SDP traffic:
scheduling while atomic: ib_cm/4/0x00000001/18293
Call Trace:
<ffffffff80324eed>{__sched_text_start+125}
<ffffffff801375ca>{lock_timer_base+27}
<ffffffff80327a03>{_spin_unlock_irqrestore+53}
<ffffffff80137dc3>{__mod_timer+439}
<ffffffff80326870>{schedule_timeout+208}
<ffffffff80137fe5>{process_timeout+0}
<ffffffff80327a43>{_spin_unlock_irq+52}
<ffffffff80326232>{wait_for_completion_timeout+127}
<ffffffff80126fd4>{default_wake_function+0}
<ffffffff88515af6>{:mlx4_core:__mlx4_cmd+318}
<ffffffff8851c148>{:mlx4_core:mlx4_mr_free+73}
<ffffffff8852f0b8>{:mlx4_ib:mlx4_ib_dereg_mr+23}
<ffffffff884cfb9b>{:ib_core:ib_dereg_mr+26}
<ffffffff88633592>{:ib_sdp:sdp_destroy_qp+161}
<ffffffff88633c6d>{:ib_sdp:sdp_reset_sk+276}
<ffffffff88637f43>{:ib_sdp:sdp_cma_handler+2008}
<ffffffff885f9224>{:ib_cm:cm_work_handler+0}
<ffffffff886284f4>{:rdma_cm:cma_modify_qp_err+72}
<ffffffff80125876>{__wake_up_common+62}
<ffffffff80327a03>{_spin_unlock_irqrestore+53}
<ffffffff885f9224>{:ib_cm:cm_work_handler+0}
<ffffffff88629c25>{:rdma_cm:cma_ib_handler+369}
<ffffffff885f7e52>{:ib_cm:cm_process_work+26}
<ffffffff885f95fe>{:ib_cm:cm_work_handler+986}
<ffffffff885f9224>{:ib_cm:cm_work_handler+0}
<ffffffff8013f91e>{run_workqueue+154}
<ffffffff80324e76>{__sched_text_start+6}
<ffffffff8013ffb2>{worker_thread+0}
<ffffffff8014162a>{keventd_create_kthread+0}
<ffffffff801400ae>{worker_thread+252}
<ffffffff80126fd4>{default_wake_function+0}
<ffffffff8014162a>{keventd_create_kthread+0}
<ffffffff8014190a>{kthread+212}
<ffffffff8015865c>{hracct_exit_syscall+22}
<ffffffff8010bd5e>{child_rip+8}
<ffffffff8014162a>{keventd_create_kthread+0}
<ffffffff80141836>{kthread+0}
<ffffffff8010bd56>{child_rip+0}
The OFA kernel package in place is:
git://git.openfabrics.org/ofed_1_4/linux-2.6.git ofed_kernel
commit 88ab7955605c5e769e760f6bec980e0c2e72aa5c
Looking for the "scheduling while atomic" message in the latest kernel, we see that it was printed out by __schedule_bug in
this function:
/*
* Various schedule()-time debugging checks and statistics:
*/
static inline void schedule_debug(struct task_struct *prev)
{
/*
* Test if we are atomic. Since do_exit() needs to call into
* schedule() atomically, we ignore that path for now.
* Otherwise, whine if we are scheduling when we should not be.
*/
if (unlikely(in_atomic_preempt_off() && !prev->exit_state))
__schedule_bug(prev);
profile_hit(SCHED_PROFILING, __builtin_return_address(0));
schedstat_inc(this_rq(), sched_count);
#ifdef CONFIG_SCHEDSTATS
if (unlikely(prev->lock_depth >= 0)) {
schedstat_inc(this_rq(), bkl_count);
schedstat_inc(prev, sched_info.bkl_count);
}
#endif
}
Any idea as to what is going wrong here ?
Thanks for your help,
Vincent
More information about the general
mailing list