BUG: scheduling while atomic 分析Linux内核

✅作者简介:嵌入式领域优质创作者,博客专家
✨个人主页:咸鱼弟
🔥系列专栏:Linux专栏 

遇到该问题的背景是在修改一个模块时,写了一个定时器,在定时器中断处理函数中处理了一些复杂而又耗时的任务,导致出现的该BUG,从而导致系统重启。

见如下打印

<3>[26578.636839] C1 [      swapper/1] BUG: scheduling while atomic: swapper/1/0/0x00000002
<6>[26578.636869] C0 [    kworker/u:1] CPU1 is up
<4>[26578.636900] C1 [      swapper/1] Modules linked in: bcm15500_i2c_ts
<4>[26578.636961] C1 [      swapper/1] [<c00146d0>] (unwind_backtrace+0x0/0x11c) from [<c0602684>] (__schedule+0x70/0x6e0)
<4>[26578.636991] C1 [      swapper/1] [<c0602684>] (__schedule+0x70/0x6e0) from [<c06030ec>] (schedule_preempt_disabled+0x14/0x20)
<4>[26578.637052] C1 [      swapper/1] [<c06030ec>] (schedule_preempt_disabled+0x14/0x20) from [<c000f05c>] (cpu_idle+0xf0/0x104)
<4>[26578.637083] C1 [      swapper/1] [<c000f05c>] (cpu_idle+0xf0/0x104) from [<c05e98e0>] (cpu_die+0x2c/0x5c)
<3>[26578.637510] C1 [      swapper/1] BUG: scheduling while atomic: swapper/1/0/0x00000002
<4>[26578.637510] C1 [      swapper/1] Modules linked in: bcm15500_i2c_ts
<4>[26578.637602] C1 [      swapper/1] [<c00146d0>] (unwind_backtrace+0x0/0x11c) from [<c0602684>] (__schedule+0x70/0x6e0)
<4>[26578.637663] C1 [      swapper/1] [<c0602684>] (__schedule+0x70/0x6e0) from [<c06030ec>] (schedule_preempt_disabled+0x14/0x20)
<4>[26578.637724] C1 [      swapper/1] [<c06030ec>] (schedule_preempt_disabled+0x14/0x20) from [<c000f05c>] (cpu_idle+0xf0/0x104)
<4>[26578.637754] C1 [      swapper/1] [<c000f05c>] (cpu_idle+0xf0/0x104) from [<c05e98e0>] (cpu_die+0x2c/0x5c)
<3>[26578.648069] C1 [      swapper/1] BUG: scheduling while atomic: swapper/1/0/0x00000002

 该打印是在如下代码中

/*
 * Print scheduling while atomic bug:
 */
static noinline void __schedule_bug(struct task_struct *prev)
{
	if (oops_in_progress)
		return;
 
	printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
		prev->comm, prev->pid, preempt_count());
 
	debug_show_held_locks(prev);
	print_modules();
	if (irqs_disabled())
		print_irqtrace_events(prev);
 
	dump_stack();
}
 
/*
 * Various schedule()-time debugging checks and statistics:
 */
static inline void schedule_debug(struct task_struct *prev)
{
	/*
	 * Test if we are atomic. Since do_exit() needs to call into
	 * schedule() atomically, we ignore that path for now.
	 * Otherwise, whine if we are scheduling when we should not be.
	 */
	if (unlikely(in_atomic_preempt_off() && !prev->exit_state))
		__schedule_bug(prev);
	rcu_sleep_check();
 
	profile_hit(SCHED_PROFILING, __builtin_return_address(0));
 
	schedstat_inc(this_rq(), sched_count);
}

再上一级函数

/*
 * __schedule() is the main scheduler function.
 */
static void __sched __schedule(void)
{
	struct task_struct *prev, *next;
	unsigned long *switch_count;
	struct rq *rq;
	int cpu;
 
need_resched:
	preempt_disable();
	cpu = smp_processor_id();
	rq = cpu_rq(cpu);
	rcu_note_context_switch(cpu);
	prev = rq->curr;
 
	schedule_debug(prev);
    ....
}

可以看出, 满足如下条件将会打印该出错信息

unlikely(in_atomic_preempt_off() && !prev->exit_state

为0表示TASK_RUNNING状态,当前进程在运行; 并且处于原子状态,,那么就不能切换给其它的进程 

Linux/include/linux/sched.h 
 
/*
 * Task state bitmask. NOTE! These bits are also
 * encoded in fs/proc/array.c: get_task_state().
 *
 * We have two separate sets of flags: task->state
 * is about runnability, while task->exit_state are
 * about the task exiting. Confusing, but this way
 * modifying one set can't modify the other one by
 * mistake.
 */
#define TASK_RUNNING		0
#define TASK_INTERRUPTIBLE	1
#define TASK_UNINTERRUPTIBLE	2
#define __TASK_STOPPED		4
#define __TASK_TRACED		8
/* in tsk->exit_state */
#define EXIT_ZOMBIE		16
#define EXIT_DEAD		32
/* in tsk->state again */
#define TASK_DEAD		64
#define TASK_WAKEKILL		128
#define TASK_WAKING		256
#define TASK_STATE_MAX		512

kernel/include/linux/hardirq.h
 
#if defined(CONFIG_PREEMPT_COUNT)
# define PREEMPT_CHECK_OFFSET 1
#else
# define PREEMPT_CHECK_OFFSET 0
#endif
 
/*
 * Are we running in atomic context?  WARNING: this macro cannot
 * always detect atomic context; in particular, it cannot know about
 * held spinlocks in non-preemptible kernels.  Thus it should not be
 * used in the general case to determine whether sleeping is possible.
 * Do not use in_atomic() in driver code.
 */
#define in_atomic()	((preempt_count() & ~PREEMPT_ACTIVE) != 0)
 
/*
 * Check whether we were atomic before we did preempt_disable():
 * (used by the scheduler, *after* releasing the kernel lock)
 */
#define in_atomic_preempt_off() \
		((preempt_count() & ~PREEMPT_ACTIVE) != PREEMPT_CHECK_OFFSET)

通过上述分析我们得出结论:

linux内核打印"BUG: scheduling while atomic"和"bad: scheduling from the idle thread"错误的时候,通常是在中断处理函数中调用了可以休眠的函数,如semaphore,mutex,sleep之类的可休眠的函数,

我在定时器中断到来时,处理函数中有sleep的操作,而linux内核要求在中断处理的时候,不允许系统调度,不允许抢占,要等到中断处理完成才能做其他事情,从而导致该bug。因此,要充分考虑中断处理的时间,一定不能太久。

那遇到这种问题是如何解决的?

可以使用工作队列,将这些复杂耗时的任务放到工作队列中去处理,也就是当中断到来时,把原来直接处理的复杂任务放到工作队列中去处理。

工作队列的使用可参考 Linux内核工作队列(workqueue)详解

  • 0
    点赞
  • 6
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
Take container cluster management to the next level; learn how to administer and configure Kubernetes on CoreOS; and apply suitable management design patterns such as Configmaps, Autoscaling, elastic resource usage, and high availability. Some of the other features discussed are logging, scheduling, rolling updates, volumes, service types, and multiple cloud provider zones. The atomic unit of modular container service in Kubernetes is a Pod, which is a group of containers with a common filesystem and networking. The Kubernetes Pod abstraction enables design patterns for containerized applications similar to object-oriented design patterns. Containers provide some of the same benefits as software objects such as modularity or packaging, abstraction, and reuse. CoreOS Linux is used in the majority of the chapters and other platforms discussed are CentOS with OpenShift, Debian 8 (jessie) on AWS, and Debian 7 for Google Container Engine. CoreOS is the main focus becayse Docker is pre-installed on CoreOS out-of-the-box. CoreOS: Supports most cloud providers (including Amazon AWS EC2 and Google Cloud Platform) and virtualization platforms (such as VMWare and VirtualBox) Provides Cloud-Config for declaratively configuring for OS items such as network configuration (flannel), storage (etcd), and user accounts Provides a production-level infrastructure for containerized applications including automation, security, and scalability Leads the drive for container industry standards and founded appc Provides the most advanced container registry, Quay Docker was made available as open source in March 2013 and has become the most commonly used containerization platform. Kubernetes was open-sourced in June 2014 and has become the most widely used container cluster manager. The first stable version of CoreOS Linux was made available in July 2014 and since has become one of the most commonly used operating system for containers. What You'll Learn Use Kubernetes with Docker Create a Kubernetes cluster on CoreOS on AWS Apply cluster management design patterns Use multiple cloud provider zones Work with Kubernetes and tools like Ansible Discover the Kubernetes-based PaaS platform OpenShift Create a high availability website Build a high availability Kubernetes master cluster Use volumes, configmaps, services, autoscaling, and rolling updates Manage compute resources Configure logging and scheduling Who This Book Is For Linux admins, CoreOS admins, application developers, and container as a service (CAAS) developers. Some pre-requisite knowledge of Linux and Docker is required. Introductory knowledge of Kubernetes is required such as creating a cluster, creating a Pod, creating a service, and creating and scaling a replication controller. For introductory Docker and Kubernetes information, refer to Pro Docker (Apress) and Kubernetes Microservices with Docker (Apress). Some pre-requisite knowledge about using Amazon Web Services (AWS) EC2, CloudFormation, and VPC is also required.

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

咸鱼弟

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值