watchdog (Part 1)
Why this note?
A: With the watchdog disabled and panic_timeout set, the system still rebooted at X. So I decided to look into the watchdog myself. version: 3.14, smp
Read lockup-watchdog.txt
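softlockup: code holds a CPU continuously for more than 20s (2 * watchdog_thresh) while interrupts can still be serviced, so other tasks cannot get that CPU.
hardlockup: code holds a CPU continuously for more than 10s (watchdog_thresh; 1.2 * watchdog_thresh as sampled here, since the check runs on every third tick of 0.4 * watchdog_thresh), so (other) interrupts cannot get that CPU.
If the kernel is configured to call panic() in either case, the system will reboot when panic_timeout > 0.
softlockup & hardlockup detection is built on top of the hrtimer & perf subsystems (both of which need hardware support).
A periodic hrtimer (.function = watchdog_timer_fn, ._softexpires = watchdog_thresh * 2 * NSEC_PER_SEC / 5) is guaranteed to run as long as interrupts can be serviced. Each run of watchdog_timer_fn does hrtimer_interrupts[cpu]++.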
An NMI perf event is generated every "watchdog_thresh" seconds to check for hardlockups. If any CPU in the system does not receive any hrtimer interrupt during that time, the "hardlockup detector" will generate a kernel warning or call panic.
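(Note: from this, _softexpires = watchdog_thresh * 2 / 5 seconds. So every 0.4 * watchdog_thresh, if the hrtimer is serviced, hrtimer_interrupts[cpu]++ happens, which avoids false hardlockup reports. The idea is to raise the sampling rate and only run the check on every third tick.)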
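The timing in the note above can be checked with a little arithmetic. The following is a user-space sketch, not kernel code: the helper names are hypothetical, and only the formula watchdog_thresh * 2 * NSEC_PER_SEC / 5 and the every-third-tick rule are taken from the text.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Soft-lockup threshold in seconds: 2 * watchdog_thresh. */
static unsigned int softlockup_thresh_sec(unsigned int watchdog_thresh)
{
    return watchdog_thresh * 2;
}

/* hrtimer period in ns: watchdog_thresh * 2 * NSEC_PER_SEC / 5. */
static uint64_t sample_period_ns(unsigned int watchdog_thresh)
{
    assert(watchdog_thresh > 0);
    return (uint64_t)softlockup_thresh_sec(watchdog_thresh) * NSEC_PER_SEC / 5;
}

/* Gap between two hardlockup checks when only every 3rd tick checks. */
static uint64_t hardlockup_check_interval_ns(unsigned int watchdog_thresh)
{
    return 3 * sample_period_ns(watchdog_thresh);
}
```

With watchdog_thresh = 10 this gives a 4 s timer period and 12 s between hardlockup checks, i.e. 1.2 * watchdog_thresh.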
The watchdog task is a high priority kernel thread that updates a timestamp every time it is scheduled. If that timestamp is not updated for 2*watchdog_thresh seconds, the 'softlockup detector' will dump useful info to system log, then panic or exec other code.
At its end, panic() calls:
- panic_timeout > 0: touch_nmi_watchdog()
- panic_timeout = 0: touch_softlockup_watchdog()
touch_softlockup_watchdog
|--->__this_cpu_write(watchdog_touch_ts, 0);
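watchdog_touch_ts is a timestamp. It is refreshed each time the hrtimer fires (the timer wakes the watchdog thread, which updates it). Consider this case: after watchdog_touch_ts was last updated, the hrtimer event is, for whatever reason, not serviced for a long time, and then suddenly runs again. During that unexpected run of watchdog_timer_fn, is_softlockup() finds that the current timestamp > watchdog_touch_ts + 2 * watchdog_thresh, so it concludes that a softlockup has occurred. Sometimes, however, a task really does need to hold the CPU for a long time, so a special marker is needed: setting watchdog_touch_ts to 0 makes that run of watchdog_timer_fn skip the softlockup check.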
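The role of the 0 marker is easier to see in a small model. The following is a user-space sketch under stated assumptions: struct soft_wd and timer_tick are hypothetical names, time is plain seconds, and the comparison uses a simple > where the kernel uses a wraparound-safe helper.

```c
#include <assert.h>

struct soft_wd {
    unsigned long touch_ts;        /* last touch time; 0 means "skip next check" */
    unsigned int watchdog_thresh;  /* seconds */
};

/* Returns the stall length in seconds if it exceeds 2 * watchdog_thresh, else 0. */
static unsigned long is_softlockup(const struct soft_wd *wd, unsigned long now)
{
    if (now > wd->touch_ts + 2UL * wd->watchdog_thresh)
        return now - wd->touch_ts;
    return 0;
}

/* The "special marker": tell the detector to skip its next check. */
static void touch_softlockup_watchdog(struct soft_wd *wd)
{
    wd->touch_ts = 0;
}

/* One timer tick (model of the soft-lockup part of watchdog_timer_fn). */
static unsigned long timer_tick(struct soft_wd *wd, unsigned long now)
{
    assert(now > 0);
    if (wd->touch_ts == 0) {
        wd->touch_ts = now;  /* re-arm with a fresh timestamp, report nothing */
        return 0;
    }
    return is_softlockup(wd, now);
}
```

A tick that arrives 21 s after the last touch (with watchdog_thresh = 10) reports a stall, while the same late tick right after touch_softlockup_watchdog() only re-arms the timestamp.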
touch_nmi_watchdog
|---->per_cpu(watchdog_nmi_touch,cpu) = true;
|---->touch_softlockup_watchdog();
Setting watchdog_nmi_touch[cpu] to true here works like watchdog_touch_ts[cpu] = 0: the hardlockup check is skipped during the next run of watchdog_check_hardlockup_other_cpu.
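A minimal model of that skip flag, assuming the hardlockup test is "the hrtimer counter has not moved since the previous check". struct hard_wd and check_hardlockup are hypothetical names for this sketch, and the per-CPU variables are collapsed into one struct.

```c
#include <assert.h>
#include <stdbool.h>

struct hard_wd {
    unsigned long hrtimer_interrupts;        /* bumped by the timer callback */
    unsigned long hrtimer_interrupts_saved;  /* value seen at the previous check */
    bool nmi_touch;                          /* "skip the next check" flag */
};

/* Stuck if the counter did not advance since the last check. */
static bool is_hardlockup(struct hard_wd *wd)
{
    unsigned long hrint = wd->hrtimer_interrupts;

    if (wd->hrtimer_interrupts_saved == hrint)
        return true;
    wd->hrtimer_interrupts_saved = hrint;
    return false;
}

static void touch_nmi_watchdog(struct hard_wd *wd)
{
    wd->nmi_touch = true;
}

/* One hardlockup check: a pending touch consumes the flag and skips the test. */
static bool check_hardlockup(struct hard_wd *wd)
{
    assert(wd);
    if (wd->nmi_touch) {
        wd->nmi_touch = false;
        return false;
    }
    return is_hardlockup(wd);
}
```

The flag only buys one check: if the counter is still frozen at the following check, the lockup is reported.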
Flow:
smpboot_register_percpu_thread
|-->__smpboot_create_thread(plug_thread, cpu);

__smpboot_create_thread(struct smp_hotplug_thread *ht, unsigned int cpu)
|-->struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
|-->struct smpboot_thread_data *td; (kzalloc_node)
|   td->cpu = cpu; td->ht = ht;
|-->tsk = kthread_create_on_cpu(smpboot_thread_fn, td, cpu, ht->thread_comm);
|-->*per_cpu_ptr(ht->store, cpu) = tsk;
|-->ht->create ? ht->create(cpu) : 0;

smpboot_thread_fn(void *data)
|-->struct smpboot_thread_data *td = data;
|-->struct smp_hotplug_thread *ht = td->ht;
|-->while(1) {
|-->    ...
|-->    ht->setup(td->cpu);
|-->    ....
|-->    ht->unpark(td->cpu) ...
|-->    ...
|-->    ht->thread_should_run(td->cpu) ...
|-->    ...
|-->    ht->thread_fn(td->cpu);
|-->}

start_kernel
|-->rest_init();

rest_init
|-->kernel_thread(kernel_init, NULL, CLONE_FS|CLONE_SIGHAND);

kernel_init
|-->kernel_init_freeable();
    |-->lockup_detector_init();
        |-->watchdog_enable_all_cpus(false);

watchdog_enable_all_cpus
|-->smpboot_register_percpu_thread(&watchdog_threads);

watchdog_threads = {
    .store = &softlockup_watchdog,
    .thread_should_run = watchdog_should_run,
    .thread_fn = watchdog,
    .thread_comm = "watchdog/%u",
    .setup = watchdog_enable,    ==> sets up the hrtimer
    .cleanup = watchdog_cleanup,
    .park = watchdog_disable,
    .unpark = watchdog_enable,
}

void __touch_watchdog(void)
|-->watchdog_touch_ts[cpu] = get_timestamp();

void watchdog(unsigned int cpu)
|-->soft_lockup_hrtimer_cnt[cpu] = hrtimer_interrupts[cpu];

int watchdog_should_run(unsigned int cpu)
|-->return hrtimer_interrupts[cpu] != soft_lockup_hrtimer_cnt[cpu];

void watchdog_check_hardlockup_other_cpu()
|-->if (hrtimer_interrupts[cpu] % 3 != 0) return;
|-->next_cpu = watchdog_next_cpu(smp_processor_id());    why next_cpu??
|-->if (per_cpu(watchdog_nmi_touch, next_cpu) == true) {
|       per_cpu(watchdog_nmi_touch, next_cpu) = false; return;
|   }
|-->if (is_hardlockup_other_cpu(next_cpu)) {    // checks whether hrtimer_interrupts[cpu] changed within one watchdog_thresh
|       if (per_cpu(hard_watchdog_warn, next_cpu) == true) return;
|       ... handle hardlockup
|   }
|-->else per_cpu(hard_watchdog_warn, next_cpu) = false;

int is_softlockup(unsigned long touch_ts)
|-->now = get_timestamp();
|-->if touch_ts + 2*watchdog_thresh < now ==> return now - touch_ts
|-->return 0;

watchdog_timer_fn(struct hrtimer *hrtimer)
|-->hrtimer_interrupts[cpu]++;
|-->watchdog_check_hardlockup_other_cpu();
|-->set the next hrtimer expiry
|-->if (touch_ts == 0) ... special case
|-->duration = is_softlockup(touch_ts);
|-->if (duration) ... handle softlockup
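The cross-CPU check in watchdog_check_hardlockup_other_cpu can be modelled in user space. The following is a sketch under stated assumptions: two CPUs only, watchdog_next_cpu implemented as a plain round-robin (the real selection logic is not shown in these notes), and watchdog_timer_tick as a hypothetical stand-in for the hardlockup part of watchdog_timer_fn. One plausible reading of "why next_cpu" is that a CPU stuck with interrupts off never runs its own timer callback, so a neighbour has to notice that its counter stopped moving.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

static unsigned long hrtimer_interrupts[NR_CPUS];
static unsigned long hrtimer_interrupts_saved[NR_CPUS];

/* Assumption: simple round-robin over all CPUs. */
static unsigned int watchdog_next_cpu(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return (cpu + 1) % NR_CPUS;
}

/* The other CPU is stuck if its counter did not move since our last look. */
static bool is_hardlockup_other_cpu(unsigned int cpu)
{
    unsigned long hrint = hrtimer_interrupts[cpu];

    if (hrtimer_interrupts_saved[cpu] == hrint)
        return true;
    hrtimer_interrupts_saved[cpu] = hrint;
    return false;
}

/* One timer tick on `cpu`; returns the id of a CPU found stuck, or -1. */
static int watchdog_timer_tick(unsigned int cpu)
{
    unsigned int next;

    hrtimer_interrupts[cpu]++;
    if (hrtimer_interrupts[cpu] % 3 != 0)
        return -1;              /* only every third tick runs the check */
    next = watchdog_next_cpu(cpu);
    return is_hardlockup_other_cpu(next) ? (int)next : -1;
}
```

If both CPUs tick three times, CPU 0's third tick finds CPU 1 alive; if CPU 1 then stops ticking, CPU 0's sixth tick reports it.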
Open questions:
- An NMI perf event is generated every "watchdog_thresh" seconds to check for hardlockups. ????
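- All of this only detects softlockup and hardlockup. The system can be configured to panic at that point, but with panic_timeout=0, how could the system end up rebooting?
- How does the kernel feed the dog? What does the NMI-->RESET path look like?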