Linux timer migration analysis
I originally wanted to write up some source-level analysis of Linux timers, but I found a blog post that already covers it in detail, so I am just leaving the link here for easy reference.
linux 内核 定时器(timer)实现机制 (CSDN blog)
Here are a few supplementary notes. A timer may be migrated inside mod_timer(), moving from one CPU to another; the target CPU is chosen by get_target_base().
static inline struct timer_base *
get_target_base(struct timer_base *base, unsigned tflags)
{
#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
	if (static_branch_likely(&timers_migration_enabled) &&
	    !(tflags & TIMER_PINNED))
		return get_timer_cpu_base(tflags, get_nohz_timer_target());
#endif
	return get_timer_this_cpu_base(tflags);
}
When NO_HZ_COMMON is enabled, timer migration is enabled, and the timer is not pinned (TIMER_PINNED is not set), get_timer_cpu_base() is called to obtain a new timer base. The key here is get_nohz_timer_target(), which picks the target CPU.
/*
* In the semi idle case, use the nearest busy CPU for migrating timers
* from an idle CPU. This is good for power-savings.
*
* We don't do similar optimization for completely idle system, as
* selecting an idle CPU will add more delays to the timers than intended
* (as that CPU's timer base may not be uptodate wrt jiffies etc).
*/
int get_nohz_timer_target(void)
{
	int i, cpu = smp_processor_id();
	struct sched_domain *sd;

	/* if the current cpu is already busy, return it directly */
	if (!idle_cpu(cpu) && housekeeping_cpu(cpu, HK_FLAG_TIMER))
		return cpu;

	rcu_read_lock();
	for_each_domain(cpu, sd) {
		/* walk the scheduling domains looking for a busy cpu */
		for_each_cpu(i, sched_domain_span(sd)) {
			if (cpu == i)
				continue;

			if (!idle_cpu(i) && housekeeping_cpu(i, HK_FLAG_TIMER)) {
				cpu = i;
				goto unlock;
			}
		}
	}

	if (!housekeeping_cpu(cpu, HK_FLAG_TIMER))
		cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
unlock:
	rcu_read_unlock();
	return cpu;
}
This essentially looks for a busy CPU near the current one, which is good for power savings. The downside is that if many timers converge on one busy CPU, contention on that CPU's timer base can get worse: the timer base is protected by a spinlock that is taken once the base has been obtained, so the extra contention hurts performance.
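As mentioned above, a pinned timer never takes the migration path in get_target_base(). Below is a minimal sketch of how a driver could pin a timer when setting it up; the names my_timer, my_timer_fn and my_timer_start are hypothetical, and it assumes a kernel new enough to provide the timer_setup() API.

#include <linux/timer.h>
#include <linux/jiffies.h>

/* Hypothetical per-driver timer, for illustration only. */
static struct timer_list my_timer;

static void my_timer_fn(struct timer_list *t)
{
	/* Runs on the CPU the timer was armed on, because it is pinned. */
}

static void my_timer_start(void)
{
	/*
	 * TIMER_PINNED makes get_target_base() fall through to
	 * get_timer_this_cpu_base(), so mod_timer() will not migrate
	 * this timer to another CPU.
	 */
	timer_setup(&my_timer, my_timer_fn, TIMER_PINNED);
	mod_timer(&my_timer, jiffies + HZ);
}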
To disable timer migration system-wide, you can boot with nohz=off; more conveniently: echo 0 > /proc/sys/kernel/timer_migration
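For reference, here is a simplified sketch of how that sysctl feeds the timers_migration_enabled static key checked in get_target_base() above. It is paraphrased, not verbatim kernel code; the helper names (sysctl_timer_migration, timers_update_migration, tick_nohz_active) and exact details vary across kernel versions.

/* Simplified sketch of the timer_migration sysctl plumbing. */
unsigned int sysctl_timer_migration = 1;	/* backs /proc/sys/kernel/timer_migration */

static void timers_update_migration(void)
{
	/*
	 * Migration only makes sense while NO_HZ is active, which is also
	 * why booting with nohz=off effectively disables timer migration.
	 */
	if (sysctl_timer_migration && tick_nohz_active)
		static_branch_enable(&timers_migration_enabled);
	else
		static_branch_disable(&timers_migration_enabled);
}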
One question: if timer migration is prevented and the current CPU is in the nohz state, could a timer expire without being handled in time, given that timers depend on the tick?
The kernel already accounts for this case: when it stops the tick, it programs the next tick based on the pending timers. In other words, even a CPU in nohz state can still get a tick when a timer is due. The relevant code is in tick_nohz_next_event() (excerpt below).
static ktime_t tick_nohz_next_event(struct tick_sched *ts, int cpu)
{
	u64 basemono, next_tick, next_tmr, next_rcu, delta, expires;
	unsigned long seq, basejiff;

	/*
	 * Keep the periodic tick, when RCU, architecture or irq_work
	 * requests it.
	 * Aside of that check whether the local timer softirq is
	 * pending. If so its a bad idea to call get_next_timer_interrupt()
	 * because there is an already expired timer, so it will request
	 * immediate expiry, which rearms the hardware timer with a
	 * minimal delta which brings us back to this place
	 * immediately. Lather, rinse and repeat...
	 */
	if (rcu_needs_cpu(basemono, &next_rcu) || arch_needs_cpu() ||
	    irq_work_needs_cpu() || local_timer_softirq_pending()) {
		next_tick = basemono + TICK_NSEC;
	} else {
		/*
		 * Get the next pending timer. If high resolution
		 * timers are enabled this only takes the timer wheel
		 * timers into account. If high resolution timers are
		 * disabled this also looks at the next expiring
		 * hrtimer.
		 */
		next_tmr = get_next_timer_interrupt(basejiff, basemono);
		ts->next_timer = next_tmr;
		/* Take the next rcu event into account */
		next_tick = next_rcu < next_tmr ? next_rcu : next_tmr;
	}
There is another way to prevent timer migration without risking overdue timers caused by nohz. In Linux, softirqs run in one of two contexts: if they are executed by do_softirq(), they run in softirq context; alternatively, they can be executed by the ksoftirqd thread, in which case they run in thread context. With ksoftirqd running, the CPU is not idle, so its timers are not migrated away and the CPU does not enter the nohz state either. There are several paths that hand softirq processing over to ksoftirqd. For example, on interrupt exit, irq_exit() calls invoke_softirq(), which checks whether forced interrupt threading is enabled and, if so, wakes up the ksoftirqd thread.
static inline void invoke_softirq(void)
{
	if (ksoftirqd_running(local_softirq_pending()))
		return;

	if (!force_irqthreads) {
#ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
		/*
		 * We can safely execute softirq on the current stack if
		 * it is the irq stack, because it should be near empty
		 * at this stage.
		 */
		__do_softirq();
#else
		/*
		 * Otherwise, irq_exit() is called on the task stack that can
		 * be potentially deep already. So call softirq in its own stack
		 * to prevent from any overrun.
		 */
		do_softirq_own_stack();
#endif
	} else {
		wakeup_softirqd();
	}
}
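wakeup_softirqd() itself is small: roughly, it wakes the current CPU's ksoftirqd thread if it is not already running. The sketch below is paraphrased from older kernels (where the field is still tsk->state) and may differ slightly from your kernel version.

static void wakeup_softirqd(void)
{
	/* Interrupts are disabled: no need to stop preemption */
	struct task_struct *tsk = __this_cpu_read(ksoftirqd);

	if (tsk && tsk->state != TASK_RUNNING)
		wake_up_process(tsk);
}

Once ksoftirqd is runnable, the CPU has a runnable task, so idle_cpu() returns false; that keeps timers from migrating away and keeps the CPU out of nohz.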
If threadirqs is added to the kernel boot parameters, force_irqthreads is set to true.
static int __init setup_forced_irqthreads(char *arg)
{
	force_irqthreads = true;
	return 0;
}
early_param("threadirqs", setup_forced_irqthreads);
#endif


