Analysis of the Process Model in the Linux 2.6.34.1 Source Code

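1. The concept of a process

1.1 Process source code

The source code analyzed here is linux-2.6.34.1.

The core process-management declarations are in the file linux-2.6.34.1\include\linux\sched.h.

1.2 Description of a process

A process is one execution of a program with some independent function over a particular set of data; it is the basic unit of resource allocation and scheduling in the system.
Put simply, a process is a program in execution. A process differs from a program in that several processes can execute the same program concurrently, and one process can execute several programs in sequence.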

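1.3 The role of a process

A process consists of the program's code segment, its data, its stack, and a process control block.
Inside the kernel, a process is represented by a data structure that captures the dynamic behavior of a running program and lets the system manage and schedule the programs loaded into main memory.

1.4 Viewing processes

(Figure: output of the ps -A command, which lists all processes, on Ubuntu)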

2. Process-related data structures

2.1 Process management

The Linux kernel manages processes through the task_struct structure (the process descriptor).

struct task_struct {
    volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */
    void *stack;
    atomic_t usage;
    unsigned int flags;    /* per process flags, defined below */
    unsigned int ptrace;

    int lock_depth;        /* BKL lock depth */
        ……

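The structure has a great many members; the sections below concentrate on a few of them:
- Process state (State)
- Process identifier (PID: process identifier)
- Process scheduling information (Scheduling Information)

2.1.1 Process state (State)

The following comment in sched.h makes clear that process state is kept in two separate fields: state (about runnability) and exit_state (about the task exiting).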

/*
 * Task state bitmask. NOTE! These bits are also
 * encoded in fs/proc/array.c: get_task_state().
 *
 * We have two separate sets of flags: task->state
 * is about runnability, while task->exit_state are
 * about the task exiting. Confusing, but this way
 * modifying one set can't modify the other one by
 * mistake.
 */

The possible values of the state member (and of exit_state, where the comments say so) are:

#define TASK_RUNNING            0
#define TASK_INTERRUPTIBLE      1
#define TASK_UNINTERRUPTIBLE    2
#define __TASK_STOPPED          4
#define __TASK_TRACED           8
/* in tsk->exit_state */
#define EXIT_ZOMBIE             16
#define EXIT_DEAD               32
/* in tsk->state again */
#define TASK_DEAD               64
#define TASK_WAKEKILL           128
#define TASK_WAKING             256
#define TASK_STATE_MAX          512

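The states have the following meanings:

TASK_RUNNING (runnable): the process is either executing or ready to execute. Other operating systems often distinguish a running state from a ready state (runnable but not currently scheduled); Linux merges both into the single TASK_RUNNING state.

TASK_INTERRUPTIBLE (interruptible sleep): the process is blocked (sleeping) until some condition becomes true, at which point its state is set back to TASK_RUNNING. A process in this state is suspended while it waits for an event or a resource, and its task_struct is placed on the wait queue for that event. It can also be woken by a signal (raised by an external interrupt or by another process): on receiving a signal it moves from waiting to runnable and is put on a run queue to await scheduling.
TASK_UNINTERRUPTIBLE (uninterruptible sleep): similar to TASK_INTERRUPTIBLE, except that the process does not wake up when a signal arrives. It is typically used while waiting on something that must not be interrupted, such as a particular hardware or system resource, and the process can only be woken explicitly, for example by wake_up().
__TASK_STOPPED (stopped): the process has been suspended so that it can receive some special handling. A process normally enters this state after receiving SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU, and returns to TASK_RUNNING when it receives SIGCONT.
__TASK_TRACED (traced): the process has been stopped and is waiting for the process tracing it, such as a debugger, to act.
EXIT_ZOMBIE (zombie): the process has terminated, but its parent has not yet collected its termination status with wait() or a related system call. During exit almost all of the process's resources are reclaimed; only a few, such as the task_struct itself, remain as an empty shell, hence the name zombie.
EXIT_DEAD: the final state of a process.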

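2.1.2 Process state transition diagram

(Figure: Linux process state transition diagram)

Processes are created through the fork family of system calls; the kernel (or a kernel module) can also create kernel threads with the kernel_thread function.
A Linux process moves between several states, for example ready -> running and running -> stopped.
When a resource request cannot be satisfied, the driver usually puts the current process to sleep and lets the scheduler run another process; the sleeping process is woken and becomes ready again only when the resource it requested is released.
Sleep is either interruptible or uninterruptible; the difference is that a process in interruptible sleep wakes up when it receives a signal.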

2.1.3 Process organization

To record how processes were created, Linux links them through parent-child and sibling relationships.

struct task_struct *real_parent; /* real parent process: the process that created this one */
struct task_struct *parent;      /* recipient of SIGCHLD, wait4() reports: the current ("foster") parent */
struct list_head children;       /* list of my children */
struct list_head sibling;        /* linkage in my parent's children list */

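To distinguish processes from threads, thread groups are used.

#define CLONE_THREAD    0x00010000    /* Same thread group? */

When a new thread is created with this flag, its thread-group leader is the thread-group leader of the process or thread that created it.

To look up a process quickly, a hash table is used.
For scheduling, run queues and wait queues are used, so that processes in different states sit on different queues. Every process in the TASK_RUNNING state is placed on a CPU run queue, and different processes may be on the run queues of different CPUs. Every process in the TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE state is placed on the corresponding wait queue.
Process creation is built on do_fork (found in ./linux/kernel/fork.c)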

long do_fork(unsigned long clone_flags,
          unsigned long stack_start,
          struct pt_regs *regs,
          unsigned long stack_size,
          int __user *parent_tidptr,
          int __user *child_tidptr)
{ ……

Process destruction is built on do_exit (in ./linux/kernel/exit.c)

NORET_TYPE void do_exit(long code)
{
    struct task_struct *tsk = current;
    int group_dead;

    profile_task_exit(tsk);

    WARN_ON(atomic_read(&tsk->fs_excl));

    if (unlikely(in_interrupt()))
        panic("Aiee, killing interrupt handler!");
    if (unlikely(!tsk->pid))
        panic("Attempted to kill the idle task!");

    tracehook_report_exit(&code);

    validate_creds_for_do_exit(tsk);
        ……

2.1.4 Process identifiers (PID)

    pid_t pid;   /* ID the kernel uses to identify this task */
    pid_t tgid;  /* thread group ID, used to implement threads */

```
/*
 * What is struct pid?
 *
 * A struct pid is the kernel's internal notion of a process identifier.
 * It refers to individual tasks, process groups, and sessions. While
 * there are processes attached to it the struct pid lives in a hash
 * table, so it and then the processes that it refers to can be found
 * quickly from the numeric pid value. The attached processes may be
 * quickly accessed by following pointers from struct pid.
 */
struct pid
{
    atomic_t count;
    unsigned int level;
    /* lists of tasks that use this pid */
    struct hlist_head tasks[PIDTYPE_MAX];
    struct rcu_head rcu;
    struct upid numbers[1];
};
```
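Because processes and process descriptors correspond strictly one to one, every process has a unique identifier by which the kernel tells processes apart. The PID is also the interface the kernel offers to user programs: a user program directs operations at a process by naming its PID.
A PID is stored in a pid_t, which on Linux is a signed 32-bit integer, and PIDs are numbered sequentially: a newly created process normally gets the previous process's PID plus one. However, for compatibility with traditional UNIX systems developed for 16-bit hardware, the default maximum PID on Linux is 32767; once that value has been handed out, the kernel must start reusing PIDs that are no longer in use. On 64-bit systems the limit can be raised (through /proc/sys/kernel/pid_max) so that PIDs go as high as 4194303.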
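POSIX requires all threads of a multithreaded application to share the same PID. The Linux kernel implements threads as lightweight processes, but each lightweight process is still a process with its own distinct PID. To reconcile the two, the kernel added the tgid field to the process descriptor. All threads in a thread group share the PID of the group leader, that is, of the first lightweight process in the group, and this value is stored in the tgid field of each thread's descriptor.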

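2.1.5 Process scheduling

1. What process scheduling means

An operating system cannot support multiple processes without process scheduling. Scheduling applies to processes in the TASK_RUNNING state: a process must be runnable to be scheduled. The scheduler decides which process runs, when it runs, and for how long. It is the kernel subsystem that divides the finite resource of processor time among the runnable processes.

2. Managing process scheduling

The operating system needs a management unit that schedules processes and decides who uses the CPU next; that unit is the process scheduler. Its job is to distribute CPU time sensibly among the running processes. The classification of processes below helps explain what kind of model process scheduling follows.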

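I/O-bound: most of the time the CPU is waiting for I/O (disk or memory) reads and writes. Such a process usually spends a long time waiting for I/O operations to complete, and CPU load stays low.
CPU-bound: compute-intensive. In contrast to the I/O-bound type, such a process needs a great deal of CPU time for computation, while its I/O reads and writes finish quickly; CPU load is at 100% most of the time. For example, a program that computes pi to a thousand decimal places spends most of its run time evaluating trigonometric functions and square roots, so it is CPU-bound.
Interactive processes: as the name suggests, these interact with the user frequently. They may spend a long time waiting for user input but must respond within a very short time, as with graphical applications and shells.
Batch processes: these need no user interaction, usually run to completion in the background, and do not require immediate response. Typical examples are compiling programs and scientific computing.
Real-time processes: these need short response times, have real-time requirements, and must not be preempted by lower-priority processes. Examples are video or audio playback and machine control.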

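Taking the requirements of these types together, the scheduler should satisfy the following:

1. The CPU time the scheduler hands out must not be too long. If it is, each process may run to completion in a single turn, which delays the response of other programs, makes fairness hard to guarantee, and leaves little of the notion of concurrency.

2. The time the scheduler hands out must not be too short either. Every scheduling decision causes a context switch, and frequent switching makes the overhead very high. For example, if a switch takes 1 ms and the time slice is also 1 ms, then half of the CPU time goes to switching instead of running processes.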

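3. Process scheduling priority

To coordinate the "simultaneous" execution of many processes, each process can be given a priority that directly determines execution order. If several processes are runnable at the same time, their priorities are compared and the one with the highest priority runs first. Linux computes a process's priority with a specific algorithm and expresses it as a single value that indicates how the process should be assigned to the CPU. In general a priority is assigned in one of two ways:

- It is specified by the user program.
- It is adjusted dynamically by the kernel's scheduler: a process that has not been given the CPU for a long time has its priority raised, and one that has already run on the CPU for a long time has its priority lowered.

Linux expresses the priority of ordinary processes as a nice value ranging from -20 to +19, with a default of 0; a larger nice value means a lower priority.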

4. Scheduler classes

struct sched_class {  /* the functions a scheduler must provide; every concrete scheduler class implements them */
    const struct sched_class *next;

    void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup,
                  bool head); /* add a process to the ready queue; happens when a process becomes ready (runnable) */
    void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep); /* inverse of enqueue_task; happens when a process goes from runnable to blocked */
    void (*yield_task) (struct rq *rq); /* the process voluntarily gives up the CPU */

    void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags);

    struct task_struct * (*pick_next_task) (struct rq *rq); /* choose the next process to run; called by the scheduler when scheduling */
    void (*put_prev_task) (struct rq *rq, struct task_struct *p);

#ifdef CONFIG_SMP
    int  (*select_task_rq)(struct task_struct *p, int sd_flag, int flags);

    void (*pre_schedule) (struct rq *this_rq, struct task_struct *task);
    void (*post_schedule) (struct rq *this_rq);
    void (*task_waking) (struct rq *this_rq, struct task_struct *task);
    void (*task_woken) (struct rq *this_rq, struct task_struct *task);

    void (*set_cpus_allowed)(struct task_struct *p,
                 const struct cpumask *newmask);

    void (*rq_online)(struct rq *rq);
    void (*rq_offline)(struct rq *rq);
#endif

    void (*set_curr_task) (struct rq *rq); /* must be called when a process's scheduling policy changes */
    void (*task_tick) (struct rq *rq, struct task_struct *p, int queued); /* called by the periodic scheduler each time it is activated */
    void (*task_fork) (struct task_struct *p); /* links the fork system call to the scheduler; called to notify the scheduler whenever a new process is created */

    void (*switched_from) (struct rq *this_rq, struct task_struct *task,
                   int running);
    void (*switched_to) (struct rq *this_rq, struct task_struct *task,
                 int running);
    void (*prio_changed) (struct rq *this_rq, struct task_struct *task,
                 int oldprio, int running);

    unsigned int (*get_rr_interval) (struct rq *rq,
                     struct task_struct *task);

#ifdef CONFIG_FAIR_GROUP_SCHED
    void (*moved_group) (struct task_struct *p, int on_rq);
#endif
};

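Scheduler classes are implemented per scheduling policy. Newer kernels define the five classes listed below; of these, 2.6.34.1 itself contains only rt_sched_class, fair_sched_class, and idle_sched_class, because stop_sched_class and dl_sched_class were added in later releases.
- stop_sched_class: used by cpu_stop_cpu_callback to migrate tasks between CPUs and to shut tasks down under HOTPLUG_CPU; it does not schedule ordinary processes.
- dl_sched_class: schedules real-time processes with the EDF (earliest deadline first) algorithm; its scheduling policy is SCHED_DEADLINE.
- rt_sched_class: schedules real-time processes with the Round-Robin or FIFO algorithm, as selected by the process's task_struct->policy; its scheduling policies are SCHED_FIFO and SCHED_RR.
- fair_sched_class: schedules ordinary non-real-time processes with the CFS algorithm; its scheduling policies are SCHED_NORMAL and SCHED_BATCH.
- idle_sched_class: schedules the idle task. The first thread on each CPU, swapper with pid 0, is a static thread that belongs to idle_sched_class, which is why it does not show up in ps. It runs when the CPU has nothing else to do, for example during boot. Note that processes using the SCHED_IDLE policy are handled by fair_sched_class with a very low weight, not by idle_sched_class.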
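The scheduling process:
- Clean up after the currently running process (prev)
- Select the next process to run (next)
- Set up the new execution environment
- Perform the context switch
- Post-switch cleanup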

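5. Scheduling algorithm (using CFS as the example)

5.1 The CFS scheduler

CFS (Completely Fair Scheduler) is the scheduler that was eventually adopted by the kernel. It borrows the idea of complete fairness from RSDL/SD: it no longer tracks process sleep time and no longer tries to identify interactive processes. It treats all processes uniformly, which is what "fair" means here. It serves the SCHED_NORMAL and SCHED_BATCH scheduling policies set by applications.

5.2 Core idea

CFS models an ideal, precise multitasking CPU on real hardware. This ideal CPU runs at 100% load and runs every task in parallel at exactly equal speed: with n runnable tasks, each task runs at 1/n of the CPU's total capacity.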

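The run time given to a process is: run time allocated to the process = scheduling period * process weight / sum of the weights of all processes
vruntime = actual run time * 1024 / process weight
Substituting the first formula into the second: the vruntime a process accumulates over one scheduling period = scheduling period * 1024 / total weight of all processes, which does not depend on the process's own weight.
CFS uses vruntime (a process's virtual run time) to compare how long processes have run and to choose among them fairly.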

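5.3 How fairness shows up

① Process types are no longer distinguished; all processes are treated fairly, though not with absolute equality

② I/O-bound processes still get fast response (sleeping processes receive time compensation)

③ Higher-priority processes get more CPU time (their vruntime grows more slowly)

5.4 How it works

The CFS scheduler uses an appeasement policy to ensure fairness. When a task enters the run queue, the current time is recorded. While a process waits for the CPU, its wait_runtime value is increased by an amount that depends on the number of processes currently in the run queue; the priorities of the different tasks are also taken into account in these calculations. Once the task is scheduled onto the CPU, its wait_runtime value starts to decrease, and when it has dropped far enough that another task becomes the leftmost task of the red-black tree, the current task is preempted. In this way CFS strives for an ideal state in which wait_runtime is 0. This description follows the original CFS design; in 2.6.34.1 struct sched_entity has no wait_runtime field, and the same bookkeeping is carried by vruntime.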

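5.5 Related structures

The scheduling entity, sched_entity, represents one schedulable unit; it is what lets the scheduler schedule either a single task or a group of tasks. The scheduler class is sched_class.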

struct sched_entity {
    struct load_weight    load;        /* for load-balancing */ /* weight of this process */
    struct rb_node        run_node;    /* node in the red-black tree */
    struct list_head      group_node;
    unsigned int          on_rq;       /* set to 1 while the process is on the ready queue */

    u64            exec_start;             /* time at which the current run began */
    u64            sum_exec_runtime;       /* total real time the process has run */
    u64            vruntime;               /* virtual run time of the process */
    u64            prev_sum_exec_runtime;  /* total real run time as of the previous time the process was put on the CPU */
    u64            last_wakeup;
    u64            avg_overlap;

    u64            nr_migrations;

    u64            start_runtime;
    u64            avg_wakeup;

#ifdef CONFIG_SCHEDSTATS
    u64            wait_start;
    u64            wait_max;
    u64            wait_count;
    u64            wait_sum;
    u64            iowait_count;
    u64            iowait_sum;

    u64            sleep_start;
    u64            sleep_max;
    s64            sum_sleep_runtime;

    u64            block_start;
    u64            block_max;
    u64            exec_max;
    u64            slice_max;

    u64            nr_migrations_cold;
    u64            nr_failed_migrations_affine;
    u64            nr_failed_migrations_running;
    u64            nr_failed_migrations_hot;
    u64            nr_forced_migrations;

    u64            nr_wakeups;
    u64            nr_wakeups_sync;
    u64            nr_wakeups_migrate;
    u64            nr_wakeups_local;
    u64            nr_wakeups_remote;
    u64            nr_wakeups_affine;
    u64            nr_wakeups_affine_attempts;
    u64            nr_wakeups_passive;
    u64            nr_wakeups_idle;
#endif
    ……

struct sched_class {
    const struct sched_class *next; /* points to the next scheduler class */

    void (*enqueue_task) (struct rq *rq, struct task_struct *p, int wakeup,
                  bool head); /* add a process to the ready queue */
    void (*dequeue_task) (struct rq *rq, struct task_struct *p, int sleep); /* remove a process from the ready queue */
    void (*yield_task) (struct rq *rq); /* the current process gives up the CPU and another process from the ready queue is run */

    void (*check_preempt_curr) (struct rq *rq, struct task_struct *p, int flags); /* check whether a process on the ready queue has higher priority than the current process */

    struct task_struct * (*pick_next_task) (struct rq *rq); /* select the highest-priority process from the ready queue */
    void (*put_prev_task) (struct rq *rq, struct task_struct *p); /* update statistics before the current process is switched out */

#ifdef CONFIG_SMP
    int  (*select_task_rq)(struct task_struct *p, int sd_flag, int flags);

    void (*pre_schedule) (struct rq *this_rq, struct task_struct *task);
    void (*post_schedule) (struct rq *this_rq);
    void (*task_waking) (struct rq *this_rq, struct task_struct *task);
    void (*task_woken) (struct rq *this_rq, struct task_struct *task);

    void (*set_cpus_allowed)(struct task_struct *p,
                 const struct cpumask *newmask);

    void (*rq_online)(struct rq *rq);
    void (*rq_offline)(struct rq *rq);
#endif

    void (*set_curr_task) (struct rq *rq);
    void (*task_tick) (struct rq *rq, struct task_struct *p, int queued);
    void (*task_fork) (struct task_struct *p);

    void (*switched_from) (struct rq *this_rq, struct task_struct *task,
                   int running);
    void (*switched_to) (struct rq *this_rq, struct task_struct *task,
                 int running);
    void (*prio_changed) (struct rq *this_rq, struct task_struct *task,
                 int oldprio, int running);

    unsigned int (*get_rr_interval) (struct rq *rq,
                     struct task_struct *task);

#ifdef CONFIG_FAIR_GROUP_SCHED
    void (*moved_group) (struct task_struct *p, int on_rq);
#endif
};

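Scheduling entities are organized in a red-black tree. Every sched_entity is keyed by its vruntime, so the leftmost node of the tree is the one with the smallest vruntime and is the next to be run. Only ready processes waiting for the CPU are in the tree; sleeping processes and the currently running process are not. Each CPU maintains a scheduling queue, cfs_rq, which holds the information about its red-black tree.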

struct cfs_rq {
    struct load_weight load; /* total weight of the processes on this ready queue */
    unsigned long nr_running; /* number of processes on this ready queue */

    u64 exec_clock; /* real-time clock: accumulated actual run time */
    u64 min_vruntime; /* smallest virtual run time on this queue */

    struct rb_root tasks_timeline;
    struct rb_node *rb_leftmost; /* leftmost node of the red-black tree */

    struct list_head tasks;
    struct list_head *balance_iterator;

    /*
     * 'curr' points to currently running entity on this cfs_rq.
     * It is set to NULL otherwise (i.e when none are currently running).
     */
    struct sched_entity *curr, *next, *last;

    unsigned int nr_spread_over;

#ifdef CONFIG_FAIR_GROUP_SCHED
    struct rq *rq;    /* cpu runqueue to which this cfs_rq is attached */

    /*
     * leaf cfs_rqs are those that hold tasks (lowest schedulable entity in
     * a hierarchy). Non-leaf lrqs hold other higher schedulable entities
     * (like users, containers etc.)
     *
     * leaf_cfs_rq_list ties together list of leaf cfs_rq's in a cpu. This
     * list is used during load balance.
     */
    struct list_head leaf_cfs_rq_list;
    struct task_group *tg;    /* group that "owns" this runqueue */

#ifdef CONFIG_SMP
    /*
     * the part of load.weight contributed by tasks
     */
    unsigned long task_weight;

    /*
     *   h_load = weight * f(tg)
     *
     * Where f(tg) is the recursive weight fraction assigned to
     * this group.
     */
    unsigned long h_load;

    /*
     * this cpu's part of tg->shares
     */
    unsigned long shares;

    /*
     * load.weight at the time we set shares
     */
    unsigned long rq_weight;
#endif
#endif
};

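5.6 Scheduling operations

update_curr(): updates the accounting for the current process, including computing how long it has just been executing and accumulating its total run time, and advances the queue's minimum virtual clock (min_vruntime).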

static void update_curr(struct cfs_rq *cfs_rq)
{
    struct sched_entity *curr = cfs_rq->curr;
    u64 now = rq_of(cfs_rq)->clock;
    unsigned long delta_exec;

    if (unlikely(!curr))
        return;

    /*
     * Get the amount of time the current task was running
     * since the last time we changed load (this cannot
     * overflow on 32 bits):
     */
    delta_exec = (unsigned long)(now - curr->exec_start);
    if (!delta_exec)
        return;

    __update_curr(cfs_rq, curr, delta_exec);
    curr->exec_start = now;

    if (entity_is_task(curr)) {
        struct task_struct *curtask = task_of(curr);

        trace_sched_stat_runtime(curtask, delta_exec, curr->vruntime);
        cpuacct_charge(curtask, delta_exec);
        account_group_exec_runtime(curtask, delta_exec);
    }
}

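scheduler_tick(): called on every timer interrupt. Through the scheduler class's task_tick hook it ends up calling update_curr() to refresh the current process's accounting, and it decides whether rescheduling is needed; if so, it requests a reschedule.

schedule(): invoked when a reschedule has been requested or when a process voluntarily gives up the CPU. It puts the current process back on the ready queue if it is still runnable, selects the next process from the ready queue, and switches to it.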

asmlinkage void __sched schedule(void)
{
    struct task_struct *prev, *next;  /* current process, next process */
    unsigned long *switch_count;
    struct rq *rq;
    int cpu;

need_resched:
    preempt_disable();
    cpu = smp_processor_id();
    rq = cpu_rq(cpu);
    rcu_sched_qs(cpu);
    prev = rq->curr;
    switch_count = &prev->nivcsw;

    release_kernel_lock(prev);
need_resched_nonpreemptible:

    schedule_debug(prev);

    if (sched_feat(HRTICK))
        hrtick_clear(rq);

    raw_spin_lock_irq(&rq->lock);
    update_rq_clock(rq);
    clear_tsk_need_resched(prev);

    if (prev->state && !(preempt_count() & PREEMPT_ACTIVE)) {
        if (unlikely(signal_pending_state(prev->state, prev)))
            prev->state = TASK_RUNNING;
        else
            deactivate_task(rq, prev, 1);
        switch_count = &prev->nvcsw;
    }

    pre_schedule(rq, prev);

    if (unlikely(!rq->nr_running))
        idle_balance(cpu, rq);

    put_prev_task(rq, prev);
    next = pick_next_task(rq);

    if (likely(prev != next)) {
        sched_info_switch(prev, next);
        perf_event_task_sched_out(prev, next);

        rq->nr_switches++;
        rq->curr = next;
        ++*switch_count;
        /* perform the process switch */
        context_switch(rq, prev, next); /* unlocks the rq */
    ……

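3. Views on the operating system process model

Linux is inherently a multi-process system in which processes run concurrently. As the basic unit of scheduling in Linux, processes govern access to the CPU and other system resources. This gives rise to different types of processes and requires them to be handled according to user behavior or system tasks, which is why so many scheduling policies and scheduling algorithms have emerged and evolved. Only by understanding how the process model is structured and organized can we understand what happens while a program runs: which processes it starts, which third-party processes it communicates with, and so on.

4. References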

https://wenku.baidu.com/view/444184f601f69e31433294cb.html

https://blog.csdn.net/gatieme/article/details/51702662

https://www.cnblogs.com/hanxiaoyu/p/5576277.html

https://mirrors.edge.kernel.org/pub/linux/kernel/v2.6/linux-2.6.34.1.tar.gz

Posted on 2018-04-27 13:23 by 吴毅超