
Linux Interrupt Management (1): The Linux Interrupt Management Mechanism

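Contents:

Linux Interrupt Management

Linux Interrupt Management (1): The Linux Interrupt Management Mechanism

Linux Interrupt Management (2): Softirqs and Tasklets

Linux Interrupt Management (3): The workqueue Mechanism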

 

Keywords: GIC, IAR, EOI, SGI/PPI/SPI, interrupt mapping, interrupt exception vectors, interrupt context, kernel interrupt threads, interrupt registration

 

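Because this chapter is long, here is a brief outline of its contents.

It can be divided into three main parts:

Hardware background: 1. ARM interrupt controller.

The static process at system initialization: GIC initialization and the mapping of each interrupt number (2. Mapping hardware interrupt numbers to Linux interrupt numbers), and the registration of each interrupt (5. Registering interrupts).

The dynamic process of one interrupt, from being raised to being fully handled: how the common ARM low-level code handles it (3. ARM low-level interrupt handling), and the GIC handling flow plus the generic upper-layer handling (4. High-level interrupt handling).

The high-level handling here does not include the bottom half. Bottom halves are covered in Linux Interrupt Management (2): Softirqs and Tasklets and in Linux Interrupt Management (3): The workqueue Mechanism.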

1. ARM interrupt controller

 

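1.1 Interrupt types supported by ARM

ARM GIC-v2 supports three types of interrupts:

SGI: Software Generated Interrupt, typically used for communication between cores. Up to 16 SGIs are supported, with hardware interrupt IDs ID0~ID15. The Linux kernel normally uses SGIs as IPIs (inter-processor interrupts), and they are delivered to the CPUs the system specifies.

PPI: Private Peripheral Interrupt, an interrupt private to each CPU. Up to 16 PPIs are supported, with hardware interrupt IDs ID16~ID31. A PPI is delivered to one specific CPU; a typical use is the CPU-local timer.

SPI: Shared Peripheral Interrupt. Up to 988 peripheral interrupts are supported, with hardware interrupt IDs ID32~ID1019.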
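The three ID ranges above can be expressed as a small classification helper. This is only an illustrative sketch: the enum and the function name gic_classify_hwirq are hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

/* Hypothetical names for illustration; these are not kernel identifiers. */
enum gic_irq_class {
    GIC_CLASS_SGI,
    GIC_CLASS_PPI,
    GIC_CLASS_SPI,
    GIC_CLASS_INVALID
};

/* Classify a GIC-v2 hardware interrupt ID by the ranges listed above. */
static enum gic_irq_class gic_classify_hwirq(uint32_t hwirq)
{
    if (hwirq <= 15)
        return GIC_CLASS_SGI;       /* ID0~ID15 */
    if (hwirq <= 31)
        return GIC_CLASS_PPI;       /* ID16~ID31 */
    if (hwirq <= 1019)
        return GIC_CLASS_SPI;       /* ID32~ID1019 */
    return GIC_CLASS_INVALID;       /* outside the ranges given in the text */
}
```

For example, gic_classify_hwirq(25) yields GIC_CLASS_PPI and gic_classify_hwirq(37) yields GIC_CLASS_SPI.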

 

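1.2 GIC interrupt detection flow

The GIC consists mainly of two parts: the Distributor (the arbitration unit) and the CPU Interface.

The Distributor maintains a state machine for every interrupt, with the states inactive, pending, active and pending, and active.

The state machine is taken from section 3.2.4, Interrupt handling state machine, of the GIC-v2 specification IHI0048B (screenshot):

The GIC detects and handles an interrupt as follows:

(1) When the GIC detects that an interrupt has occurred, it marks the interrupt as pending (A1).

(2) For an interrupt in the pending state, the Distributor determines the target CPU and sends the interrupt request to that CPU.

(3) For each CPU, the Distributor picks the highest-priority interrupt among all pending ones and sends it to the CPU Interface of the target CPU.

(4) The CPU Interface decides whether this interrupt may be forwarded to the CPU. If the interrupt's priority is sufficient, the GIC signals an interrupt to that CPU.

(5) After a CPU takes the interrupt exception, it reads the GICC_IAR register to acknowledge the interrupt (normally the Linux kernel's interrupt handling code performs this read). The register returns the hardware interrupt ID; for an SGI it additionally returns the ID of the source CPU.

     Once the GIC sees that software has read this register, one of the following happens:

     * If the interrupt is in the pending state, its state becomes active. (C)

     * If the interrupt has been raised again, the pending state becomes active and pending. (D)

     * If the interrupt is in the active state, it now becomes active and pending. (A2)

(6) When the processor has finished servicing the interrupt, it must send a completion signal, EOI (End Of Interrupt), to the GIC. Software writes the GICC_EOIR register and the state becomes inactive. (E1)

Additionally:

(7) For a level-triggered interrupt, when the triggering level is removed, the state goes from active and pending to active. (B2)

The common path is A1->D->B2->E1.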
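The transitions described in the steps above can be modelled as a tiny state machine. The following is a minimal sketch covering only the transitions named in the text (A1, A2, B2, C, D, E1); all type and function names are hypothetical, and every other transition is simply left unchanged rather than modelled.

```c
#include <stdbool.h>

/* Hypothetical model of the per-interrupt state kept by the Distributor. */
enum gic_state {
    GIC_INACTIVE,
    GIC_PENDING,
    GIC_ACTIVE,
    GIC_ACTIVE_AND_PENDING
};

/* A1/A2: the interrupt source asserts, which adds the pending state. */
static enum gic_state gic_on_assert(enum gic_state s)
{
    switch (s) {
    case GIC_INACTIVE:
        return GIC_PENDING;              /* A1 */
    case GIC_ACTIVE:
        return GIC_ACTIVE_AND_PENDING;   /* A2 */
    default:
        return s;
    }
}

/*
 * C/D: software reads GICC_IAR. still_asserted models the case where the
 * interrupt is raised again, e.g. a level that is still asserted.
 */
static enum gic_state gic_on_ack(enum gic_state s, bool still_asserted)
{
    if (s != GIC_PENDING)
        return s;
    return still_asserted ? GIC_ACTIVE_AND_PENDING   /* D */
                          : GIC_ACTIVE;              /* C */
}

/* B2: a level-triggered source deasserts while the handler is running. */
static enum gic_state gic_on_deassert(enum gic_state s)
{
    return s == GIC_ACTIVE_AND_PENDING ? GIC_ACTIVE : s;
}

/* E1: software writes GICC_EOIR. */
static enum gic_state gic_on_eoi(enum gic_state s)
{
    return s == GIC_ACTIVE ? GIC_INACTIVE : s;
}
```

Walking the common path, gic_on_assert, then gic_on_ack with still_asserted set, then gic_on_deassert, then gic_on_eoi, takes an interrupt from inactive through pending, active and pending, and active, and back to inactive.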

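1.2.1 GIC interrupt preemption

The GIC supports priority-based preemption: a higher-priority interrupt can preempt a lower-priority interrupt that is in the active state. The Distributor tracks and compares the highest-priority pending interrupt, uses it to preempt the current interrupt, and sends this highest-priority request to the CPU. The CPU acknowledges the higher-priority interrupt, suspends servicing of the lower-priority one, and handles the higher-priority interrupt first.

The GIC always sends the highest-priority pending interrupt request to the CPU.

1.2.2 How Linux handles interrupt preemption

From the GIC's point of view, it sends the higher-priority interrupt request to the CPU.

However, the CPU is running with interrupts disabled at that point, so the request has to wait until the lower-priority interrupt has been fully handled and EOI has been sent to the GIC.

Only then does the CPU respond to the highest-priority interrupt among those pending.

So under Linux:

1. A higher-priority interrupt cannot preempt a lower-priority interrupt that is already executing.

2. Among interrupts that are pending at the same time, the higher-priority one is serviced first.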
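The second rule, picking the highest-priority interrupt among those pending, can be sketched as a simple scan. This is a toy model, not GIC or kernel code: struct toy_irq and toy_highest_pending are hypothetical names. It relies on the GIC convention that a numerically lower priority value means a higher priority.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one interrupt as the arbitration logic sees it. */
struct toy_irq {
    uint32_t id;        /* hardware interrupt ID */
    uint8_t priority;   /* lower value = higher priority (GIC convention) */
    int pending;        /* non-zero if the interrupt is pending */
};

/*
 * Return the index of the pending interrupt with the highest priority,
 * or -1 if nothing is pending.
 */
static int toy_highest_pending(const struct toy_irq *irqs, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (!irqs[i].pending)
            continue;
        if (best < 0 || irqs[i].priority < irqs[best].priority)
            best = (int)i;
    }
    return best;
}
```

With two pending entries where the second has the lower priority value, the function returns the index of the second; once that one is no longer pending, the first is chosen.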

1.3 GIC interrupt timing

 

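Figure B-2, Signaling physical interrupts, from the GIC-400 documentation helps in understanding how the GIC works internally.

M and N are both SPI peripheral interrupts. They are handled through FIQ and are active-high level triggered. N has a higher priority than M, and both target the same CPU.

(1) T1: the GIC Distributor detects the level change of interrupt M.

(2) T2: the Distributor sets the state of interrupt M to pending.

(3) T17: the CPU Interface drives the nFIQCPU[n] signal low. After interrupt M becomes pending, it takes roughly 15 clock cycles before nFIQCPU[n] is driven low to report the interrupt request to the CPU (assertion). The Distributor needs this time to work out which pending interrupt has the highest priority.

(4) T42: the Distributor detects another interrupt, N, with a higher priority.

(5) T43: the Distributor replaces interrupt M with interrupt N as the highest-priority pending interrupt and sets interrupt N to pending.

(6) T58: after tph clock cycles, the CPU Interface drives nFIQCPU[n] low to notify the CPU. Because this signal has already been low since T17, the CPU Interface updates the Interrupt ID field of the GICC_IAR register, which now holds the hardware interrupt number of interrupt N.

(7) T61~T131: Linux services interrupt N. This is the interrupt service section, which begins with the read of GICC_IAR and ends with the write to GICC_EOIR.

  T61: the CPU (the Linux interrupt service routine) reads the GICC_IAR register, which means software has acknowledged interrupt N. The Distributor changes the state of interrupt N from pending to active and pending.

  T64: within 3 clock cycles of Linux acknowledging interrupt N, the CPU Interface deasserts nFIQCPU[n], that is, drives the signal high.

  T126: the peripheral deasserts interrupt N as well.

  T128: the Distributor removes the pending state of interrupt N.

  T131: the Linux service routine writes the hardware ID of interrupt N to the GICC_EOIR register, completing all handling of interrupt N.

(8) T146: tph clock cycles after the ID of interrupt N was written to GICC_EOIR, the Distributor selects the next highest-priority interrupt, which is interrupt M, and sends the request to the CPU Interface. The CPU Interface drives nFIQCPU[n] low to report peripheral M's interrupt request to the CPU.

(9) T211: the Linux interrupt service routine reads the GICC_IAR register to acknowledge the interrupt, and the Distributor sets the state of interrupt M to active and pending.

(10) T214: within 3 clock cycles of the CPU acknowledging the interrupt, the CPU Interface drives nFIQCPU[n] high to complete the deassertion.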

 

So where in Linux are GICC_IAR and GICC_EOIR actually accessed?

 

 

1.4 Cortex-A15/A7 example

 

 

2. Mapping hardware interrupt numbers to Linux interrupt numbers

 

2.1 Hardware interrupt numbers: a serial port interrupt example

 

2.2 Interrupt controller initialization

In the DTS, the GIC is defined in arch/arm/boot/dts/vexpress-v2p-ca15_a7.dts:

    gic: interrupt-controller@2c001000 {
        compatible = "arm,cortex-a15-gic", "arm,cortex-a9-gic"; /* this device is identified as "arm,cortex-a15-gic" */
        #interrupt-cells = <3>;
        #address-cells = <0>;
        interrupt-controller; /* marks this device as an interrupt controller */
        reg = <0 0x2c001000 0 0x1000>,
              <0 0x2c002000 0 0x1000>,
              <0 0x2c004000 0 0x2000>,
              <0 0x2c006000 0 0x2000>;
        interrupts = <1 9 0xf04>;
    };

 

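struct irq_domain describes one interrupt controller.

During initialization the GIC driver parses the DTS to find out how many GIC controllers are defined, and each GIC controller registers one struct irq_domain.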

 

struct irq_domain {
    struct list_head link;                  /* links the irq_domain into the global list irq_domain_list */
    const char *name;                       /* interrupt controller name */
    const struct irq_domain_ops *ops;       /* methods used by irq domain mapping operations */
    void *host_data;
    unsigned int flags;

    /* Optional data */
    struct device_node *of_node;            /* device node of the corresponding interrupt controller */
    struct irq_domain_chip_generic *gc;
#ifdef    CONFIG_IRQ_DOMAIN_HIERARCHY
    struct irq_domain *parent;
#endif

    /* reverse map data. The linear map gets appended to the irq_domain */
    irq_hw_number_t hwirq_max;              /* maximum number of interrupts this irq domain supports */
    unsigned int revmap_direct_max_irq;
    unsigned int revmap_size;               /* size of the linear map */
    struct radix_tree_root revmap_tree;     /* root of the radix tree map */
    unsigned int linear_revmap[];           /* lookup table used by the linear map */
};

 

 

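struct irq_domain_ops defines the set of irq_domain methods. xlate extracts the hardware interrupt number and the interrupt type from intspec: intspec[0] and intspec[1] determine the interrupt number, and intspec[2] determines the interrupt type.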

struct irq_domain_ops {
    int (*match)(struct irq_domain *d, struct device_node *node);
    int (*map)(struct irq_domain *d, unsigned int virq, irq_hw_number_t hw);
    void (*unmap)(struct irq_domain *d, unsigned int virq);
    int (*xlate)(struct irq_domain *d, struct device_node *node,
             const u32 *intspec, unsigned int intsize,
             unsigned long *out_hwirq, unsigned int *out_type);

#ifdef    CONFIG_IRQ_DOMAIN_HIERARCHY
    /* extended V2 interfaces to support hierarchy irq_domains */
    int (*alloc)(struct irq_domain *d, unsigned int virq,
             unsigned int nr_irqs, void *arg);
    void (*free)(struct irq_domain *d, unsigned int virq,
             unsigned int nr_irqs);
    void (*activate)(struct irq_domain *d, struct irq_data *irq_data);
    void (*deactivate)(struct irq_domain *d, struct irq_data *irq_data);
#endif
};

static const struct irq_domain_ops gic_irq_domain_hierarchy_ops = {
    .xlate = gic_irq_domain_xlate,
    .alloc = gic_irq_domain_alloc,
    .free = irq_domain_free_irqs_top,
};

static int gic_irq_domain_xlate(struct irq_domain *d,
                struct device_node *controller,
                const u32 *intspec, unsigned int intsize,
                unsigned long *out_hwirq, unsigned int *out_type)
{
...
    /* Get the interrupt number and add 16 to skip over SGIs */
    *out_hwirq = intspec[1] + 16;   /* first add 16 to skip the SGIs */

    /* For SPIs, we need to add 16 more to get the GIC irq ID number */
    if (!intspec[0]) {              /* SPI: add another 16 to skip the PPIs as well */
        ret = gic_routable_irq_domain_ops->xlate(d, controller,
                             intspec,
                             intsize,
                             out_hwirq,
                             out_type);

        if (IS_ERR_VALUE(ret))
            return ret;
    }

    *out_type = intspec[2] & IRQ_TYPE_SENSE_MASK;   /* trigger type: rising edge, falling edge, high level or low level */

    return ret;
}

static int gic_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
                unsigned int nr_irqs, void *arg)
{
    int i, ret;
    irq_hw_number_t hwirq;
    unsigned int type = IRQ_TYPE_NONE;
    struct of_phandle_args *irq_data = arg;

    ret = gic_irq_domain_xlate(domain, irq_data->np, irq_data->args,
                   irq_data->args_count, &hwirq, &type);    /* first translate args into the hardware interrupt number and type */
    if (ret)
        return ret;

    for (i = 0; i < nr_irqs; i++)
        gic_irq_domain_map(domain, virq + i, hwirq + i);    /* map software to hardware numbers and set struct irq_desc->handle_irq according to the interrupt type */

    return 0;
}

void irq_domain_free_irqs_top(struct irq_domain *domain, unsigned int virq,
                  unsigned int nr_irqs)
{
    int i;

    for (i = 0; i < nr_irqs; i++) {
        irq_set_handler_data(virq + i, NULL);
        irq_set_handler(virq + i, NULL);
    }
    irq_domain_free_irqs_common(domain, virq, nr_irqs);
}

 For SPI interrupts a further offset of 16 is required.

static int gic_routable_irq_domain_xlate(struct irq_domain *d,
                struct device_node *controller,
                const u32 *intspec, unsigned int intsize,
                unsigned long *out_hwirq,
                unsigned int *out_type)
{
    *out_hwirq += 16;
    return 0;
}
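The arithmetic performed by gic_irq_domain_xlate() and gic_routable_irq_domain_xlate() can be condensed into one standalone helper. This is a user-space sketch for illustration only; gic_spec_to_hwirq is a hypothetical name, and error handling and the trigger type are left out.

```c
#include <stdint.h>

/*
 * Hypothetical helper that mirrors the hwirq arithmetic shown above.
 * intspec[0]: 0 for SPI, non-zero for PPI
 * intspec[1]: interrupt number relative to its class
 * intspec[2]: trigger type and flags (not used here)
 */
static unsigned long gic_spec_to_hwirq(const uint32_t intspec[3])
{
    unsigned long hwirq = intspec[1] + 16;  /* skip the 16 SGIs */

    if (intspec[0] == 0)                    /* SPI */
        hwirq += 16;                        /* skip the 16 PPIs as well */
    return hwirq;
}
```

With the GIC node's own interrupts = <1 9 0xf04> this gives hardware interrupt 25 (a PPI), and a specifier of <0 5 4> gives hardware interrupt 37 (an SPI).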

 

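 gic_irq_domain_map() takes a struct irq_domain plus the software and hardware interrupt numbers. It distinguishes two groups: SGI/PPI in one, SPI in the other.

Most of the work is done by irq_domain_set_info(). irq_domain_set_hwirq_and_chip() obtains the struct irq_data from the Linux interrupt number and associates it with the hardware interrupt number and with struct irq_chip gic_chip.

__irq_set_handler() sets the irq_desc->handle_irq callback in the interrupt descriptor; for SPIs this is handle_fasteoi_irq().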

 

static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,
                irq_hw_number_t hw)
{
    if (hw < 32) {
        irq_set_percpu_devid(irq);      /* per-CPU interrupts carry their own special flags */
        irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
                    handle_percpu_devid_irq, NULL, NULL);
        set_irq_flags(irq, IRQF_VALID | IRQF_NOAUTOEN);
    } else {
        irq_domain_set_info(d, irq, hw, &gic_chip, d->host_data,
                    handle_fasteoi_irq, NULL, NULL);
        set_irq_flags(irq, IRQF_VALID | IRQF_PROBE);

        gic_routable_irq_domain_ops->map(d, irq, hw);
    }
    return 0;
}

void irq_domain_set_info(struct irq_domain *domain, unsigned int virq,
             irq_hw_number_t hwirq, struct irq_chip *chip,
             void *chip_data, irq_flow_handler_t handler,
             void *handler_data, const char *handler_name)
{
    irq_domain_set_hwirq_and_chip(domain, virq, hwirq, chip, chip_data);
    __irq_set_handler(virq, handler, 0, handler_name);
    irq_set_handler_data(virq, handler_data);
}

int irq_domain_set_hwirq_and_chip(struct irq_domain *domain, unsigned int virq,
                  irq_hw_number_t hwirq, struct irq_chip *chip,
                  void *chip_data)
{
    struct irq_data *irq_data = irq_domain_get_irq_data(domain, virq);

    if (!irq_data)
        return -ENOENT;

    irq_data->hwirq = hwirq;
    irq_data->chip = chip ? chip : &no_irq_chip;
    irq_data->chip_data = chip_data;

    return 0;
}

void
__irq_set_handler(unsigned int irq, irq_flow_handler_t handle, int is_chained,
          const char *name)
{
    unsigned long flags;
    struct irq_desc *desc = irq_get_desc_buslock(irq, &flags, 0);
...
    desc->handle_irq = handle;      /* assign irq_desc->handle_irq and, on the next line, name */
    desc->name = name;
...
}

 

drivers/irqchip/irq-gic.c registers gic_of_init as the handler for "arm,cortex-a15-gic"; gic_of_init is the initialization function of the GIC controller.

IRQCHIP_DECLARE(cortex_a15_gic, "arm,cortex-a15-gic", gic_of_init);

static int gic_cnt __initdata;

static int __init
gic_of_init(struct device_node *node, struct device_node *parent)
{
...
    gic_init_bases(gic_cnt, -1, dist_base, cpu_base, percpu_offset, node);
    if (!gic_cnt)
        gic_init_physaddr(node);

    if (parent) {
        irq = irq_of_parse_and_map(node, 0);
        gic_cascade_irq(gic_cnt, irq);
    }

    if (IS_ENABLED(CONFIG_ARM_GIC_V2M))
        gicv2m_of_init(node, gic_data[gic_cnt].domain);

    gic_cnt++;
    return 0;
}

 

 The gic_nr argument of gic_init_bases is the index of the GIC controller. The function mainly calls irq_domain_add_linear() to allocate and register an irq_domain.

 

void __init gic_init_bases(unsigned int gic_nr, int irq_start,
               void __iomem *dist_base, void __iomem *cpu_base,
               u32 percpu_offset, struct device_node *node)
{
    irq_hw_number_t hwirq_base;
    struct gic_chip_data *gic;
    int gic_irqs, irq_base, i;
    int nr_routable_irqs;

    BUG_ON(gic_nr >= MAX_GIC_NR);   /* gic_nr must stay below the system limit MAX_GIC_NR */

    gic = &gic_data[gic_nr];        /* gic_data is a global array of struct gic_chip_data, indexed by GIC controller number */
...
/*
     * Initialize the CPU interface map to all CPUs.
     * It will be refined as each CPU probes its ID.
     */
    for (i = 0; i < NR_GIC_CPU_IF; i++)
        gic_cpu_map[i] = 0xff;

    /*
     * Find out how many interrupts are supported.
     * The GIC only supports up to 1020 interrupt sources.
     */
    gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;    /* compute the maximum number of interrupt sources this GIC supports */
    gic_irqs = (gic_irqs + 1) * 32;
    if (gic_irqs > 1020)            /* cap at 1020, the maximum number of interrupts a GIC supports */
        gic_irqs = 1020;
    gic->gic_irqs = gic_irqs;

    if (node) {        /* DT case */
        const struct irq_domain_ops *ops = &gic_irq_domain_hierarchy_ops;   /* the struct irq_domain_ops for GICv2 */
...
        gic->domain = irq_domain_add_linear(node, gic_irqs, ops, gic);      /* register the irq_domain with gic_irq_domain_hierarchy_ops as its operations */
    } else {        /* Non-DT case */
...
    }

    if (WARN_ON(!gic->domain))
        return;

    if (gic_nr == 0) {
#ifdef CONFIG_SMP
        set_smp_cross_call(gic_raise_softirq);
        register_cpu_notifier(&gic_cpu_notifier);
#endif
        set_handle_irq(gic_handle_irq);     /* irq_handler calls handle_arch_irq; pointing handle_arch_irq at gic_handle_irq ties the platform interrupt entry to the GIC-specific handler */
    }

    gic_chip.flags |= gic_arch_extn.flags;
    gic_dist_init(gic);     /* initialize the GIC Distributor */
    gic_cpu_init(gic);      /* initialize the GIC CPU Interface */
    gic_pm_init(gic);       /* GIC power-management initialization */
}
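The gic_irqs calculation in gic_init_bases() can be tried in isolation. The sketch below repeats the same arithmetic in a standalone function; gic_irqs_from_typer is a hypothetical name, and the argument stands for the raw value read from the register at offset GIC_DIST_CTR.

```c
#include <stdint.h>

/*
 * Hypothetical standalone version of the calculation in gic_init_bases():
 * the low 5 bits of the register value encode (number of interrupts / 32) - 1,
 * and the result is capped at 1020.
 */
static unsigned int gic_irqs_from_typer(uint32_t typer)
{
    unsigned int gic_irqs = typer & 0x1f;

    gic_irqs = (gic_irqs + 1) * 32;
    if (gic_irqs > 1020)
        gic_irqs = 1020;
    return gic_irqs;
}
```

A field value of 4 yields 160 interrupts, while the maximum field value 0x1f would give 1024 and is therefore clamped to 1020.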

 

 irq_domain_add_linear()->__irq_domain_add() allocates and initializes the struct irq_domain.

  

struct irq_domain *__irq_domain_add(struct device_node *of_node, int size,
                    irq_hw_number_t hwirq_max, int direct_max,
                    const struct irq_domain_ops *ops,
                    void *host_data)
{
    struct irq_domain *domain;

    domain = kzalloc_node(sizeof(*domain) + (sizeof(unsigned int) * size),
                  GFP_KERNEL, of_node_to_nid(of_node));     /* the domain is sized as struct irq_domain plus gic_irqs unsigned ints */
    if (WARN_ON(!domain))
        return NULL;

    /* Fill structure */
    INIT_RADIX_TREE(&domain->revmap_tree, GFP_KERNEL);
    domain->ops = ops;
    domain->host_data = host_data;
    domain->of_node = of_node_get(of_node);
    domain->hwirq_max = hwirq_max;
    domain->revmap_size = size;
    domain->revmap_direct_max_irq = direct_max;
    irq_domain_check_hierarchy(domain);

    mutex_lock(&irq_domain_mutex);
    list_add(&domain->link, &irq_domain_list);      /* add the new struct irq_domain to the global list irq_domain_list */
    mutex_unlock(&irq_domain_mutex);

    pr_debug("Added domain %s\n", domain->name);
    return domain;
}
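The allocation in __irq_domain_add() relies on a flexible array member: the linear reverse map is appended directly behind the structure. The following user-space sketch shows the same layout idea using calloc() instead of kzalloc_node(). All toy_* names are hypothetical, and the convention that 0 means "no mapping" is an assumption made to match the way irq_create_of_mapping() treats a zero virq.

```c
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for struct irq_domain. */
struct toy_domain {
    unsigned int revmap_size;       /* number of entries in linear_revmap */
    unsigned int linear_revmap[];   /* hwirq -> virq, 0 means unmapped */
};

/* Allocate the structure and its lookup table in one zeroed block. */
static struct toy_domain *toy_domain_add(unsigned int size)
{
    struct toy_domain *d;

    d = calloc(1, sizeof(*d) + sizeof(unsigned int) * size);
    if (d)
        d->revmap_size = size;
    return d;
}

/* Record that hardware interrupt hwirq maps to Linux interrupt virq. */
static int toy_domain_map(struct toy_domain *d, unsigned long hwirq,
                          unsigned int virq)
{
    if (hwirq >= d->revmap_size)
        return -1;
    d->linear_revmap[hwirq] = virq;
    return 0;
}

/* Look up the Linux interrupt number for hwirq; 0 if there is none. */
static unsigned int toy_find_mapping(const struct toy_domain *d,
                                     unsigned long hwirq)
{
    return hwirq < d->revmap_size ? d->linear_revmap[hwirq] : 0;
}
```

A lookup is a single array index, which is why the linear map suits controllers such as the GIC whose hardware interrupt numbers are dense and bounded.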

  

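2.3 Interrupt number mapping at system initialization

 The previous subsection covered initialization of the GIC interrupt controller. Now let's look at how a hardware interrupt is mapped to an interrupt in the Linux number space.

customize_machine() is called at the arch_initcall stage, which is quite early.

 customize_machine

  ->of_platform_populate

    ->of_platform_bus_create

      ->of_amba_device_create

        ->irq_of_parse_and_map

The details are traced below with the help of the dtsi file arch/arm/boot/dts/vexpress-v2m.dtsi.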

 

/dts-v1/;

/ {
    model = "V2P-CA9";
    arm,hbi = <0x191>;
    arm,vexpress,site = <0xf>;
    compatible = "arm,vexpress,v2p-ca9", "arm,vexpress";
    interrupt-parent = <&gic>;
    #address-cells = <1>;
    #size-cells = <1>;
...
    gic: interrupt-controller@1e001000 {
        compatible = "arm,cortex-a9-gic";
        #interrupt-cells = <3>;
        #address-cells = <0>;
        interrupt-controller;
        reg = <0x1e001000 0x1000>,
              <0x1e000100 0x100>;
    };
...
    smb {
        compatible = "simple-bus";

        #address-cells = <2>;
        #size-cells = <1>;
        ranges = <0 0 0x40000000 0x04000000>,
             <1 0 0x44000000 0x04000000>,
             <2 0 0x48000000 0x04000000>,
             <3 0 0x4c000000 0x04000000>,
             <7 0 0x10000000 0x00020000>;

        #interrupt-cells = <1>;
        interrupt-map-mask = <0 0 63>;
        interrupt-map = <0 0  0 &gic 0  0 4>,
                <0 0  1 &gic 0  1 4>,
...
/include/ "vexpress-v2m.dtsi"
    };
}

The vexpress-v2m.dtsi file:
motherboard {
    model = "V2M-P1";
    arm,hbi = <0x190>;
    arm,vexpress,site = <0>;
    compatible = "arm,vexpress,v2m-p1", "simple-bus";
    #address-cells = <2>; /* SMB chipselect number and offset */
    #size-cells = <1>;
    #interrupt-cells = <1>;
    ranges;
...
    iofpga@7,00000000 {
        compatible = "arm,amba-bus", "simple-bus";
        #address-cells = <1>;
        #size-cells = <1>;
        ranges = <0 7 0 0x20000>;
...
        v2m_serial0: uart@09000 {
            compatible = "arm,pl011", "arm,primecell";
            reg = <0x09000 0x1000>;
            interrupts = <5>;
            clocks = <&v2m_oscclk2>, <&smbclk>;
            clock-names = "uartclk", "apb_pclk";
        };
...
    };
}

 

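The search starts under the root node for "simple-bus"; as the listing above shows, this leads to the smb device.

The smb node includes the vexpress-v2m.dtsi file, and of_platform_bus_create() then walks all devices beneath it.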

const struct of_device_id of_default_bus_match_table[] = {
    { .compatible = "simple-bus", },
#ifdef CONFIG_ARM_AMBA
    { .compatible = "arm,amba-bus", },
#endif /* CONFIG_ARM_AMBA */
    {} /* Empty terminated list */
};


static int __init customize_machine(void)
{
...
        of_platform_populate(NULL, of_default_bus_match_table,      /* find devices matching "simple-bus"; here that is smb */
                    NULL, NULL);
...
}


int of_platform_populate(struct device_node *root,
            const struct of_device_id *matches,
            const struct of_dev_auxdata *lookup,
            struct device *parent)
{
...
    for_each_child_of_node(root, child) {
        rc = of_platform_bus_create(child, matches, lookup, parent, true);  /* root here is the root node, i.e. "/" */
        if (rc)
            break;
    }
...
}

static int of_platform_bus_create(struct device_node *bus,
                  const struct of_device_id *matches,
                  const struct of_dev_auxdata *lookup,
                  struct device *parent, bool strict)
{
    const struct of_dev_auxdata *auxdata;
    struct device_node *child;
    struct platform_device *dev;
    const char *bus_id = NULL;
    void *platform_data = NULL;
    int rc = 0;

    /* Make sure it has a compatible property */
    if (strict && (!of_get_property(bus, "compatible", NULL))) {
        pr_debug("%s() - skipping %s, no compatible prop\n",
             __func__, bus->full_name);
        return 0;
    }

    auxdata = of_dev_lookup(lookup, bus);
    if (auxdata) {
        bus_id = auxdata->name;
        platform_data = auxdata->platform_data;
    }

    if (of_device_is_compatible(bus, "arm,primecell")) {    /* a device matching "arm,primecell" becomes an amba device; uart@09000 is created under iofpga@7,00000000 */
        /*
         * Don't return an error here to keep compatibility with older
         * device tree files.
         */
        of_amba_device_create(bus, bus_id, platform_data, parent);
        return 0;
    }

    dev = of_platform_device_create_pdata(bus, bus_id, platform_data, parent);
    if (!dev || !of_match_node(matches, bus))
        return 0;

    for_each_child_of_node(bus, child) {    /* walk all "simple-bus" devices under smb; nesting can be several levels deep: smb->motherboard->iofpga@7,00000000 */
        pr_debug("   create child: %s\n", child->full_name);
        rc = of_platform_bus_create(child, matches, lookup, &dev->dev, strict);
        if (rc) {
            of_node_put(child);
            break;
        }
    }
    of_node_set_flag(bus, OF_POPULATED_BUS);
    return rc;
}

 

 of_amba_device_create creates an ARM AMBA device; its interrupts are handed to irq_of_parse_and_map().

 

static struct amba_device *of_amba_device_create(struct device_node *node,
                         const char *bus_id,
                         void *platform_data,
                         struct device *parent)
{
...
    /* Decode the IRQs and address ranges */
    for (i = 0; i < AMBA_NR_IRQS; i++)
        dev->irq[i] = irq_of_parse_and_map(node, i);
...
}

 

Taking uart@09000 as an example: inside irq_of_parse_and_map, of_irq_parse_one() parses device properties such as "interrupts" and "reg" and stores the result in a struct of_phandle_args. oirq->args[1] holds the interrupt number 5, and oirq->np holds the struct device_node.

irq_create_of_mapping() establishes the mapping from the hardware interrupt number to the Linux interrupt number.

The main calls made by irq_create_of_mapping are shown below; most of the work is delegated to __irq_domain_alloc_irqs().

irq_create_of_mapping

  ->domain->ops->xlate

  ->irq_find_mapping

  ->irq_domain_alloc_irqs

    ->__irq_domain_alloc_irqs

      ->irq_domain_alloc_descs

      ->irq_domain_alloc_irq_data

      ->irq_domain_alloc_irqs_recursive

        ->gic_irq_domain_alloc

          ->gic_irq_domain_map (maps the hardware interrupt number to the software interrupt number)

            ->irq_domain_set_info (stores the key parameters in the interrupt descriptor)

      ->irq_domain_insert_irq

 

unsigned int irq_of_parse_and_map(struct device_node *dev, int index)
{
    struct of_phandle_args oirq;

    if (of_irq_parse_one(dev, index, &oirq))
        return 0;

    return irq_create_of_mapping(&oirq);
}

unsigned int irq_create_of_mapping(struct of_phandle_args *irq_data)
{
    struct irq_domain *domain;
    irq_hw_number_t hwirq;
    unsigned int type = IRQ_TYPE_NONE;
    int virq;

    domain = irq_data->np ? irq_find_host(irq_data->np) : irq_default_domain;   /* find the struct irq_domain the device belongs to */
...
    /* If domain has no translation, then we assume interrupt line */
    if (domain->ops->xlate == NULL)
        hwirq = irq_data->args[0];
    else {
        if (domain->ops->xlate(domain, irq_data->np, irq_data->args,    /* calls gic_irq_domain_xlate() to translate the interrupt specifier into a hardware interrupt number and trigger type */
                    irq_data->args_count, &hwirq, &type))
            return 0;
    }

    if (irq_domain_is_hierarchy(domain)) {  /* the domain supports hierarchical stacking */
        /*
         * If we've already configured this interrupt,
         * don't do it again, or hell will break loose.
         */
        virq = irq_find_mapping(domain, hwirq);     /* look up the Linux interrupt number in the existing linear_revmap */
        if (virq)
            return virq;

        virq = irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, irq_data);    /* not found: allocate a new mapping; the argument 1 means one interrupt is allocated per call */
        if (virq <= 0)
            return 0;
    } else {
...
    }

    /* Set type if specified and different than the current one */
    if (type != IRQ_TYPE_NONE &&
        type != irq_get_trigger_type(virq))
        irq_set_irq_type(virq, type);       /* set the interrupt trigger type */
    return virq;
}

 

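struct irq_desc defines the interrupt descriptor. The irq_desc[] array holds NR_IRQS descriptors; the array index is the IRQ number, so the descriptor can be found directly from the IRQ number.

struct irq_desc embeds a struct irq_data, whose irq and hwirq members hold the software interrupt number and the hardware interrupt number respectively. These two members tie the hardware interrupt number to the software interrupt number.

struct irq_chip defines the set of low-level operations of an interrupt controller.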

struct irq_desc {
    struct irq_data        irq_data;
    unsigned int __percpu    *kstat_irqs;
    irq_flow_handler_t    handle_irq;       /* flow handler chosen by interrupt number: 0~31 use handle_percpu_devid_irq, 32 and above use handle_fasteoi_irq */
#ifdef CONFIG_IRQ_PREFLOW_FASTEOI
    irq_preflow_handler_t    preflow_handler;
#endif
    struct irqaction    *action;    /* IRQ action list */
    unsigned int        status_use_accessors;
    unsigned int        core_internal_state__do_not_mess_with_it;
    unsigned int        depth;        /* nested irq disables */
    unsigned int        wake_depth;    /* nested wake enables */
    unsigned int        irq_count;    /* For detecting broken IRQs */
    unsigned long        last_unhandled;    /* Aging timer for unhandled count */
    unsigned int        irqs_unhandled;
    atomic_t        threads_handled;
    int            threads_handled_last;
    raw_spinlock_t        lock;
    struct cpumask        *percpu_enabled;
#ifdef CONFIG_SMP
    const struct cpumask    *affinity_hint;
    struct irq_affinity_notify *affinity_notify;
#ifdef CONFIG_GENERIC_PENDING_IRQ
    cpumask_var_t        pending_mask;
#endif
#endif
    unsigned long        threads_oneshot;   /* bitmap; each bit stands for an interrupt thread of a shared oneshot interrupt that is being handled */
    atomic_t        threads_active;         /* number of interrupt threads currently running */
    wait_queue_head_t       wait_for_threads;
#ifdef CONFIG_PM_SLEEP
    unsigned int        nr_actions;
    unsigned int        no_suspend_depth;
    unsigned int        cond_suspend_depth;
    unsigned int        force_resume_depth;
#endif
#ifdef CONFIG_PROC_FS
    struct proc_dir_entry    *dir;
#endif
    int            parent_irq;
    struct module        *owner;
    const char        *name;
}

struct irq_data {
    u32            mask;
    unsigned int        irq;        /* Linux software interrupt number */
    unsigned long        hwirq;     /* hardware interrupt number */
    unsigned int        node;
    unsigned int        state_use_accessors;
    struct irq_chip        *chip;
    struct irq_domain    *domain;
#ifdef    CONFIG_IRQ_DOMAIN_HIERARCHY
    struct irq_data        *parent_data;
#endif
    void            *handler_data;
    void            *chip_data;
    struct msi_desc        *msi_desc;
    cpumask_var_t        affinity;
}

struct irq_chip {
    const char    *name;
    unsigned int    (*irq_startup)(struct irq_data *data);      /* initialize the interrupt */
    void        (*irq_shutdown)(struct irq_data *data);         /* shut the interrupt down */
    void        (*irq_enable)(struct irq_data *data);           /* enable the interrupt */
    void        (*irq_disable)(struct irq_data *data);          /* disable the interrupt */

    void        (*irq_ack)(struct irq_data *data);              /* acknowledge the interrupt */
    void        (*irq_mask)(struct irq_data *data);             /* mask the interrupt */
    void        (*irq_mask_ack)(struct irq_data *data);         /* acknowledge and mask the interrupt */
    void        (*irq_unmask)(struct irq_data *data);           /* unmask the interrupt */
    void        (*irq_eoi)(struct irq_data *data);              /* send EOI, signalling that hardware interrupt handling is complete */

    int        (*irq_set_affinity)(struct irq_data *data, const struct cpumask *dest, bool force);  /* bind the interrupt to a CPU */
    int        (*irq_retrigger)(struct irq_data *data);         /* resend the interrupt to the CPU */
    int        (*irq_set_type)(struct irq_data *data, unsigned int flow_type);      /* set the trigger type */
    int        (*irq_set_wake)(struct irq_data *data, unsigned int on);             /* enable or disable the interrupt as a power-management wakeup source */

    void        (*irq_bus_lock)(struct irq_data *data);
    void        (*irq_bus_sync_unlock)(struct irq_data *data);

    void        (*irq_cpu_online)(struct irq_data *data);
    void        (*irq_cpu_offline)(struct irq_data *data);

    void        (*irq_suspend)(struct irq_data *data);
    void        (*irq_resume)(struct irq_data *data);
    void        (*irq_pm_shutdown)(struct irq_data *data);
...
    unsigned long    flags;
}

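 gic_chip is the set of hardware operations for this particular interrupt controller. For GICv2 it provides mask/unmask, EOI, setting the trigger type, and getting or setting the current irqchip state.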

static const struct irq_chip gic_chip = {
    .irq_mask        = gic_mask_irq,
    .irq_unmask        = gic_unmask_irq,
    .irq_eoi        = gic_eoi_irq,
    .irq_set_type        = gic_set_type,
    .irq_get_irqchip_state    = gic_irq_get_irqchip_state,
    .irq_set_irqchip_state    = gic_irq_set_irqchip_state,
    .flags            = IRQCHIP_SET_TYPE_MASKED |
                  IRQCHIP_SKIP_SET_WAKE |
                  IRQCHIP_MASK_ON_SUSPEND,
};

static void gic_mask_irq(struct irq_data *d)
{
    gic_poke_irq(d, GIC_DIST_ENABLE_CLEAR);
}

static void gic_unmask_irq(struct irq_data *d)
{
    gic_poke_irq(d, GIC_DIST_ENABLE_SET);
}

static void gic_eoi_irq(struct irq_data *d)
{
    writel_relaxed(gic_irq(d), gic_cpu_base(d) + GIC_CPU_EOI);
}

static int gic_set_type(struct irq_data *d, unsigned int type)
{
    void __iomem *base = gic_dist_base(d);
    unsigned int gicirq = gic_irq(d);

    /* Interrupt configuration for SGIs can't be changed */
    if (gicirq < 16)
        return -EINVAL;

    /* SPIs have restrictions on the supported types */
    if (gicirq >= 32 && type != IRQ_TYPE_LEVEL_HIGH &&
                type != IRQ_TYPE_EDGE_RISING)
        return -EINVAL;

    return gic_configure_irq(gicirq, type, base, NULL);
}

static int gic_irq_set_irqchip_state(struct irq_data *d,
                     enum irqchip_irq_state which, bool val)
{
    u32 reg;

    switch (which) {
    case IRQCHIP_STATE_PENDING:
        reg = val ? GIC_DIST_PENDING_SET : GIC_DIST_PENDING_CLEAR;
        break;

    case IRQCHIP_STATE_ACTIVE:
        reg = val ? GIC_DIST_ACTIVE_SET : GIC_DIST_ACTIVE_CLEAR;
        break;

    case IRQCHIP_STATE_MASKED:
        reg = val ? GIC_DIST_ENABLE_CLEAR : GIC_DIST_ENABLE_SET;
        break;

    default:
        return -EINVAL;
    }

    gic_poke_irq(d, reg);
    return 0;
}

static int gic_irq_get_irqchip_state(struct irq_data *d,
                      enum irqchip_irq_state which, bool *val)
{
    switch (which) {
    case IRQCHIP_STATE_PENDING:
        *val = gic_peek_irq(d, GIC_DIST_PENDING_SET);
        break;

    case IRQCHIP_STATE_ACTIVE:
        *val = gic_peek_irq(d, GIC_DIST_ACTIVE_SET);
        break;

    case IRQCHIP_STATE_MASKED:
        *val = !gic_peek_irq(d, GIC_DIST_ENABLE_SET);
        break;

    default:
        return -EINVAL;
    }

    return 0;
}

 

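irq_domain_alloc_irqs() calls __irq_domain_alloc_irqs() to set up the struct irq_desc, the struct irq_data and the interrupt mapping.

The nr_irqs argument is normally 1, so a single interrupt is handled per call.

irq_domain_alloc_descs()->irq_alloc_descs()->__irq_alloc_descs() allocates the struct irq_desc; the value returned is the Linux interrupt number.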

int __irq_domain_alloc_irqs(struct irq_domain *domain, int irq_base,
                unsigned int nr_irqs, int node, void *arg,
                bool realloc)
{
...
    if (realloc && irq_base >= 0) {
        virq = irq_base;
    } else {
        virq = irq_domain_alloc_descs(irq_base, nr_irqs, 0, node);  /* find the first run of nr_irqs free bits in the allocated_irqs bitmap; ends up in __irq_alloc_descs */
        if (virq < 0) {
            pr_debug("cannot allocate IRQ(base %d, count %d)\n",
                 irq_base, nr_irqs);
            return virq;
        }
    }

    if (irq_domain_alloc_irq_data(domain, virq, nr_irqs)) {     /* allocate the struct irq_data */
        pr_debug("cannot allocate memory for IRQ%d\n", virq);
        ret = -ENOMEM;
        goto out_free_desc;
    }

    mutex_lock(&irq_domain_mutex);
    ret = irq_domain_alloc_irqs_recursive(domain, virq, nr_irqs, arg);  /* call the alloc callback of struct irq_domain to map hardware to software interrupt numbers */
    if (ret < 0) {
        mutex_unlock(&irq_domain_mutex);
        goto out_free_irq_data;
    }
    for (i = 0; i < nr_irqs; i++)
        irq_domain_insert_irq(virq + i);
    mutex_unlock(&irq_domain_mutex);

    return virq;
...
}

int __ref
__irq_alloc_descs(int irq, unsigned int from, unsigned int cnt, int node,
          struct module *owner)
{
...
    mutex_lock(&sparse_irq_lock);

    start = bitmap_find_next_zero_area(allocated_irqs, IRQ_BITMAP_BITS,
                       from, cnt, 0);       /* find the first area of cnt consecutive zero bits in the allocated_irqs bitmap */
...
    bitmap_set(allocated_irqs, start, cnt);     /* set these bits to mark them as in use */
    mutex_unlock(&sparse_irq_lock);
    /*
     * With CONFIG_SPARSE_IRQ defined, a struct irq_desc is allocated
     * dynamically and stored in a radix tree; otherwise the descriptor
     * is simply found at an offset into the global irq_desc array.
     */
    return alloc_descs(start, cnt, node, owner);

err:
    mutex_unlock(&sparse_irq_lock);
    return ret;
}
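The bitmap search in __irq_alloc_descs() can be illustrated with a small user-space model. This is a simplified sketch and not the kernel's bitmap_find_next_zero_area(): the 64-entry toy_allocated bitmap and all toy_* names are hypothetical, and locking is omitted.

```c
#include <limits.h>

#define TOY_NBITS 64    /* hypothetical size; the kernel uses IRQ_BITMAP_BITS */

static unsigned char toy_allocated[TOY_NBITS / CHAR_BIT];

static int toy_test_bit(unsigned int n)
{
    return (toy_allocated[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1;
}

static void toy_set_bit(unsigned int n)
{
    toy_allocated[n / CHAR_BIT] |= (unsigned char)(1u << (n % CHAR_BIT));
}

/*
 * Find the first run of cnt free bits at or after from, mark the run as
 * used and return its start index; return -1 if no such run exists.
 */
static int toy_alloc_irqs(unsigned int from, unsigned int cnt)
{
    unsigned int start, i;

    if (cnt == 0 || cnt > TOY_NBITS || from >= TOY_NBITS)
        return -1;

    for (start = from; start + cnt <= TOY_NBITS; start++) {
        for (i = 0; i < cnt; i++)
            if (toy_test_bit(start + i))
                break;
        if (i == cnt) {                     /* whole run is free */
            for (i = 0; i < cnt; i++)
                toy_set_bit(start + i);
            return (int)start;
        }
    }
    return -1;
}
```

The returned start index plays the role of the Linux interrupt number: each successful call hands out numbers that no earlier call has claimed.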

 

 irq_domain_alloc_irqs_recursive() decides, depending on the domain, whether to recurse into the parent interrupt controller:

static int irq_domain_alloc_irqs_recursive(struct irq_domain *domain,
                       unsigned int irq_base,
                       unsigned int nr_irqs, void *arg)
{
    int ret = 0;
    struct irq_domain *parent = domain->parent;
    bool recursive = irq_domain_is_auto_recursive(domain);

    BUG_ON(recursive && !parent);
    if (recursive)
        ret = irq_domain_alloc_irqs_recursive(parent, irq_base,
                              nr_irqs, arg);
    if (ret >= 0)
        ret = domain->ops->alloc(domain, irq_base, nr_irqs, arg);
    if (ret < 0 && recursive)
        irq_domain_free_irqs_recursive(parent, irq_base, nr_irqs);

    return ret;
}

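 This completes the parsing of the interrupt information in the device tree, the initialization of the data structures involved, and, most importantly, the mapping from hardware interrupt numbers to Linux interrupt numbers.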

 

 

 

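3. ARM low-level interrupt handling

 ARM low-level interrupt handling covers the path from the interrupt exception being taken up to irq_handler.

3.1 Hardware behavior on an interrupt

When a peripheral has an event to report to the SoC, it sends an interrupt signal over the interrupt pin that connects it to the SoC. The signal may be edge triggered or level triggered.

The interrupt controller senses the signal. Its Distributor selects the highest-priority interrupt and sends it to the CPU Interface, which then decides whether the interrupt is signalled to its CPU core.

The GIC notifies the CPU core through an nIRQ (IRQ request input line) signal.

Once the CPU core sees that an interrupt has occurred, the hardware does the following:

  • Saves the contents of the CPSR register at the time of the interrupt into the SPSR_irq register.
  • Modifies CPSR so that the CPU enters IRQ mode, one of the processor modes, by setting the M field of CPSR to IRQ mode.
  • Automatically masks IRQ or FIQ, that is, sets the I bit or F bit in CPSR to 1. The hardware disables interrupts on its own.
  • Saves the return address in the LR_irq register.
  • Automatically jumps to the IRQ vector in the exception vector table. From here on, software takes over.

When returning from the interrupt, software must do the following:

  • Restore CPSR from the SPSR_irq register.
  • Restore PC from LR_irq, so that execution resumes at the instruction following the point of interruption.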

 

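3.2 Exception vectors for interrupts

3.2.1 Initializing the exception vector sections

 At build time the exception vector table is placed in the __init section of the kernel image; see arch/arm/kernel/vmlinux.lds.S.

__vectors_start and __vectors_end mark the start and end of the .vectors section; __stubs_start and __stubs_end delimit the exception stubs code in the .stubs section. Both are page-aligned and each occupies one page.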

    __vectors_start = .;
    .vectors 0 : AT(__vectors_start) {
        *(.vectors)----------------------------------Holds the .vectors section data
    }
    . = __vectors_start + SIZEOF(.vectors);
    __vectors_end = .;

    __stubs_start = .;
    .stubs 0x1000 : AT(__stubs_start) {
        *(.stubs)------------------------------------Holds the .stubs section data
    }
    . = __stubs_start + SIZEOF(.stubs);
    __stubs_end = .;

 

During initialization the two sections are copied to the high-vectors address, 0xffff_0000: start_kernel->setup_arch->paging_init->devicemaps_init.

static void __init devicemaps_init(const struct machine_desc *mdesc)
{
    struct map_desc map;
    unsigned long addr;
    void *vectors;

    /*
     * Allocate the vector page early.
     */
    vectors = early_alloc(PAGE_SIZE * 2);-------------------------------Allocate two pages to be mapped at the high-vectors address.

    early_trap_init(vectors);-------------------------------------------Copies the exception vector table into place.
...
    /*
     * Create a mapping for the machine vectors at the high-vectors
     * location (0xffff0000).  If we aren't using high-vectors, also
     * create a mapping at the low-vectors virtual address.
     */
    map.pfn = __phys_to_pfn(virt_to_phys(vectors));---------------------Physical page frame number of vectors
    map.virtual = 0xffff0000;-------------------------------------------Virtual address range to map: 0xffff_0000~0xffff_0fff
    map.length = PAGE_SIZE;---------------------------------------------Size of the mapping
#ifdef CONFIG_KUSER_HELPERS
    map.type = MT_HIGH_VECTORS;-----------------------------------------Map as high vectors
#else
    map.type = MT_LOW_VECTORS;
#endif
    create_mapping(&map);

    if (!vectors_high()) {
        map.virtual = 0;
        map.length = PAGE_SIZE * 2;
        map.type = MT_LOW_VECTORS;
        create_mapping(&map);
    }

    /* Now create a kernel read-only mapping */
    map.pfn += 1;
    map.virtual = 0xffff0000 + PAGE_SIZE;------------------------------Maps 0xffff_1000~0xffff_1fff
    map.length = PAGE_SIZE;
    map.type = MT_LOW_VECTORS;
    create_mapping(&map);
...
}

 

early_trap_init copies the contents starting at __vectors_start and __stubs_start into the two allocated pages, one page each.

void __init early_trap_init(void *vectors_base)
{
...
    unsigned long vectors = (unsigned long)vectors_base;
    extern char __stubs_start[], __stubs_end[];
    extern char __vectors_start[], __vectors_end[];
    unsigned i;

    vectors_page = vectors_base;

    /*
     * Poison the vectors page with an undefined instruction.  This
     * instruction is chosen to be undefined for both ARM and Thumb
     * ISAs.  The Thumb version is an undefined instruction with a
     * branch back to the undefined instruction.
     */
    for (i = 0; i < PAGE_SIZE / sizeof(u32); i++)
        ((u32 *)vectors_base)[i] = 0xe7fddef1;---------------------------The first page is filled entirely with the undefined instruction 0xe7fddef1.

    /*
     * Copy the vectors, stubs and kuser helpers (in entry-armv.S)
     * into the vector page, mapped at 0xffff0000, and ensure these
     * are visible to the instruction stream.
     */
    memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start);
    memcpy((void *)vectors + 0x1000, __stubs_start, __stubs_end - __stubs_start);
...
}

 

 

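3.2.2 The interrupt exception vector

When an interrupt occurs, the CPU branches through the vector table to vector_irq. At its end, vector_irq jumps to __irq_usr or __irq_svc depending on the mode the CPU was in when the interrupt occurred.

vector_irq is defined in arch/arm/kernel/entry-armv.S by the vector_stub macro.

 

Why is correction == 4, i.e. why must 4 bytes be subtracted to obtain the return address?

For IRQ, the correction argument of the vector_stub macro is 4.

Suppose an interrupt arrives while instruction A is executing. Because of the ARM pipeline and instruction prefetch, pc points to A+8. The interrupt can only be taken after A completes, by which time pc has advanced to A+12.

Just before the exception is entered, lr is loaded from pc: lr = pc - 4, i.e. A+8.

Returning therefore requires pc = lr - 4 = A+4, the instruction that would have executed next. That is why lr is moved back by 4 bytes.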
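
The arithmetic can be checked with a few lines of C. This is only a worked example of the numbers in the paragraph above, assuming ARM state (4-byte instructions); the function names are made up for illustration.

```c
#include <stdint.h>

/* LR_irq as set by hardware when the IRQ is taken after instruction a completes. */
static uint32_t demo_lr_irq(uint32_t a)
{
    uint32_t pc_at_entry = a + 12;   /* pc once instruction a has finished */

    return pc_at_entry - 4;          /* lr = pc - 4, i.e. a + 8 */
}

/* Effect of "sub lr, lr, #4" in vector_stub: the address to resume at. */
static uint32_t demo_return_address(uint32_t lr_irq)
{
    return lr_irq - 4;
}
```

For an interrupted instruction at 0x8000 this gives LR_irq = 0x8008 and a resume address of 0x8004.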

 

    .section .vectors, "ax", %progbits
__vectors_start:
    W(b)    vector_rst
    W(b)    vector_und
    W(ldr)    pc, __vectors_start + 0x1000
    W(b)    vector_pabt
    W(b)    vector_dabt
    W(b)    vector_addrexcptn
    W(b)    vector_irq---------------------------------------------------------------Branch to vector_irq
    W(b)    vector_fiq

/*
 * Interrupt dispatcher
 */
    vector_stub    irq, IRQ_MODE, 4------------------------------------------------The vector_stub macro defines vector_irq

    .long    __irq_usr            @  0  (USR_26 / USR_32)
    .long    __irq_invalid            @  1  (FIQ_26 / FIQ_32)
    .long    __irq_invalid            @  2  (IRQ_26 / IRQ_32)
    .long    __irq_svc            @  3  (SVC_26 / SVC_32)----------------------------SVC mode is 0b10011; ANDed with 0xf it gives 3.
    .long    __irq_invalid            @  4
    .long    __irq_invalid            @  5
    .long    __irq_invalid            @  6
    .long    __irq_invalid            @  7
    .long    __irq_invalid            @  8
    .long    __irq_invalid            @  9
    .long    __irq_invalid            @  a
    .long    __irq_invalid            @  b
    .long    __irq_invalid            @  c
    .long    __irq_invalid            @  d
    .long    __irq_invalid            @  e
    .long    __irq_invalid            @  f



    .macro vector_stub, name, mode, correction=0------------------------------------Definition of the vector_stub macro

    .align 5
vector_\name:
    .if \correction
    sub    lr, lr, #\correction-------------------------------------------------------This is where the correction of 4 discussed above is applied
    .endif

    @
    @ Save r0, lr_<exception> (parent PC) and spsr_<exception>
    @ (parent CPSR)
    @
    stmia    sp, {r0, lr}        @ save r0, lr
    mrs    lr, spsr
    str    lr, [sp, #8]        @ save spsr

    @
    @ Prepare for SVC32 mode.  IRQs remain disabled.
    @
    mrs    r0, cpsr
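    eor    r0, r0, #(\mode ^ SVC_MODE | PSR_ISETSTATE)---------------------------------Compute a CPSR value whose mode field is SVC (written to SPSR below and applied by the final movs), so that interrupt handling runs in SVC mode.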
    msr    spsr_cxsf, r0

    @
    @ the branch table must immediately follow this code
    @
    and    lr, lr, #0x0f--------------------------------------------------------------The low 4 bits of the saved SPSR give the CPU mode before the interrupt: 0 for USR, 3 for SVC.
 THUMB(    adr    r0, 1f            )
 THUMB(    ldr    lr, [r0, lr, lsl #2]    )-------------------------------------------Load lr from the table according to the interrupted mode: the address of __irq_usr or __irq_svc.
    mov    r0, sp
 ARM(    ldr    lr, [pc, lr, lsl #2]    )---------------------------------------------lr is loaded from the branch table, e.g. the ".long __irq_svc" entry.
    movs    pc, lr            @ branch to handler in SVC mode-------------------------Copy lr to pc, branching to __irq_usr or __irq_svc.
ENDPROC(vector_\name)
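
The table lookup and the mode switch in vector_stub can be modeled in C as below. This is a sketch under assumptions: the DEMO_ constants mirror the ARM mode encodings (USR 0x10, IRQ 0x12, SVC 0x13), PSR_ISETSTATE is taken as 0 (ARM-state kernel), and the enum stands in for the handler addresses in the branch table.

```c
#include <stdint.h>

#define DEMO_USR_MODE 0x10u
#define DEMO_IRQ_MODE 0x12u
#define DEMO_SVC_MODE 0x13u

enum demo_target { DEMO_IRQ_INVALID = 0, DEMO_IRQ_USR, DEMO_IRQ_SVC };

/* "and lr, lr, #0x0f" followed by the table load: index by the low 4 mode bits. */
static enum demo_target demo_irq_dispatch(uint32_t spsr)
{
    static const enum demo_target table[16] = {
        [0] = DEMO_IRQ_USR,   /* USR mode: 0b10000 & 0xf == 0 */
        [3] = DEMO_IRQ_SVC,   /* SVC mode: 0b10011 & 0xf == 3 */
        /* every other slot stays DEMO_IRQ_INVALID */
    };

    return table[spsr & 0x0f];
}

/* "eor r0, r0, #(mode ^ SVC_MODE)": turn the IRQ-mode bits into SVC-mode bits. */
static uint32_t demo_switch_irq_to_svc(uint32_t cpsr)
{
    return cpsr ^ (DEMO_IRQ_MODE ^ DEMO_SVC_MODE);
}
```

Because the XOR only toggles the bits that differ between the two mode encodings, every other CPSR bit, including the I bit, is preserved; this matches the "IRQs remain disabled" comment in the macro.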

 

 

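3.3 Handling interrupts taken in kernel space: __irq_svc

 __irq_svc handles interrupts taken in kernel space: svc_entry saves the interrupted context; irq_handler runs the interrupt handling; if preemption is enabled, it checks whether the current task can be preempted; finally svc_exit performs the exception return.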

__irq_svc:
    svc_entry
    irq_handler

#ifdef CONFIG_PREEMPT-----------------------------------------------------Where preemption can happen once interrupt handling has finished
    get_thread_info tsk
    ldr    r8, [tsk, #TI_PREEMPT]        @ get preempt count--------------Read thread_info->preempt_count; 0 means the task may be preempted, greater than 0 means it may not.
    ldr    r0, [tsk, #TI_FLAGS]        @ get flags------------------------Read thread_info->flags
    teq    r8, #0                @ if preempt count != 0
    movne    r0, #0                @ force flags to 0
    tst    r0, #_TIF_NEED_RESCHED-----------------------------------------Check whether _TIF_NEED_RESCHED is set
    blne    svc_preempt
#endif

    svc_exit r5, irq = 1            @ return from exception
 UNWIND(.fnend        )
ENDPROC(__irq_svc)

 

svc_entry saves the interrupted context on the kernel stack, mainly the registers that make up struct pt_regs.

    .macro    svc_entry, stack_hole=0, trace=1
 UNWIND(.fnstart        )
 UNWIND(.save {r0 - pc}        )
    sub    sp, sp, #(S_FRAME_SIZE + \stack_hole - 4)
#ifdef CONFIG_THUMB2_KERNEL
 SPFIX(    str    r0, [sp]    )    @ temporarily saved
 SPFIX(    mov    r0, sp        )
 SPFIX(    tst    r0, #4        )    @ test original stack alignment
 SPFIX(    ldr    r0, [sp]    )    @ restored
#else
 SPFIX(    tst    sp, #4        )
#endif
 SPFIX(    subeq    sp, sp, #4    )
    stmia    sp, {r1 - r12}

    ldmia    r0, {r3 - r5}
    add    r7, sp, #S_SP - 4    @ here for interlock avoidance
    mov    r6, #-1            @  ""  ""      ""       ""
    add    r2, sp, #(S_FRAME_SIZE + \stack_hole - 4)
 SPFIX(    addeq    r2, r2, #4    )
    str    r3, [sp, #-4]!        @ save the "real" r0 copied
                    @ from the exception stack

    mov    r3, lr

    @
    @ We are now ready to fill in the remaining blanks on the stack:
    @
    @  r2 - sp_svc
    @  r3 - lr_svc
    @  r4 - lr_<exception>, already fixed up for correct return/restart
    @  r5 - spsr_<exception>
    @  r6 - orig_r0 (see pt_regs definition in ptrace.h)
    @
    stmia    r7, {r2 - r6}

    .if \trace
#ifdef CONFIG_TRACE_IRQFLAGS
    bl    trace_hardirqs_off
#endif
    .endif
    .endm

 

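svc_exit prepares the return to the interrupted context, then the ldmia instruction restores all 16 registers (r0 through pc) from the stack; at that point the interrupt is complete and execution has returned.

    .macro    svc_exit, rpsr, irq = 0
...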
    msr    spsr_cxsf, \rpsr
    ldmia    sp, {r0 - pc}^            @ load r0 - pc, cpsr
    .endm

 

irq_handler enters high-level interrupt handling.

 

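4. High-level interrupt handling

The irq_handler assembly macro is the boundary between architecture-level and high-level interrupt handling; here control passes from assembly to C for the GIC-specific processing.

The previous sections showed how a hardware interrupt number is mapped to a Linux interrupt number. So when an interrupt fires, what path leads from the hardware, to software identifying the hardware interrupt number, to its translation into a Linux interrupt number?

The flow is analyzed starting from irq_handler: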

irq_handler()

  ->handle_arch_irq()->gic_handle_irq()----------------------------Reads the IAR register to acknowledge the interrupt and obtain the hardware interrupt number

    ->handle_domain_irq()->__handle_domain_irq()

      ->irq_find_mapping()------------------------------------------------Translates the hardware interrupt number into a Linux interrupt number

      ->generic_handle_irq()---------------------------------------------Everything from here on uses the Linux interrupt number

        ->handle_percpu_devid_irq()-----------------------------------Handles SGI/PPI interrupts

        ->handle_fasteoi_irq()--------------------------------------------Handles SPI interrupts

          ->handle_irq_event()->handle_irq_event_percpu()------Core interrupt handling

            ->action->handler-----------------------------------------------Runs the primary handler

            ->__irq_wake_thread()----------------------------------------Wakes the interrupt kernel thread if needed

 

4.1 irq_handler

 The irq_handler macro calls the function pointed to by handle_arch_irq, which is registered through set_handle_irq; for GICv2 it is gic_handle_irq.

    .macro    irq_handler
#ifdef CONFIG_MULTI_IRQ_HANDLER
    ldr    r1, =handle_arch_irq
    mov    r0, sp
    adr    lr, BSYM(9997f)
    ldr    pc, [r1]
#else
    arch_irq_handler_default
#endif
9997:
    .endm

  

4.2 gic_handle_irq

 

gic_init_bases sets handle_arch_irq to gic_handle_irq.

void __init gic_init_bases(unsigned int gic_nr, int irq_start,
               void __iomem *dist_base, void __iomem *cpu_base,
               u32 percpu_offset, struct device_node *node)
{
...
    if (gic_nr == 0) {
...
        set_handle_irq(gic_handle_irq);
    }
...
}

void __init set_handle_irq(void (*handle_irq)(struct pt_regs *))
{
    if (handle_arch_irq)
        return;

    handle_arch_irq = handle_irq;
}

 gic_handle_irq splits interrupts into two groups: SGI, and PPI/SPI.

SGIs are handed to handle_IPI(); PPIs and SPIs are handed to handle_domain_irq.

static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
{
    u32 irqstat, irqnr;
    struct gic_chip_data *gic = &gic_data[0];
    void __iomem *cpu_base = gic_data_cpu_base(gic);

    do {
        irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);---Read the IAR register, which acknowledges the interrupt.
        irqnr = irqstat & GICC_IAR_INT_ID_MASK;-----------------GICC_IAR_INT_ID_MASK is 0x3ff, the low 10 bits, so interrupt IDs range from 0 to 1023.

        if (likely(irqnr > 15 && irqnr < 1021)) {
            handle_domain_irq(gic->domain, irqnr, regs);
            continue;
        }
        if (irqnr < 16) {---------------------------------------SGIs are used for inter-processor communication and only matter when CONFIG_SMP is defined.
            writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);----Write the EOI register directly to signal end of interrupt.
#ifdef CONFIG_SMP
            handle_IPI(irqnr, regs);----------------------------irqnr identifies the SGI (IPI) type
#endif
            continue;
        }
        break;
    } while (1);
}
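
The decoding done inside the loop can be isolated into a small, testable function. This is a sketch rather than driver code: the enum and function names are hypothetical, while the 0x3ff mask and the 16/1021 boundaries are taken from the listing above.

```c
#include <stdint.h>

#define DEMO_IAR_INT_ID_MASK 0x3ffu   /* low 10 bits of GICC_IAR */

enum demo_irq_class {
    DEMO_CLASS_SGI,        /* IDs 0..15: handled as IPIs */
    DEMO_CLASS_DOMAIN,     /* IDs 16..1020: PPI/SPI, go through the irq domain */
    DEMO_CLASS_SPURIOUS    /* anything else, e.g. 1023: leave the loop */
};

/* Split a raw IAR value into its interrupt ID and the path gic_handle_irq takes. */
static enum demo_irq_class demo_classify_iar(uint32_t irqstat, uint32_t *irqnr)
{
    *irqnr = irqstat & DEMO_IAR_INT_ID_MASK;

    if (*irqnr > 15 && *irqnr < 1021)
        return DEMO_CLASS_DOMAIN;
    if (*irqnr < 16)
        return DEMO_CLASS_SGI;
    return DEMO_CLASS_SPURIOUS;
}
```

For an SGI the bits above the ID carry the source CPU, which is why the listing writes the unmasked irqstat, not irqnr, back to the EOI register.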

 

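handle_domain_irq calls __handle_domain_irq with lookup set to true.

irq_enter explicitly tells the Linux kernel that interrupt context is being entered; once the interrupt has been handled, irq_exit tells the kernel that interrupt handling is complete.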

int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,
            bool lookup, struct pt_regs *regs)
{
    struct pt_regs *old_regs = set_irq_regs(regs);
    unsigned int irq = hwirq;
    int ret = 0;

    irq_enter();-----------------------------------------------Enter interrupt context by explicitly incrementing the hardirq count

#ifdef CONFIG_IRQ_DOMAIN
    if (lookup)
        irq = irq_find_mapping(domain, hwirq);-----------------Look up the Linux interrupt number for the hardware interrupt number
#endif

    /*
     * Some hardware gives randomly wrong interrupts.  Rather
     * than crashing, do something sensible.
     */
    if (unlikely(!irq || irq >= nr_irqs)) {
        ack_bad_irq(irq);
        ret = -EINVAL;
    } else {
        generic_handle_irq(irq);--------------------------------Start handling this specific interrupt; irq is now a Linux interrupt number.
    }

    irq_exit();-------------------------------------------------Leave interrupt context
    set_irq_regs(old_regs);
    return ret;
}

 

irq_find_mapping looks up, in struct irq_domain, the Linux irq that corresponds to hwirq.

unsigned int irq_find_mapping(struct irq_domain *domain,
                  irq_hw_number_t hwirq)
{
    struct irq_data *data;
...
    /* Check if the hwirq is in the linear revmap. */
    if (hwirq < domain->revmap_size)
        return domain->linear_revmap[hwirq];----------------linear_revmap[] is filled in by __irq_domain_alloc_irqs()->irq_domain_insert_irq().
...
}
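
The linear reverse map is just an array indexed by hardware interrupt number. A minimal user-space sketch, assuming a fixed-size table and hypothetical demo_ names, and leaving out the non-linear fallback that the elided kernel code handles:

```c
#define DEMO_REVMAP_SIZE 64   /* hypothetical table size */

struct demo_revmap_domain {
    unsigned int revmap_size;
    unsigned int linear_revmap[DEMO_REVMAP_SIZE];   /* hwirq -> Linux irq, 0 = unmapped */
};

/* Record a mapping, as done when the interrupt is allocated. */
static void demo_insert_mapping(struct demo_revmap_domain *d,
                                unsigned int hwirq, unsigned int virq)
{
    if (hwirq < d->revmap_size)
        d->linear_revmap[hwirq] = virq;
}

/* Look up the Linux irq for hwirq; 0 means no mapping in the linear table. */
static unsigned int demo_find_mapping(const struct demo_revmap_domain *d,
                                      unsigned int hwirq)
{
    if (hwirq < d->revmap_size)
        return d->linear_revmap[hwirq];
    return 0;
}
```

A result of 0 is what __handle_domain_irq rejects with its !irq check before calling generic_handle_irq.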

 

generic_handle_irq takes a Linux irq number; irq_to_desc() finds the struct irq_desc that belongs to it.

It then calls irq_desc->handle_irq to handle the interrupt.

int generic_handle_irq(unsigned int irq)
{
    struct irq_desc *desc = irq_to_desc(irq);

    if (!desc)
        return -EINVAL;
    generic_handle_irq_desc(irq, desc);
    return 0;
}

static inline void generic_handle_irq_desc(unsigned int irq, struct irq_desc *desc)
{
    desc->handle_irq(irq, desc);
}

 

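Where desc->handle_irq comes from: it is chosen by gic_irq_domain_map, according to the hwirq number, when each interrupt is set up.

 

gic_irq_domain_map picks the flow handler from the hardware interrupt number: hwirq below 32 gets handle_percpu_devid_irq, everything else gets handle_fasteoi_irq.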

void
__irq_set_handler(unsigned int irq, irq_flow_handler_t handle, int is_chained,
          const char *name)
{
...
    desc->handle_irq = handle;
    desc->name = name;
...
}

 

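 handle_percpu_devid_irq handles the per-CPU SGI/PPI interrupts with hardware IDs 0~31: it first acknowledges the interrupt (through chip->irq_ack, if the chip provides one), then runs the handler, and finally sends EOI.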

void handle_percpu_devid_irq(unsigned int irq, struct irq_desc *desc)
{
    struct irq_chip *chip = irq_desc_get_chip(desc);
    struct irqaction *action = desc->action;
    void *dev_id = raw_cpu_ptr(action->percpu_dev_id);
    irqreturn_t res;

    kstat_incr_irqs_this_cpu(irq, desc);

    if (chip->irq_ack)
        chip->irq_ack(&desc->irq_data);

    trace_irq_handler_entry(irq, action);
    res = action->handler(irq, dev_id);
    trace_irq_handler_exit(irq, action, res);

    if (chip->irq_eoi)
        chip->irq_eoi(&desc->irq_data);-------------------Calls gic_eoi_irq()
}

 

irq_enter and irq_exit explicitly maintain the hardirq count; everything between the two runs in interrupt context.

/*
 * Enter an interrupt context.
 */
void irq_enter(void)
{
    rcu_irq_enter();
    if (is_idle_task(current) && !in_interrupt()) {
        /*
         * Prevent raise_softirq from needlessly waking up ksoftirqd
         * here, as softirq will be serviced on return from interrupt.
         */
        local_bh_disable();
        tick_irq_enter();
        _local_bh_enable();
    }

    __irq_enter();---------------------------------------------Explicitly increments the hardirq count
}

#define __irq_enter()                    \
    do {                        \
        account_irq_enter_time(current);    \
        preempt_count_add(HARDIRQ_OFFSET);    \----------------Explicitly increments the hardirq count
        trace_hardirq_enter();            \
    } while (0)


void irq_exit(void)
{
#ifndef __ARCH_IRQ_EXIT_IRQS_DISABLED
    local_irq_disable();
#else
    WARN_ON_ONCE(!irqs_disabled());
#endif

    account_irq_exit_time(current);
    preempt_count_sub(HARDIRQ_OFFSET);---------------------------Explicitly decrements the hardirq count
    if (!in_interrupt() && local_softirq_pending())--------------No longer in interrupt context and softirqs are pending: process them.
        invoke_softirq();

    tick_irq_exit();
    rcu_irq_exit();
    trace_hardirq_exit(); /* must be last! */
}

 

 

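4.2.1 Interrupt context

Whether the current code runs in interrupt context or in process context is determined from preempt_count, a field of struct thread_info.

preempt_count is 32 bits wide; from the least significant bit upward the fields are: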

#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	1

 

 

#define hardirq_count()    (preempt_count() & HARDIRQ_MASK)-----------------Hard interrupt count
#define softirq_count()    (preempt_count() & SOFTIRQ_MASK)-----------------Softirq count
#define irq_count()    (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK \----Combined NMI, hard interrupt, and softirq counts
                 | NMI_MASK))

/*
 * Are we doing bottom half or hardware interrupt processing?
 *
 * in_irq()       - We're in (hard) IRQ context
 * in_softirq()   - We have BH disabled, or are processing softirqs
 * in_interrupt() - We're in NMI,IRQ,SoftIRQ context or have BH disabled
 * in_serving_softirq() - We're in softirq context
 * in_nmi()       - We're in NMI context
 * in_task()      - We're in task context
 *
 * Note: due to the BH disabled confusion: in_softirq(),in_interrupt() really
 *       should not be used in new code.
 */
#define in_irq()        (hardirq_count())----------------------------In hard interrupt context?
#define in_softirq()        (softirq_count())------------------------Processing a softirq, or BH disabled?
#define in_interrupt()        (irq_count())--------------------------In NMI, hard interrupt, or softirq context (or any combination), or BH disabled?
#define in_serving_softirq()    (softirq_count() & SOFTIRQ_OFFSET)---Actually serving a softirq?
#define in_nmi()        (preempt_count() & NMI_MASK)-----------------In NMI context?
#define in_task()        (!(preempt_count() & \
                   (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))------In process (task) context?

 

Questions to consider: what is the difference between in_softirq() and in_serving_softirq()? And why does in_interrupt() test SOFTIRQ_MASK while in_task() tests SOFTIRQ_OFFSET?
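
One way to explore the bit layout is to rebuild the macros in user space and feed them sample counter values. The sketch below is an illustration, not kernel code: the DEMO_ names are made up, the helpers take the counter as a parameter instead of reading thread_info, and it relies on the assumption that serving a softirq adds SOFTIRQ_OFFSET while local_bh_disable() adds twice that amount.

```c
#include <stdbool.h>

/* Field widths as listed above; the shifts follow from stacking the fields. */
#define DEMO_PREEMPT_BITS 8
#define DEMO_SOFTIRQ_BITS 8
#define DEMO_HARDIRQ_BITS 4
#define DEMO_NMI_BITS     1

#define DEMO_SOFTIRQ_SHIFT (DEMO_PREEMPT_BITS)
#define DEMO_HARDIRQ_SHIFT (DEMO_SOFTIRQ_SHIFT + DEMO_SOFTIRQ_BITS)
#define DEMO_NMI_SHIFT     (DEMO_HARDIRQ_SHIFT + DEMO_HARDIRQ_BITS)

#define DEMO_FIELD_MASK(bits, shift) (((1UL << (bits)) - 1) << (shift))
#define DEMO_SOFTIRQ_MASK DEMO_FIELD_MASK(DEMO_SOFTIRQ_BITS, DEMO_SOFTIRQ_SHIFT) /* 0x0000ff00 */
#define DEMO_HARDIRQ_MASK DEMO_FIELD_MASK(DEMO_HARDIRQ_BITS, DEMO_HARDIRQ_SHIFT) /* 0x000f0000 */
#define DEMO_NMI_MASK     DEMO_FIELD_MASK(DEMO_NMI_BITS, DEMO_NMI_SHIFT)         /* 0x00100000 */

#define DEMO_SOFTIRQ_OFFSET         (1UL << DEMO_SOFTIRQ_SHIFT)   /* serving a softirq */
#define DEMO_SOFTIRQ_DISABLE_OFFSET (2 * DEMO_SOFTIRQ_OFFSET)     /* BH disabled */
#define DEMO_HARDIRQ_OFFSET         (1UL << DEMO_HARDIRQ_SHIFT)

static bool demo_in_irq(unsigned long pc)
{
    return (pc & DEMO_HARDIRQ_MASK) != 0;
}

static bool demo_in_softirq(unsigned long pc)
{
    return (pc & DEMO_SOFTIRQ_MASK) != 0;
}

static bool demo_in_serving_softirq(unsigned long pc)
{
    return (pc & DEMO_SOFTIRQ_MASK & DEMO_SOFTIRQ_OFFSET) != 0;
}

static bool demo_in_interrupt(unsigned long pc)
{
    return (pc & (DEMO_HARDIRQ_MASK | DEMO_SOFTIRQ_MASK | DEMO_NMI_MASK)) != 0;
}

static bool demo_in_task(unsigned long pc)
{
    return (pc & (DEMO_NMI_MASK | DEMO_HARDIRQ_MASK | DEMO_SOFTIRQ_OFFSET)) == 0;
}
```

With a counter of DEMO_SOFTIRQ_DISABLE_OFFSET (a task that merely disabled bottom halves), demo_in_softirq() and demo_in_interrupt() are true while demo_in_serving_softirq() is false and demo_in_task() is still true. Under the stated assumption, that is the distinction the two questions point at.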

 

4.3 handle_fasteoi_irq

handle_fasteoi_irq handles SPI interrupts and delegates the main work to handle_irq_event().

handle_irq_event_percpu() first runs action->handler and, if required, wakes the interrupt kernel thread, which runs action->thread_fn.

void
handle_fasteoi_irq(unsigned int irq, struct irq_desc *desc)
{
    struct irq_chip *chip = desc->irq_data.chip;

    raw_spin_lock(&desc->lock);

    if (!irq_may_run(desc))
        goto out;

    desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
    kstat_incr_irqs_this_cpu(irq, desc);

    /*
     * If its disabled or no action available
     * then mask it and get out of here:
     */
    if (unlikely(!desc->action || irqd_irq_disabled(&desc->irq_data))) {---If the interrupt has no action descriptor or is disabled (IRQD_IRQ_DISABLED), mark it IRQS_PENDING and mask it with mask_irq().
        desc->istate |= IRQS_PENDING;
        mask_irq(desc);
        goto out;
    }

    if (desc->istate & IRQS_ONESHOT)----------------------------------------An IRQS_ONESHOT interrupt must not nest, so mask_irq() masks the interrupt source.
        mask_irq(desc);

    preflow_handler(desc);--------------------------------------------------Depends on whether preflow_handler() is defined
    handle_irq_event(desc);

    cond_unmask_eoi_irq(desc, chip);----------------------------------------Depending on conditions, calls unmask_irq() to unmask the interrupt and/or irq_chip->irq_eoi to send EOI, telling the GIC that handling is done.

    raw_spin_unlock(&desc->lock);
    return;
out:
    if (!(chip->flags & IRQCHIP_EOI_IF_HANDLED))
        chip->irq_eoi(&desc->irq_data);
    raw_spin_unlock(&desc->lock);
}

 

handle_irq_event calls handle_irq_event_percpu, which runs action->handler() and, if required, wakes the kernel interrupt thread to run action->thread_fn.

irqreturn_t handle_irq_event(struct irq_desc *desc)
{
    struct irqaction *action = desc->action;
    irqreturn_t ret;

    desc->istate &= ~IRQS_PENDING;--------------------------Clear the IRQS_PENDING flag
    irqd_set(&desc->irq_data, IRQD_IRQ_INPROGRESS);---------Set IRQD_IRQ_INPROGRESS: the hard interrupt is being handled.
    raw_spin_unlock(&desc->lock);

    ret = handle_irq_event_percpu(desc, action);

    raw_spin_lock(&desc->lock);
    irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);-------Clear IRQD_IRQ_INPROGRESS: hard interrupt handling is finished.
    return ret;
}

irqreturn_t
handle_irq_event_percpu(struct irq_desc *desc, struct irqaction *action)
{
    irqreturn_t retval = IRQ_NONE;
    unsigned int flags = 0, irq = desc->irq_data.irq;

    do {----------------------------------------------------Walk the action list of the interrupt descriptor, calling each action's primary handler, action->handler.
        irqreturn_t res;

        trace_irq_handler_entry(irq, action);
        res = action->handler(irq, action->dev_id);---------Call the handler of this struct irqaction.
        trace_irq_handler_exit(irq, action, res);

        if (WARN_ONCE(!irqs_disabled(),"irq %u handler %pF enabled interrupts\n",
                  irq, action->handler))
            local_irq_disable();

        switch (res) {
        case IRQ_WAKE_THREAD:-------------------------------Wake the interrupt kernel thread
            /*
             * Catch drivers which return WAKE_THREAD but
             * did not set up a thread function
             */
            if (unlikely(!action->thread_fn)) {
                warn_no_thread(irq, action);----------------Print a warning that no thread function was set up
                break;
            }

            __irq_wake_thread(desc, action);----------------Wake the kernel thread for this interrupt

            /* Fall through to add to randomness */
        case IRQ_HANDLED:-----------------------------------Handling is complete.
            flags |= action->flags;
            break;

        default:
            break;
        }

        retval |= res;
        action = action->next;
    } while (action);

    add_interrupt_randomness(irq, flags);

    if (!noirqdebug)
        note_interrupt(irq, desc, retval);
    return retval;
}
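
The loop over a shared interrupt's action list can be modeled in user space as below. This is a sketch under assumptions: the struct, the two sample handlers, and the woken counter (standing in for __irq_wake_thread()) are hypothetical; only the return-value encoding (none = 0, handled = 1, wake thread = 2) and the OR-accumulation follow the kernel listing.

```c
#include <stddef.h>

enum demo_irqreturn {
    DEMO_IRQ_NONE        = 0,
    DEMO_IRQ_HANDLED     = 1,
    DEMO_IRQ_WAKE_THREAD = 2
};

struct demo_action {
    enum demo_irqreturn (*handler)(void *dev_id);   /* primary handler */
    void *dev_id;
    int has_thread_fn;      /* nonzero if a thread function was registered */
    int woken;              /* counts simulated thread wakeups */
    struct demo_action *next;
};

/* Sample primary handler: the interrupt did not come from this device. */
static enum demo_irqreturn demo_not_mine(void *dev_id)
{
    (void)dev_id;
    return DEMO_IRQ_NONE;
}

/* Sample primary handler: defer the real work to the interrupt thread. */
static enum demo_irqreturn demo_defer_to_thread(void *dev_id)
{
    (void)dev_id;
    return DEMO_IRQ_WAKE_THREAD;
}

/* Call every action on the line and OR the results together. */
static unsigned int demo_handle_event(struct demo_action *action)
{
    unsigned int retval = DEMO_IRQ_NONE;

    for (; action; action = action->next) {
        enum demo_irqreturn res = action->handler(action->dev_id);

        if (res == DEMO_IRQ_WAKE_THREAD && action->has_thread_fn)
            action->woken++;            /* stands in for __irq_wake_thread() */
        retval |= (unsigned int)res;
    }
    return retval;
}
```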

 

4.3.1 Waking the interrupt kernel thread

__irq_wake_thread wakes the kernel thread that belongs to the interrupt.

void __irq_wake_thread(struct irq_desc *desc, struct irqaction *action)
{
    /*
     * In case the thread crashed and was killed we just pretend that
     * we handled the interrupt. The hardirq handler has disabled the
     * device interrupt, so no irq storm is lurking.
     */
    if (action->thread->flags & PF_EXITING)
        return;

    /*
     * Wake up the handler thread for this action. If the
     * RUNTHREAD bit is already set, nothing to do.
     */
    if (test_and_set_bit(IRQTF_RUNTHREAD, &action->thread_flags))--------------If IRQTF_RUNTHREAD is already set, a wakeup is already in progress and the function returns immediately.
        return;

    desc->threads_oneshot |= action->thread_mask;--------------------For a shared interrupt, each action owns one bit in thread_mask. Each bit of threads_oneshot marks an interrupt thread of a shared oneshot interrupt that is still running.

    atomic_inc(&desc->threads_active);-------------------------------Count of active interrupt threads

    wake_up_process(action->thread);---------------------------------Wake the action's kernel thread
}

 

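4.3.2 Creating the kernel interrupt thread

When an interrupt is registered and the conditions are met, a kernel interrupt thread named irq/xx-xx is created with irq_thread as its thread function. Its priority is 49 (99-50) and its scheduling policy is SCHED_FIFO.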

static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
...
    /*
     * Create a handler thread when a thread function is supplied
     * and the interrupt does not nest into another interrupt
     * thread.
     */
    if (new->thread_fn && !nested) {
        struct task_struct *t;
        static const struct sched_param param = {
            .sched_priority = MAX_USER_RT_PRIO/2,-------------------------------Sets the priority of the irq kernel thread; the prio shown in /proc/xxx/sched is MAX_RT_PRIO-1-sched_priority.
        };

        t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
                   new->name);--------------------------------------------------Create a kernel thread named irq/xxx-xxx whose thread function is irq_thread.
...
        sched_setscheduler_nocheck(t, SCHED_FIFO, &param);----------------------Set the scheduling policy to SCHED_FIFO.

        /*
         * We keep the reference to the task struct even if
         * the thread dies to avoid that the interrupt code
         * references an already freed task_struct.
         */
        get_task_struct(t);
        new->thread = t;-------------------------------------------------------Associate the new thread with the irqaction

        set_bit(IRQTF_AFFINITY, &new->thread_flags);--------------------------Request that the interrupt thread set its CPU affinity
    }
...
}
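
The priority figure of 49 follows from two constants. A small worked example, assuming MAX_USER_RT_PRIO and MAX_RT_PRIO are both 100 as in mainline kernels of this era (the DEMO_ names are illustrative):

```c
#define DEMO_MAX_USER_RT_PRIO 100
#define DEMO_MAX_RT_PRIO      DEMO_MAX_USER_RT_PRIO

/* sched_priority given to the irq thread by __setup_irq(). */
static int demo_irq_thread_sched_priority(void)
{
    return DEMO_MAX_USER_RT_PRIO / 2;
}

/* Kernel-internal prio as shown in /proc/<pid>/sched for a real-time task. */
static int demo_rt_prio(int sched_priority)
{
    return DEMO_MAX_RT_PRIO - 1 - sched_priority;
}
```

So the thread gets sched_priority 50, which appears as prio 49 (99-50).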

 

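4.3.3 Running the kernel interrupt thread

irq_thread is the thread function of the interrupt thread; it waits in irq_wait_for_interrupt().

irq_wait_for_interrupt() checks the IRQTF_RUNTHREAD flag; if it is not set, the thread calls schedule() to give up the CPU and sleep.

Only after __irq_wake_thread() has set IRQTF_RUNTHREAD and called wake_up_process() does irq_wait_for_interrupt() return 0.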

 

static int irq_thread(void *data)
{
    struct callback_head on_exit_work;
    struct irqaction *action = data;
    struct irq_desc *desc = irq_to_desc(action->irq);
    irqreturn_t (*handler_fn)(struct irq_desc *desc,
            struct irqaction *action);

    if (force_irqthreads && test_bit(IRQTF_FORCED_THREAD,
                    &action->thread_flags))
        handler_fn = irq_forced_thread_fn;
    else
        handler_fn = irq_thread_fn;

    init_task_work(&on_exit_work, irq_thread_dtor);
    task_work_add(current, &on_exit_work, false);

    irq_thread_check_affinity(desc, action);

    while (!irq_wait_for_interrupt(action)) {
        irqreturn_t action_ret;

        irq_thread_check_affinity(desc, action);

        action_ret = handler_fn(desc, action);-----------Run the interrupt thread function
        if (action_ret == IRQ_HANDLED)
            atomic_inc(&desc->threads_handled);----------Increment the threads_handled count

        wake_threads_waitq(desc);------------------------Wake the wait_for_threads wait queue
    }

    /*
     * This is the regular exit path. __free_irq() is stopping the
     * thread via kthread_stop() after calling
     * synchronize_irq(). So neither IRQTF_RUNTHREAD nor the
     * oneshot mask bit can be set. We cannot verify that as we
     * cannot touch the oneshot mask at this point anymore as
     * __setup_irq() might have given out currents thread_mask
     * again.
     */
    task_work_cancel(current, irq_thread_dtor);
    return 0;
}


static int irq_wait_for_interrupt(struct irqaction *action)
{
    set_current_state(TASK_INTERRUPTIBLE);

    while (!kthread_should_stop()) {

        if (test_and_clear_bit(IRQTF_RUNTHREAD,
                       &action->thread_flags)) {------------If IRQTF_RUNTHREAD is set in thread_flags, clear it, set the task state to TASK_RUNNING, and return 0. This pairs with __irq_wake_thread(), which sets IRQTF_RUNTHREAD.
            __set_current_state(TASK_RUNNING);
            return 0;
        }
        schedule();-----------------------------------------Give up the CPU and sleep here until woken
        set_current_state(TASK_INTERRUPTIBLE);
    }
    __set_current_state(TASK_RUNNING);
    return -1;
}
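
The handshake between __irq_wake_thread() and irq_wait_for_interrupt() rests on one flag bit that is set with test-and-set and consumed with test-and-clear. The sketch below models only that flag protocol with C11 atomics; it does not model sleeping or scheduling, and the bit position and demo_ names are assumptions.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_RUNTHREAD_BIT 1u   /* stands in for IRQTF_RUNTHREAD */

/* Atomically set the bit; returns true if it was already set. */
static bool demo_test_and_set(atomic_uint *flags)
{
    return (atomic_fetch_or(flags, DEMO_RUNTHREAD_BIT) & DEMO_RUNTHREAD_BIT) != 0;
}

/* Atomically clear the bit; returns true if it was set. */
static bool demo_test_and_clear(atomic_uint *flags)
{
    return (atomic_fetch_and(flags, ~DEMO_RUNTHREAD_BIT) & DEMO_RUNTHREAD_BIT) != 0;
}

/* Hard-irq side: request a thread run, waking the thread at most once. */
static bool demo_wake_thread(atomic_uint *flags, int *wakeups)
{
    if (demo_test_and_set(flags))
        return false;       /* a run is already pending, nothing to do */
    (*wakeups)++;           /* stands in for wake_up_process() */
    return true;
}

/* Thread side: true if there is work, consuming the pending request. */
static bool demo_thread_should_run(atomic_uint *flags)
{
    return demo_test_and_clear(flags);
}
```

Two back-to-back interrupts before the thread runs produce a single wakeup and a single pass through the thread function, which is the behavior the early return in __irq_wake_thread() implements.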

static irqreturn_t irq_thread_fn(struct irq_desc *desc,
        struct irqaction *action)
{
    irqreturn_t ret;

    ret = action->thread_fn(action->irq, action->dev_id);---Run the interrupt thread function, i.e. the thread_fn argument passed to request_threaded_irq().
    irq_finalize_oneshot(desc, action);---------------------Finish up a oneshot interrupt, mainly by unmasking it.
    return ret;
}

irq_finalize_oneshot() performs the final steps for a oneshot interrupt.

static void irq_finalize_oneshot(struct irq_desc *desc,
                 struct irqaction *action)
{
    if (!(desc->istate & IRQS_ONESHOT) ||
        action->handler == irq_forced_secondary_handler)
        return;
again:
    chip_bus_lock(desc);
    raw_spin_lock_irq(&desc->lock);

    /*
     * Implausible though it may be we need to protect us against
     * the following scenario:
     *
     * The thread is faster done than the hard interrupt handler
     * on the other CPU. If we unmask the irq line then the
     * interrupt can come in again and masks the line, leaves due
     * to IRQS_INPROGRESS and the irq line is masked forever.
     *
     * This also serializes the state of shared oneshot handlers
     * versus "desc->threads_onehsot |= action->thread_mask;" in
     * irq_wake_thread(). See the comment there which explains the
     * serialization.
     */
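    if (unlikely(irqd_irq_inprogress(&desc->irq_data))) {-----------Must wait until the hard interrupt handler clears IRQD_IRQ_INPROGRESS (see handle_irq_event()). The flag means the hard interrupt handler is still processing the interrupt and is only cleared once that processing has finished.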
        raw_spin_unlock_irq(&desc->lock);
        chip_bus_sync_unlock(desc);
        cpu_relax();
        goto again;
    }

    /*
     * Now check again, whether the thread should run. Otherwise
     * we would clear the threads_oneshot bit of this thread which
     * was just set.
     */
    if (test_bit(IRQTF_RUNTHREAD, &action->thread_flags))
        goto out_unlock;

    desc->threads_oneshot &= ~action->thread_mask;

    if (!desc->threads_oneshot && !irqd_irq_disabled(&desc->irq_data) &&
        irqd_irq_masked(&desc->irq_data))
        unmask_threaded_irq(desc);----------------------------------Issue EOI or unmask the interrupt.

out_unlock:
    raw_spin_unlock_irq(&desc->lock);
    chip_bus_sync_unlock(desc);
}
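
The role of threads_oneshot for a shared oneshot line can be shown with plain bit operations. This is a simplified sketch: struct demo_oneshot_desc and its functions are hypothetical, and locking, the in-progress retry, and the IRQTF_RUNTHREAD re-check are left out.

```c
struct demo_oneshot_desc {
    unsigned long threads_oneshot;   /* one bit per thread that still has work */
    int masked;                      /* 1 while the interrupt line is masked */
};

/* Hard-irq side: the line is masked and this action's thread is woken. */
static void demo_oneshot_wake(struct demo_oneshot_desc *d, unsigned long thread_mask)
{
    d->masked = 1;
    d->threads_oneshot |= thread_mask;
}

/* Thread side: this thread is done; unmask only when no thread is left. */
static void demo_oneshot_finalize(struct demo_oneshot_desc *d, unsigned long thread_mask)
{
    d->threads_oneshot &= ~thread_mask;
    if (!d->threads_oneshot && d->masked)
        d->masked = 0;               /* stands in for unmask_threaded_irq() */
}
```

With two actions sharing the line, the interrupt stays masked after the first thread finishes and is unmasked only when the second one finishes too.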

 

This completes the handling of one interrupt.

4.4 How is IRQS_ONESHOT kept from nesting?

  

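5. Registering interrupts

 

5.1 Interrupts, threads, and threaded interrupts

 Interrupt handling consists of the top half, the hardware interrupt handler, and the bottom-half mechanisms: softirqs, tasklets, workqueues, and threaded interrupts.

When a peripheral raises an interrupt, the kernel runs a function in response; this function is usually called the interrupt handler or interrupt service routine.

The top half runs in interrupt context and is expected to finish quickly and leave the interrupt.

 

Threaded interrupts are a feature that came out of the real-time Linux project; the goal is to reduce the impact of interrupt handling on real-time latency.

In the Linux kernel, interrupts have the highest priority: whenever one occurs, the kernel suspends what it is doing and handles it, and only after all pending interrupts and softirqs have been processed does it schedule processes. As a result, real-time tasks may not be served in time.

Interrupt context always preempts process context, and interrupt context covers not only interrupt handlers but also softirqs, tasklets, and so on. This makes interrupt context one of the biggest challenges in improving Linux real-time behavior.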

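5.2 Interrupt registration API

 

IRQF_* flags are passed to request_threaded_irq() to describe the properties of the interrupt being requested.

IRQS_* flags live in the istate member of struct irq_desc, i.e. core_internal_state__do_not_mess_with_it.

IRQD_* flags are kept in the state_use_accessors member of struct irq_data and usually describe low-level interrupt state.

A note on IRQF_ONESHOT: the interrupt may only be re-enabled after hard interrupt handling has finished; for a threaded interrupt the interrupt line stays masked until every thread_fn on that line has run to completion.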

#define IRQF_TRIGGER_NONE    0x00000000
#define IRQF_TRIGGER_RISING    0x00000001---------------------------Rising-edge triggered
#define IRQF_TRIGGER_FALLING    0x00000002--------------------------Falling-edge triggered
#define IRQF_TRIGGER_HIGH    0x00000004-----------------------------High-level triggered
#define IRQF_TRIGGER_LOW    0x00000008------------------------------Low-level triggered
#define IRQF_TRIGGER_MASK    (IRQF_TRIGGER_HIGH | IRQF_TRIGGER_LOW | \
                 IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING)--------The four trigger types
#define IRQF_TRIGGER_PROBE    0x00000010

#define IRQF_SHARED        0x00000080-------------------------------Several devices share one interrupt number
#define IRQF_PROBE_SHARED    0x00000100-----------------------------Set by callers that expect sharing mismatches to occur
#define __IRQF_TIMER        0x00000200------------------------------Marks a timer interrupt
#define IRQF_PERCPU        0x00000400-------------------------------Interrupt belongs to one specific CPU
#define IRQF_NOBALANCING    0x00000800------------------------------Exclude this interrupt from IRQ balancing across CPUs
#define IRQF_IRQPOLL        0x00001000------------------------------Interrupt is used for polling
#define IRQF_ONESHOT        0x00002000------------------------------One-shot interrupt; nesting is not allowed.
#define IRQF_NO_SUSPEND        0x00004000---------------------------Do not disable this interrupt during system suspend
#define IRQF_FORCE_RESUME    0x00008000-----------------------------Force this interrupt to be enabled on system resume
#define IRQF_NO_THREAD        0x00010000----------------------------This interrupt must not be threaded
#define IRQF_EARLY_RESUME    0x00020000
#define IRQF_COND_SUSPEND    0x00040000

#define IRQF_TIMER        (__IRQF_TIMER | IRQF_NO_SUSPEND | IRQF_NO_THREAD)


enum {
    IRQS_AUTODETECT        = 0x00000001,-------------------Autodetection in progress
    IRQS_SPURIOUS_DISABLED    = 0x00000002,----------------Treated as a spurious interrupt and disabled
    IRQS_POLL_INPROGRESS    = 0x00000008,------------------Actions are currently being called by polling
    IRQS_ONESHOT        = 0x00000020,----------------------One-shot; derived from IRQF_ONESHOT. Needs careful handling once the interrupt thread has finished, see irq_finalize_oneshot().
    IRQS_REPLAY        = 0x00000040,-----------------------Interrupt is being resent
    IRQS_WAITING        = 0x00000080,----------------------Interrupt is waiting
    IRQS_PENDING        = 0x00000200,----------------------Interrupt is pending
    IRQS_SUSPENDED        = 0x00000800,--------------------Interrupt is suspended
};


enum {
    IRQD_TRIGGER_MASK        = 0xf,-------------------------Trigger type of the interrupt
    IRQD_SETAFFINITY_PENDING    = (1 <<  8),
    IRQD_NO_BALANCING        = (1 << 10),
    IRQD_PER_CPU            = (1 << 11),
    IRQD_AFFINITY_SET        = (1 << 12),
    IRQD_LEVEL            = (1 << 13),
    IRQD_WAKEUP_STATE        = (1 << 14),
    IRQD_MOVE_PCNTXT        = (1 << 15),
    IRQD_IRQ_DISABLED        = (1 << 16),--------------------Interrupt is disabled
    IRQD_IRQ_MASKED            = (1 << 17),------------------Interrupt is masked
    IRQD_IRQ_INPROGRESS        = (1 << 18),------------------Interrupt is currently being handled
    IRQD_WAKEUP_ARMED        = (1 << 19),
    IRQD_FORWARDED_TO_VCPU        = (1 << 20),
};

 

 

struct irqaction is the per-interrupt action descriptor.

struct irqaction {
    irq_handler_t        handler;-----------Pointer to the primary handler
    void            *dev_id;----------------Argument passed to the interrupt handler
    void __percpu        *percpu_dev_id;
    struct irqaction    *next;
    irq_handler_t        thread_fn;---------Pointer to the interrupt thread function
    struct task_struct    *thread;----------task_struct of the interrupt thread
    unsigned int        irq;----------------Linux interrupt number
    unsigned int        flags;--------------IRQF_* flags used when the interrupt was registered
    unsigned long        thread_flags;------Flags related to the interrupt thread
    unsigned long        thread_mask;-------For a shared interrupt, each action is represented by one bit.
    const char        *name;----------------Name of the interrupt; also used in the interrupt thread's name
    struct proc_dir_entry    *dir;
} ____cacheline_internodealigned_in_smp;

 

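request_irq registers an interrupt by calling request_threaded_irq; it merely lacks the thread_fn parameter. That is also the difference between the two: request_irq cannot register a threaded interrupt.

irq: the Linux interrupt number, not the hardware interrupt number.

handler: the primary handler, i.e. the handler passed to request_irq.

thread_fn: the handler function for the threaded part of the interrupt.

irqflags: interrupt flags, see the IRQF_* descriptions.

devname: name of the interrupt.

dev_id: argument passed to the interrupt handler.

handler and thread_fn are assigned to action->handler and action->thread_fn respectively. The possible combinations are: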

  handler thread_fn  
1 先执行handler,然后条件执行thread_fn。
2 × 等同于request_irq()
3 × handler=irq_default_primary_handler
4 × × 返回-EINVAL

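Many request_threaded_irq() callers use combination 3: irq_default_primary_handler() returns IRQ_WAKE_THREAD and hands the work over to thread_fn.

Combination 2 is equivalent to request_irq().

Combination 4 is not allowed, because the interrupt would never be handled.

Combination 1 is the most involved: depending on the situation, handler returns either IRQ_WAKE_THREAD (wake the interrupt kernel thread) or IRQ_HANDLED (the interrupt is fully handled and the thread does not need to be woken).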
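
Combination 1 can be sketched in userspace as follows. This is only an illustration under stated assumptions: the device model (`struct demo_dev`, `DEMO_INLINE_LIMIT`) and the `demo_*` functions are hypothetical, and `demo_dispatch()` merely stands in for the core code that wakes the thread when the primary handler asks for it.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 } irqreturn_t;

/* Hypothetical device: small events are handled inline, bulk data is not. */
struct demo_dev {
    bool irq_asserted;
    int  bytes_ready;
    bool thread_woken;
    int  bytes_consumed;
};

#define DEMO_INLINE_LIMIT 16

/* Primary handler: hard-IRQ context, so it only does quick work. */
static irqreturn_t demo_primary(int irq, void *dev_id)
{
    struct demo_dev *dev = dev_id;

    (void)irq;
    if (!dev->irq_asserted)
        return IRQ_NONE;
    dev->irq_asserted = false;                 /* acknowledge the device */
    if (dev->bytes_ready <= DEMO_INLINE_LIMIT) {
        dev->bytes_consumed += dev->bytes_ready;
        dev->bytes_ready = 0;
        return IRQ_HANDLED;                    /* done, no thread needed */
    }
    return IRQ_WAKE_THREAD;                    /* defer the heavy part */
}

/* Threaded handler: in the real kernel this part may sleep. */
static irqreturn_t demo_thread_fn(int irq, void *dev_id)
{
    struct demo_dev *dev = dev_id;

    (void)irq;
    dev->bytes_consumed += dev->bytes_ready;
    dev->bytes_ready = 0;
    dev->thread_woken = false;
    return IRQ_HANDLED;
}

/* Stand-in for the core: wake the thread only when the handler asks. */
static irqreturn_t demo_dispatch(int irq, struct demo_dev *dev)
{
    irqreturn_t ret = demo_primary(irq, dev);

    if (ret == IRQ_WAKE_THREAD)
        dev->thread_woken = true;
    return ret;
}
```

The point of this split is that the cost of the thread wakeup is only paid when the primary handler cannot finish the job itself.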

After checking its arguments, request_threaded_irq() allocates and fills in a struct irqaction, then hands the registration work over to __setup_irq().

static inline int __must_check
request_irq(unsigned int irq, irq_handler_t handler, unsigned long flags,
        const char *name, void *dev)
{
    return request_threaded_irq(irq, handler, NULL, flags, name, dev);
}

int request_threaded_irq(unsigned int irq, irq_handler_t handler,
             irq_handler_t thread_fn, unsigned long irqflags,
             const char *devname, void *dev_id)
{
...
    if (((irqflags & IRQF_SHARED) && !dev_id) ||-----------------------------A shared interrupt must pass dev_id so that the devices sharing the line can be told apart
        (!(irqflags & IRQF_SHARED) && (irqflags & IRQF_COND_SUSPEND)) ||
        ((irqflags & IRQF_NO_SUSPEND) && (irqflags & IRQF_COND_SUSPEND)))
        return -EINVAL;

    desc = irq_to_desc(irq);--------------------------------------------------Look up the interrupt descriptor struct irq_desc from the Linux IRQ number.
    if (!desc)
        return -EINVAL;
...
    if (!handler) {
        if (!thread_fn)
            return -EINVAL;---------------------------------------------------handler and thread_fn must not both be NULL
        handler = irq_default_primary_handler;--------------------------------No handler was given; irq_default_primary_handler() simply returns IRQ_WAKE_THREAD.
    }

    action = kzalloc(sizeof(struct irqaction), GFP_KERNEL);-------------------Allocate a struct irqaction and fill in its members
    if (!action)
        return -ENOMEM;

    action->handler = handler;
    action->thread_fn = thread_fn;
    action->flags = irqflags;
    action->name = devname;
    action->dev_id = dev_id;

    chip_bus_lock(desc);-------------------------------------------------------Calls desc->irq_data.chip->irq_bus_lock() for locking
    retval = __setup_irq(irq, desc, action);
    chip_bus_sync_unlock(desc);

    if (retval)
        kfree(action);
...
    return retval;
}
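
The argument checks above can be condensed into a userspace model. This is a sketch only: `demo_request_threaded_irq()`, `struct demo_action`, `demo_default_primary()` and `demo_isr()` are hypothetical names, `DEMO_IRQF_SHARED` is an arbitrary bit chosen for the demo, and the IRQF_COND_SUSPEND checks and the irq_desc lookup are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

#define DEMO_IRQF_SHARED 0x1UL   /* arbitrary demo bit, not the kernel value */

/* Hypothetical result of a successful registration. */
struct demo_action {
    irq_handler_t handler;
    irq_handler_t thread_fn;
    unsigned long flags;
    void *dev_id;
};

/* Model of the default primary handler: always defer to the thread. */
static irqreturn_t demo_default_primary(int irq, void *dev_id)
{
    (void)irq;
    (void)dev_id;
    return IRQ_WAKE_THREAD;
}

/* Sample driver handler used by callers of this sketch. */
static irqreturn_t demo_isr(int irq, void *dev_id)
{
    (void)irq;
    (void)dev_id;
    return IRQ_HANDLED;
}

/* Mirrors the four handler/thread_fn combinations and the dev_id check. */
static int demo_request_threaded_irq(struct demo_action *out,
                                     irq_handler_t handler,
                                     irq_handler_t thread_fn,
                                     unsigned long flags, void *dev_id)
{
    if ((flags & DEMO_IRQF_SHARED) && !dev_id)
        return -EINVAL;               /* shared line needs a dev_id */
    if (!handler) {
        if (!thread_fn)
            return -EINVAL;           /* combination 4 */
        handler = demo_default_primary;   /* combination 3 */
    }
    out->handler = handler;
    out->thread_fn = thread_fn;
    out->flags = flags;
    out->dev_id = dev_id;
    return 0;
}
```

A caller passing only `thread_fn` ends up with the default primary handler installed, which is exactly why that combination later requires IRQF_ONESHOT in __setup_irq().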

 

 

 

5.3 __setup_irq

 


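 __setup_irq() first checks its arguments and then creates the interrupt kernel thread if one is needed, dealing with nested interrupts, oneshot, and shared interrupts along the way.

It also sets the trigger type and enables the interrupt. Finally it wakes the interrupt kernel thread if there is one and creates the /proc entries for the interrupt.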

/*
 * Internal function to register an irqaction - typically used to
 * allocate special interrupts that are part of the architecture.
 */
static int
__setup_irq(unsigned int irq, struct irq_desc *desc, struct irqaction *new)
{
    struct irqaction *old, **old_ptr;
    unsigned long flags, thread_mask = 0;
    int ret, nested, shared = 0;
    cpumask_var_t mask;

    if (!desc)
        return -EINVAL;

    if (desc->irq_data.chip == &no_irq_chip)----------------------The interrupt controller was not initialized properly. For GICv2, gic_irq_domain_alloc() sets the chip to gic_chip.
        return -ENOSYS;
    if (!try_module_get(desc->owner))
        return -ENODEV;

    /*
     * Check whether the interrupt nests into another interrupt
     * thread.
     */
    nested = irq_settings_is_nested_thread(desc);-----------------For a descriptor marked _IRQ_NESTED_THREAD (nested interrupt), thread_fn is mandatory.
    if (nested) {
        if (!new->thread_fn) {
            ret = -EINVAL;
            goto out_mput;
        }
        /*
         * Replace the primary handler which was provided from
         * the driver for non nested interrupt handling by the
         * dummy function which warns when called.
         */
        new->handler = irq_nested_primary_handler;
    } else {
        if (irq_settings_can_thread(desc))-----------------------Check whether this interrupt may be threaded; if _IRQ_NOTHREAD is not set, it can be force-threaded.
            irq_setup_forced_threading(new);
    }

    /*
     * Create a handler thread when a thread function is supplied
     * and the interrupt does not nest into another interrupt
     * thread.
     */
    if (new->thread_fn && !nested) {-----------------------------For a threaded interrupt that is not nested, create a kernel thread: a real-time SCHED_FIFO thread with priority 50.
        struct task_struct *t;
        static const struct sched_param param = {
            .sched_priority = MAX_USER_RT_PRIO/2,
        };

        t = kthread_create(irq_thread, new, "irq/%d-%s", irq,
                   new->name);-----------------------------------The thread name is built from "irq", the interrupt number, and the interrupt name; the thread function is irq_thread().
        if (IS_ERR(t)) {
            ret = PTR_ERR(t);
            goto out_mput;
        }

        sched_setscheduler_nocheck(t, SCHED_FIFO, &param);

        get_task_struct(t);
        new->thread = t;

        set_bit(IRQTF_AFFINITY, &new->thread_flags);
    }

    if (!alloc_cpumask_var(&mask, GFP_KERNEL)) {
        ret = -ENOMEM;
        goto out_thread;
    }

    /*
     * Drivers are often written to work w/o knowledge about the
     * underlying irq chip implementation, so a request for a
     * threaded irq without a primary hard irq context handler
     * requires the ONESHOT flag to be set. Some irq chips like
     * MSI based interrupts are per se one shot safe. Check the
     * chip flags, so we can avoid the unmask dance at the end of
     * the threaded handler for those.
     */
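    if (desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)----------The irq chip is inherently one-shot safe (for example MSI based), so IRQF_ONESHOT is dropped from flags and the unmask step after the threaded handler is avoided.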
        new->flags &= ~IRQF_ONESHOT;

    raw_spin_lock_irqsave(&desc->lock, flags);
    old_ptr = &desc->action;
    old = *old_ptr;
    if (old) {-----------------------------------------------------old points to the list headed by desc->action. A non-NULL old means an action is already installed in this irq_desc, so this is a shared interrupt and shared = 1.
...
        /* add new interrupt at end of irq queue */
        do {
            /*
             * Or all existing action->thread_mask bits,
             * so we can find the next zero bit for this
             * new action.
             */
            thread_mask |= old->thread_mask;
            old_ptr = &old->next;
            old = *old_ptr;
        } while (old);
        shared = 1;
    }

    /*
     * Setup the thread mask for this irqaction for ONESHOT. For
     * !ONESHOT irqs the thread mask is 0 so we can avoid a
     * conditional in irq_wake_thread().
     */
    if (new->flags & IRQF_ONESHOT) {
        /*
         * Unlikely to have 32 resp 64 irqs sharing one line,
         * but who knows.
         */
        if (thread_mask == ~0UL) {
            ret = -EBUSY;
            goto out_mask;
        }

        new->thread_mask = 1 << ffz(thread_mask);

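    } else if (new->handler == irq_default_primary_handler &&---------Not IRQF_ONESHOT, and the handler is the default irq_default_primary_handler(). For a level-triggered interrupt, leaving the line unmasked before the source has been cleared easily causes an interrupt storm. This tells the driver author that, with no primary handler and an irq chip that is not one-shot safe, IRQF_ONESHOT must be set explicitly.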
           !(desc->irq_data.chip->flags & IRQCHIP_ONESHOT_SAFE)) {

        pr_err("Threaded irq requested with handler=NULL and !ONESHOT for irq %d\n",
               irq);
        ret = -EINVAL;
        goto out_mask;
    }

    if (!shared) {-------------------------------------------------Non-shared interrupt case
        ret = irq_request_resources(desc);
        if (ret) {
            pr_err("Failed to request resources for %s (irq %d) on irqchip %s\n",
                   new->name, irq, desc->irq_data.chip->name);
            goto out_mask;
        }

        init_waitqueue_head(&desc->wait_for_threads);

        /* Setup the type (level, edge polarity) if configured: */
        if (new->flags & IRQF_TRIGGER_MASK) {
            ret = __irq_set_trigger(desc, irq,-------------------Calls gic_chip->irq_set_type to set the trigger type.
                    new->flags & IRQF_TRIGGER_MASK);

            if (ret)
                goto out_mask;
        }

        desc->istate &= ~(IRQS_AUTODETECT | IRQS_SPURIOUS_DISABLED | \
                  IRQS_ONESHOT | IRQS_WAITING);
        irqd_clear(&desc->irq_data, IRQD_IRQ_INPROGRESS);---------Clear the IRQD_IRQ_INPROGRESS flag

        if (new->flags & IRQF_PERCPU) {
            irqd_set(&desc->irq_data, IRQD_PER_CPU);
            irq_settings_set_per_cpu(desc);
        }

        if (new->flags & IRQF_ONESHOT)
            desc->istate |= IRQS_ONESHOT;

        if (irq_settings_can_autoenable(desc))
            irq_startup(desc, true);
        else
            /* Undo nested disables: */
            desc->depth = 1;

        /* Exclude IRQ from balancing if requested */
        if (new->flags & IRQF_NOBALANCING) {
            irq_settings_set_no_balancing(desc);
            irqd_set(&desc->irq_data, IRQD_NO_BALANCING);
        }

        /* Set default affinity mask once everything is setup */
        setup_affinity(irq, desc, mask);

    } else if (new->flags & IRQF_TRIGGER_MASK) {
..
    }

    new->irq = irq;
    *old_ptr = new;

    irq_pm_install_action(desc, new);

    /* Reset broken irq detection when installing new handler */
    desc->irq_count = 0;
    desc->irqs_unhandled = 0;

    /*
     * Check whether we disabled the irq via the spurious handler
     * before. Reenable it and give it another chance.
     */
    if (shared && (desc->istate & IRQS_SPURIOUS_DISABLED)) {
        desc->istate &= ~IRQS_SPURIOUS_DISABLED;
        __enable_irq(desc, irq);
    }

    raw_spin_unlock_irqrestore(&desc->lock, flags);

    /*
     * Strictly no need to wake it up, but hung_task complains
     * when no hard interrupt wakes the thread up.
     */
    if (new->thread)
        wake_up_process(new->thread);------------------------------If the interrupt is threaded, wake its kernel thread. Each threaded interrupt action has its own thread.

    register_irq_proc(irq, desc);----------------------------------Create the /proc/irq/xxx/ directory and its entries.
    new->dir = NULL;
    register_handler_proc(irq, new);-------------------------------Create a directory named after action->name
    free_cpumask_var(mask);

    return 0;
...
}
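
The list walk and thread_mask assignment in the middle of __setup_irq() can be isolated into a small userspace sketch. The names `struct demo_action`, `demo_ffz()` and `demo_add_oneshot_action()` are hypothetical; the sketch assumes every action is ONESHOT, and it shifts `1UL` rather than `1` so that the shift is done at the full width of unsigned long.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down action: only what the list walk needs. */
struct demo_action {
    unsigned long thread_mask;
    struct demo_action *next;
};

/* Index of the first zero bit; the caller guarantees word != ~0UL. */
static unsigned int demo_ffz(unsigned long word)
{
    unsigned int bit = 0;

    assert(word != ~0UL);
    while (word & 1UL) {
        word >>= 1;
        bit++;
    }
    return bit;
}

/*
 * Append act at the tail of the shared list and give it the lowest
 * thread_mask bit that no existing action uses.
 */
static int demo_add_oneshot_action(struct demo_action **head,
                                   struct demo_action *act)
{
    struct demo_action **pp = head;
    unsigned long used = 0;

    while (*pp) {                     /* OR all existing masks together */
        used |= (*pp)->thread_mask;
        pp = &(*pp)->next;
    }
    if (used == ~0UL)
        return -EBUSY;                /* every bit is already taken */

    act->thread_mask = 1UL << demo_ffz(used);
    act->next = NULL;
    *pp = act;                        /* pp points at the tail's next slot */
    return 0;
}
```

Using a pointer to the `next` slot, as the kernel code does with old_ptr, lets the same assignment handle both the empty list and the append-at-tail case.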

 

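irq_setup_forced_threading() decides whether the current interrupt should be force-threaded; if so, it sets IRQTF_FORCED_THREAD in thread_flags to mark the interrupt as force-threaded.

The original primary handler is moved into the interrupt thread, and the primary handler is replaced by irq_default_primary_handler.

If a thread_fn was supplied as well, a secondary action is allocated whose primary handler points to irq_forced_secondary_handler(), and the original thread_fn is moved into the secondary action's thread.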

static int irq_setup_forced_threading(struct irqaction *new)
{
    if (!force_irqthreads)---------------------------------------------Forced threading is enabled when the kernel command line contains threadirqs, or when the real-time patch option CONFIG_PREEMPT_RT_BASE is turned on.
        return 0;
    if (new->flags & (IRQF_NO_THREAD | IRQF_PERCPU | IRQF_ONESHOT))----Flags that conflict with forced threading.
        return 0;

    new->flags |= IRQF_ONESHOT;----------------------------------------Every force-threaded interrupt gets IRQF_ONESHOT.

    if (new->handler != irq_default_primary_handler && new->thread_fn) {
        /* Allocate the secondary action */
        new->secondary = kzalloc(sizeof(struct irqaction), GFP_KERNEL);
        if (!new->secondary)
            return -ENOMEM;
        new->secondary->handler = irq_forced_secondary_handler;
        new->secondary->thread_fn = new->thread_fn;
        new->secondary->dev_id = new->dev_id;
        new->secondary->irq = new->irq;
        new->secondary->name = new->name;
    }
    /* Deal with the primary handler */
    set_bit(IRQTF_FORCED_THREAD, &new->thread_flags);
    new->thread_fn = new->handler;
    new->handler = irq_default_primary_handler;
    return 0;
}
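
The handler shuffle performed by irq_setup_forced_threading() can be modeled in userspace as below. This is a sketch under assumptions: the flag bits are arbitrary demo values, the `demo_*` names are hypothetical, the secondary action is supplied by the caller instead of being allocated with kzalloc(), and an int field stands in for the IRQTF_FORCED_THREAD bit.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { IRQ_NONE = 0, IRQ_HANDLED = 1, IRQ_WAKE_THREAD = 2 } irqreturn_t;
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

/* Arbitrary demo bits, not the kernel's IRQF_* values. */
#define DEMO_IRQF_NO_THREAD 0x1UL
#define DEMO_IRQF_PERCPU    0x2UL
#define DEMO_IRQF_ONESHOT   0x4UL

struct demo_action {
    irq_handler_t handler;
    irq_handler_t thread_fn;
    unsigned long flags;
    int forced;                      /* stands in for IRQTF_FORCED_THREAD */
    struct demo_action *secondary;
};

static irqreturn_t demo_default_primary(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_WAKE_THREAD;
}

/* Placeholder for the secondary action's primary handler. */
static irqreturn_t demo_forced_secondary(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_NONE;
}

/* Sample driver handlers. */
static irqreturn_t demo_isr(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_HANDLED;
}

static irqreturn_t demo_thread(int irq, void *dev_id)
{
    (void)irq; (void)dev_id;
    return IRQ_HANDLED;
}

/* Move the primary handler into the thread; keep thread_fn in a secondary. */
static void demo_force_thread(struct demo_action *act,
                              struct demo_action *spare, int force_irqthreads)
{
    if (!force_irqthreads)
        return;
    if (act->flags & (DEMO_IRQF_NO_THREAD | DEMO_IRQF_PERCPU | DEMO_IRQF_ONESHOT))
        return;

    act->flags |= DEMO_IRQF_ONESHOT;
    if (act->handler != demo_default_primary && act->thread_fn) {
        spare->handler = demo_forced_secondary;
        spare->thread_fn = act->thread_fn;
        act->secondary = spare;
    }
    act->forced = 1;
    act->thread_fn = act->handler;
    act->handler = demo_default_primary;
}
```

After the transformation the hard-IRQ part does nothing but request a thread wakeup, which is what makes the original top half preemptible.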

 

 

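setup_irq(), request_threaded_irq(), and request_irq() are all wrappers around __setup_irq().

request_irq() calls request_threaded_irq(), just without a thread_fn.

The difference between request_threaded_irq() and setup_irq() is that setup_irq() takes a struct irqaction as its argument, whereas request_threaded_irq() builds the struct irqaction internally.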

 

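6. The life of an interrupt

The analysis above shows how an interrupt is raised, handled, and finally completed. Here we use a tree-shaped code path to briefly walk through the life cycle of one interrupt.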

vector_irq()->vector_irq()->__irq_svc()

  ->svc_entry()--------------------------------------------------------------------------Save the interrupted context

  ->irq_handler()->gic_handle_irq()------------------------------------------------For the GIC interrupt controller this is gic_handle_irq(); here the path leaves architecture code and enters GIC-specific handling.

    ->GIC_CPU_INTACK--------------------------------------------------------------Read the IAR register to acknowledge the interrupt.

    ->handle_domain_irq()

      ->irq_enter()------------------------------------------------------------------------Enter hard-IRQ context

      ->generic_handle_irq()

        ->generic_handle_irq_desc()->handle_fasteoi_irq()--------------------Different interrupt types, distinguished by interrupt number, use different flow handlers; here the interrupt number is assumed to be 32 or greater.

          ->handle_irq_event()->handle_irq_event_percpu()

            ->action->handler()-----------------------------------------------------------The handler of the specific interrupt, i.e. the top half

              ->__irq_wake_thread()-----------------------------------------------------If the handler returns IRQ_WAKE_THREAD, wake the interrupt thread; the thread does not run immediately.

      ->irq_exit()---------------------------------------------------------------------------Leave hard-IRQ context. Softirqs are processed if needed.

        ->invoke_softirq()-----------------------------------------------------------------Process softirqs; work beyond certain limits is handed to the softirq thread.

    ->GIC_CPU_EOI--------------------------------------------------------------------Write the EOI register to signal the end of the interrupt. Only now does the GIC deliver new hardware interrupts; until this point they were held off.

  ->svc_exit-------------------------------------------------------------------------------Restore the interrupted context

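 From the analysis above:

  • The top half runs with hardware interrupts off; "off" here means the GIC does not deliver further interrupts for handling. Only after EOI is written does the GIC Distributor select another interrupt to handle.
  • Softirqs run in softirq context, but along the path above hardware interrupts are still off at that point. Take special note of this: softirq handlers must be fast and must not sleep.
  • Not all softirq work runs in softirq context; part of it may be handed to the ksoftirqd thread.
  • Wherever a thread is woken (IRQ_WAKE_THREAD, ksoftirqd, worker threads), the work itself is not done in interrupt context. Interrupt context only performs the wakeup; when the thread actually runs is up to the scheduler.
  • To improve the real-time behavior of Linux there are two key points: thread the top half, and hand all softirqs to the ksoftirqd thread.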
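
The "wake only, run later" point in the list above can be sketched with a tiny userspace model. Everything here is hypothetical (`struct demo_cpu`, `demo_hardirq()`, `demo_schedule()`); the scheduler is reduced to a function that runs the thread body once it has been marked runnable and the CPU has left hard-IRQ context.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_step { DEMO_HANDLER, DEMO_WAKE, DEMO_IRQ_EXIT, DEMO_THREAD_RUN };

/* Hypothetical per-CPU state plus a trace of what happened, in order. */
struct demo_cpu {
    enum demo_step trace[8];
    int n;
    bool in_hardirq;
    bool thread_runnable;
    int work_done;
};

static void demo_log(struct demo_cpu *c, enum demo_step s)
{
    assert(c->n < 8);
    c->trace[c->n++] = s;
}

/* Hard-IRQ part: the handler asks for the thread, nothing more. */
static void demo_hardirq(struct demo_cpu *c)
{
    c->in_hardirq = true;            /* models irq_enter() */
    demo_log(c, DEMO_HANDLER);       /* handler returns IRQ_WAKE_THREAD */
    c->thread_runnable = true;       /* wakeup only, no work done here */
    demo_log(c, DEMO_WAKE);
    c->in_hardirq = false;           /* models irq_exit() */
    demo_log(c, DEMO_IRQ_EXIT);
}

/* Scheduler stand-in: runs the thread body at some later point. */
static bool demo_schedule(struct demo_cpu *c)
{
    if (c->in_hardirq || !c->thread_runnable)
        return false;
    c->thread_runnable = false;
    c->work_done++;                  /* models the thread_fn body */
    demo_log(c, DEMO_THREAD_RUN);
    return true;
}
```

In this model the deferred work is still pending when `demo_hardirq()` returns, and it is only completed by a later `demo_schedule()` call.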

 

posted on 2018-05-06 23:00  ArnoldLu
