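Linux Driver Porting: the NAND Flash ONFI Standard and the MTD Subsystem [Repost]
Reposted from: https://www.cnblogs.com/zyly/p/16756273.html#_label0
Contents
- 1. The ONFI standard
- 2. MTD device drivers
- 3. MTD device registration
- 4. mtdblock.c
- 5. mtdchar.c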
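1. The ONFI standard
NAND flash is a common storage device in the embedded world. From an embedded developer's point of view it falls into two broad classes, Serial NAND and Raw NAND, and the two differ considerably.
Raw NAND is defined in contrast to Serial NAND: Serial NAND is NAND flash with a serial interface, for example one that speaks SPI, whereas Raw NAND is NAND flash with a parallel interface.
We start with ONFI because the NAND flash driver source analyzed later refers to it. The chip used here, the K9F2G08U0C, does not actually support ONFI, which becomes clear once its command set is compared with the commands defined by ONFI 1.0.
1.1 The ONFI standard
Early Raw NAND had no unified standard. Although Toshiba published the NAND flash structure as early as 1989, each vendor designed its Raw NAND chips freely, so package dimensions differed, storage organization varied widely, and interface commands were not interchangeable, all of which made the parts painful for customers to use.
To change this, in 2006 several major Raw NAND players (Hynix, Intel, Micron, Phison, Sony, ST) jointly set out to define a Raw NAND standard called the Open NAND Flash Interface, or ONFI. ONFI 1.0 was released in December 2006. Since then a large share of Raw NAND parts have been designed to ONFI, so that for the embedded designer compliant chips from different vendors look nearly identical, at least at the driver-code level.
ONFI website: http://www.onfi.org/, where the ONFI specification can be downloaded: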

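1.2 Raw NAND classification
1.2.1 Bits per cell
By the number of bits stored per cell, NAND flash is divided into:
- Single-Level Cell (SLC): stores 1 bit per cell. It offers the most reliable reads and writes and the longest endurance, roughly 90,000 to 100,000 program/erase cycles. Its endurance, accuracy and overall performance make it popular in the enterprise market, but high cost per bit and relatively small capacities make it less attractive for home use.
- Multi-Level Cell (MLC): stores 2 bits per cell instead of SLC's 1, which greatly reduces the cost of high-capacity flash. Endurance is roughly 3,000 to 10,000 program/erase cycles.
- Triple-Level Cell (TLC): stores 3 bits per cell and is a low-cost grade of flash. The higher density gives inexpensive high capacity, but endurance drops sharply to only about 500 to 1,000 program/erase cycles and read/write speed is worse, so it suits ordinary consumer use rather than industrial use.
- Quad-Level Cell (QLC): stores 4 bits per cell, a 33% density increase over TLC. QLC is claimed to withstand about 1,000 program/erase cycles (comparable to TLC or better) while offering higher capacity at lower cost.
Conclusion: in endurance and speed, SLC > MLC > TLC.
At the time of writing, most USB flash drives use TLC chips: they are cheap, but speed is average and lifetime is relatively short.
Among SSDs, MLC drives are the mainstream, with moderate prices and relatively good speed and endurance, while low-cost SSDs generally use TLC. The flash type can be checked in the product specifications when buying an SSD.
SLC SSDs appear mainly in high-end products, most of which sell for over a thousand yuan or considerably more.
Most smartphones also use TLC storage. Some iPhone 6 units use TLC chips and others use MLC. Overall, MLC is the current mainstream: it is moderate in speed, endurance and price, and is a reasonable recommendation.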
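1.2.2 Data bus width
The data bus is either x8 or x16.
1.2.3 Data sampling mode
Data is sampled in either SDR or DDR mode.
1.2.4 Interface command standard
The command set is either non-standard (vendor-specific) or ONFI.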
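1.3 Raw NAND memory model
ONFI organizes Raw NAND memory, from largest to smallest, into at most: Device, LUN (Die, Target), Plane, Block, Page, Cell.
- Device: a single packaged NAND flash chip. One Device contains one or more LUNs.
- LUN (Die, Target): the basic unit that receives and executes flash commands. One LUN contains one or more Planes. Strictly speaking, ONFI defines a Target as the set of LUNs selected by one CE# signal, so a Target may hold more than one LUN; this article treats the terms together.
- Plane: one Plane contains multiple Blocks.
- Block: the smallest unit that can be erased, usually made up of multiple Pages.
- Page: the smallest unit that can be programmed or read, typically 2 KB or similar.
- Cell: the smallest storage element within a Page, corresponding to one floating-gate transistor that stores one or more bits.
Page and Block are always present, because the Page is the smallest read/write unit and the Block is the smallest erase unit. LUN and Plane are not always distinguished (when absent, treat them as LUN = 1 and Plane = 1); they usually appear only on high-capacity Raw NAND (roughly 8 Gb and above).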
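The hierarchy above determines the capacity arithmetic of a chip. The following is a minimal sketch, not taken from the kernel: `struct nand_geometry` and its helper functions are hypothetical names introduced for illustration, and the numbers in the usage note are assumed values for a generic 2 Gbit part.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry descriptor; the field names are illustrative only. */
struct nand_geometry {
    uint32_t page_size;        /* main-area bytes per page */
    uint32_t oob_size;         /* spare (OOB) bytes per page */
    uint32_t pages_per_block;
    uint32_t blocks_per_plane;
    uint32_t planes_per_lun;   /* 1 when the part has no separate planes */
    uint32_t luns_per_device;  /* 1 when the part has a single LUN */
};

/* Erase unit in bytes (main area only, OOB excluded). */
static uint64_t nand_block_bytes(const struct nand_geometry *g)
{
    assert(g->page_size != 0 && g->pages_per_block != 0);
    return (uint64_t)g->page_size * g->pages_per_block;
}

/* Total number of pages in the whole device. */
static uint64_t nand_total_pages(const struct nand_geometry *g)
{
    return (uint64_t)g->pages_per_block * g->blocks_per_plane *
           g->planes_per_lun * g->luns_per_device;
}

/* Usable capacity in bytes, OOB excluded. */
static uint64_t nand_device_bytes(const struct nand_geometry *g)
{
    return nand_total_pages(g) * g->page_size;
}
```

With an assumed geometry of 2048-byte pages, 64 pages per block, 2048 blocks, one plane and one LUN, `nand_block_bytes()` gives 128 KiB and `nand_device_bytes()` gives 256 MiB, which is 2 Gbit.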
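A common NAND flash has a single chip (LUN) with a single plane. More complex, higher-capacity parts contain several chips, each with several planes; such a part is essentially several flash dies stacked together behind an extra controller, as in the figure below:

Note: as I understand it, "chip" here means the LUN described above. Any given NAND flash model can itself be called a chip, but here the term refers to what is inside the package, that is, how many chips a given model contains. For example:
- Samsung's 2 GB K9WAG08U1A (the external chip, or model) contains two 1 GB K9K8G08U0A dies, so the K9WAG08U1A is said to have 2 chips inside;
- a single chip may in turn contain multiple planes; the K9K8G08U0A above, for example, contains four 2 Gb planes.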
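1.4 Raw NAND signals and packages
ONFI specifies the Raw NAND signals and packages. The following is the internal block diagram of a typical x8 Raw NAND:

Besides the memory array there are two other major blocks, the I/O control unit and the control logic unit, and the signal lines attach mainly to these two. An x8 Raw NAND has 15 main signal lines, of which 13 are required; CE# and R/B# may be left unused.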
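| Pin | Description |
| --- | --- |
| CLE | Command latch enable: while CLE is high, the rising edge of WE# latches the I/O lines into the command register |
| ALE | Address latch enable: while ALE is high, the rising edge of WE# latches the I/O lines into the address register |
| CE# | Chip enable, active low |
| RE# | Read enable, active low |
| WE# | Write enable: the rising edge of WE# latches the I/O lines into the command, address, or data register |
| WP# | Write protect |
| R/B# | Ready/busy output (low means an operation is still in progress, high means it has completed) |
| VCC | Power supply |
| VSS | Ground |
| NC | No connect |
| I/O0 ~ I/O7 | Data input/output (commands, addresses and data share this bus) |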
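ONFI defines many packages, such as TSOP48, LGA52, and BGA63/100/132/152/272/316. For embedded development the most common is the flat TSOP-48 package shown below. It is typically used for smaller Raw NAND (1/2/4/8/16/32 Gb); 1 to 32 Gb is usually enough for an embedded design, and TSOP-48 is easy to lay out on a PCB, hence its popularity.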

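1.5 Raw NAND interface commands
ONFI 1.0 defines the Raw NAND command set shown in the table below; some commands are mandatory (M) and others optional (O). Among the mandatory commands the three most used are Read (Read Page), Page Program and Block Erase, which cover the basic read, write and erase operations.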

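Also important are:
- Read Status: returns the status and result of command execution.
- Read Parameter Page: returns the factory information stored in the chip (memory organization, features, timing, and other behavioral parameters). Its layout is defined by ONFI as shown in the table below, so a NAND driver can read the Parameter Page and adapt itself to the chip, which keeps the code generic.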
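Before a driver trusts a Parameter Page it normally checks the page's integrity. The sketch below is one way to do that in plain C, assuming the layout from the ONFI specification: a 256-byte page that starts with the ASCII signature "ONFI" and ends with a little-endian CRC-16 (polynomial 0x8005, initial value 0x4F4E) computed over bytes 0 to 253. The function names are illustrative, and `onfi_param_page_seal()` is a hypothetical helper that exists only to build test data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONFI_PARAM_PAGE_LEN 256u
#define ONFI_CRC_INIT       0x4F4Eu
#define ONFI_CRC_POLY       0x8005u

/* Bitwise CRC-16, MSB first, no reflection, no final XOR. */
static uint16_t onfi_crc16(uint16_t crc, const uint8_t *p, size_t len)
{
    while (len--) {
        crc ^= (uint16_t)(*p++ << 8);
        for (int i = 0; i < 8; i++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ ONFI_CRC_POLY);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Returns 1 if the signature and the CRC in bytes 254..255 both match. */
static int onfi_param_page_valid(const uint8_t page[ONFI_PARAM_PAGE_LEN])
{
    uint16_t stored;

    if (memcmp(page, "ONFI", 4) != 0)
        return 0;
    stored = (uint16_t)(page[254] | (page[255] << 8));
    return onfi_crc16(ONFI_CRC_INIT, page, 254) == stored;
}

/* Hypothetical helper for tests: writes the signature and a matching CRC. */
static void onfi_param_page_seal(uint8_t page[ONFI_PARAM_PAGE_LEN])
{
    uint16_t crc;

    memcpy(page, "ONFI", 4);
    crc = onfi_crc16(ONFI_CRC_INIT, page, 254);
    page[254] = (uint8_t)(crc & 0xFFu);
    page[255] = (uint8_t)(crc >> 8);
}
```

Chips typically store several redundant copies of the Parameter Page, so a driver can move on to the next copy when this check fails on the first one.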


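2. MTD device drivers
MTD (Memory Technology Device) is the Linux subsystem for accessing memory devices such as ROM and flash. Its main purpose is to make drivers for new memory devices simpler to write, which it does by providing an abstract interface between the hardware and the upper layers.
2.1 MTD subsystem overview
Before introducing MTD, consider a question: why did the Linux kernel abstract out an MTD subsystem at all?
Recall the steps for writing a block device driver from the previous part of this series:
- call register_blkdev to register a block device major number;
- call alloc_disk to allocate a generic disk object, gendisk;
- call blk_mq_init_sq_queue to initialize a request queue;
- set the gendisk members major, first_minor, disk_name and fops;
- set the gendisk member queue to the request queue initialized earlier;
- set any remaining gendisk members;
- call add_disk to register the gendisk.
If we wrote a block device driver for every flash model, we would repeat these steps each time. So what actually differs between flash models? Taking NAND flash as an example, mainly the memory organization (page size, block size, pages per block, OOB size and so on) and small differences in timing parameters. Could the parts tightly bound to a particular NAND flash be split out and supplied by a NAND flash driver layer, with the common parts factored out separately? That is exactly what the MTD subsystem does.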
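2.2 MTD subsystem framework
As shown in the figure above, the MTD framework can be divided into four layers, from top to bottom: device nodes, the MTD device layer, the MTD raw device layer, and the flash driver layer.
- Device nodes: MTD block device nodes (major number 31) and MTD character device nodes (major number 90) are created under /dev with mknod. Accessing these nodes accesses the MTD character and block devices.
- MTD device layer: on top of MTD raw devices, Linux defines MTD block devices (major number 31) and character devices (major number 90):
  - mtdchar.c: implements the MTD character device interface;
  - mtdblock.c: implements the MTD block device interface. It handles device creation, data reads and writes, and optimizations. It resembles a conventional block device driver, but requesting the block major number, allocating and setting up the gendisk, and initializing the queue are all done by the kernel automatically.
- MTD raw device layer: the data structure that describes an MTD raw device is mtd_info, which defines a large set of MTD data fields and operations:
  - mtdcore.c: implements the MTD raw device interface;
  - mtdpart.c: implements the MTD partition interface.
- Flash driver layer: responsible for reading, writing and erasing the flash hardware. NAND flash and NOR flash have different protocols and hardware details. This layer knows what to send (which commands identify, read, write or erase the chip) and how the hardware sends it. Each protocol has its own functions, and the operating environment is built from the corresponding structures and functions. The driver author only has to allocate, set up and register the flash driver layer structures and establish the mapping from the concrete device to an MTD raw device.
  - NAND flash chip drivers live under drivers/mtd/nand/ and use the nand_chip structure;
  - NOR flash chip drivers live under drivers/mtd/chips/ and use the map_info structure.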
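2.2.1 Flash driver layer
(1) NOR flash drivers
The Linux kernel provides generic NOR flash drivers for interface standards such as CFI and JEDEC. On top of these, a chip-level driver is simple: define a map_info structure describing the memory mapping, then call do_map_probe with the interface type.
Taking scb2_flash.c (in drivers/mtd/maps/) as an example:
- define a map_info structure and initialize its members name, size, phys and bankwidth;
- set the member virt (the virtual address) with ioremap;
- call simple_map_init to initialize the map_info function members read, write, copy_from and copy_to;
- call do_map_probe to probe for a CFI interface; it returns an mtd_info structure;
- register the MTD raw device through parse_mtd_partitions and add_mtd_partitions.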
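(2) NAND flash drivers
The Linux kernel provides a generic NAND flash driver (drivers/mtd/nand/raw/nand_base.c); a chip-level driver has to fill in a nand_chip structure.
MTD uses nand_chip to represent a NAND flash chip. The structure holds the chip's memory organization, read/write methods, ECC mode, hardware control hooks, and other low-level mechanisms.
Taking s3c2410.c (in drivers/mtd/nand/raw) as an example:
- allocate memory for the nand_chip;
- initialize the nand_chip members according to the SoC NAND controller, for example chip->legacy (members write_buf, read_buf, select_chip, cmd_ctrl, dev_ready, IO_ADDR_R, IO_ADDR_W) and chip->controller;
- set chip->priv to the mtd_info;
- call nand_scan() to probe the NAND flash. The original text passes mtd_info here; in kernels that have struct nand_legacy, as shown later in this article, nand_scan() takes the nand_chip pointer instead. nand_scan() reads the NAND chip ID and then:
  - initializes chip->base.mtd (members writesize, oobsize, erasesize, etc.);
  - initializes chip->base.memorg (members bits_per_cell, pagesize, oobsize, pages_per_eraseblock, planes_per_lun, luns_per_target, ntargets, etc.);
  - initializes chip->options and chip->base.eccreq;
  - initializes the members of chip->ecc (the ECC mode and its handler functions);
  - fills every function pointer in chip that is still unset with the default from nand_base.c;
- call mtd_device_register() with the mtd_info and the mtd_partition array to register the MTD device.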
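2.3 Core structures
2.3.1 struct mtd_info
The Linux kernel represents an MTD raw device with the mtd_info structure, which describes either a whole device or one partition of a partitioned device and defines a large set of MTD data fields and operations. Older kernels kept all mtd_info structures in an array named mtd_table; the add_mtd_device code analyzed later in this article tracks them in the IDR mtd_idr instead.
mtd_info is defined in include/linux/mtd/mtd.h: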
struct mtd_info {
u_char type; // MTD device type, e.g. MTD_NORFLASH, MTD_NANDFLASH
uint32_t flags; // flags: MTD_WRITEABLE, MTD_NO_ERASE, etc.
uint32_t orig_flags; /* Flags as before running mtd checks */
uint64_t size; // Total size of the MTD
/* "Major" erase size for the device. Naïve users may take this
* to be the only erase size available, or may use the more detailed
* information below if they desire
*/
uint32_t erasesize; // erase unit size; for NAND flash this is the block size
/* Minimal writable flash unit size. In case of NOR flash it is 1 (even
* though individual bits can be cleared), in case of NAND flash it is
* one NAND page (or half, or one-fourths of it), in case of ECC-ed NOR
* it is of ECC block size, etc. It is illegal to have writesize = 0.
* Any driver registering a struct mtd_info must ensure a writesize of
* 1 or larger.
*/
uint32_t writesize; // minimal writable unit in bytes: 1 for NOR flash, one page for NAND flash
/*
* Size of the write buffer used by the MTD. MTD devices having a write
* buffer can write multiple writesize chunks at a time. E.g. while
* writing 4 * writesize bytes to a device with 2 * writesize bytes
* buffer the MTD driver can (but doesn't have to) do 2 writesize
* operations, but not 4. Currently, all NANDs have writebufsize
* equivalent to writesize (NAND page size). Some NOR flashes do have
* writebufsize greater than writesize.
*/
uint32_t writebufsize;
uint32_t oobsize; // Amount of OOB data per block (e.g. 16)
uint32_t oobavail; // Available OOB bytes per block
/*
* If erasesize is a power of 2 then the shift is stored in
* erasesize_shift otherwise erasesize_shift is zero. Ditto writesize.
*/
unsigned int erasesize_shift; // log2(erasesize) when erasesize is a power of 2, else 0
unsigned int writesize_shift; // log2(writesize) when writesize is a power of 2, else 0
/* Masks based on erasesize_shift and writesize_shift */
unsigned int erasesize_mask; // (1 << erasesize_shift) - 1
unsigned int writesize_mask; // (1 << writesize_shift) - 1
/*
* read ops return -EUCLEAN if max number of bitflips corrected on any
* one region comprising an ecc step equals or exceeds this value.
* Settable by driver, else defaults to ecc_strength. User can override
* in sysfs. N.B. The meaning of the -EUCLEAN return code has changed;
* see Documentation/ABI/testing/sysfs-class-mtd for more detail.
*/
unsigned int bitflip_threshold;
/* Kernel-only stuff starts here. */
const char *name; // MTD device name
int index; // device index
/* OOB layout description */
const struct mtd_ooblayout_ops *ooblayout; // OOB layout description
/* NAND pairing scheme, only provided for MLC/TLC NANDs */
const struct mtd_pairing_scheme *pairing;
/* the ecc step size. */
unsigned int ecc_step_size;
/* max number of correctible bit errors per ecc step */
unsigned int ecc_strength;
/* Data for variable erase regions. If numeraseregions is zero,
* it means that the whole device has erasesize as given above.
*/
int numeraseregions; // number of variable erase regions; 0 when the whole device uses erasesize
struct mtd_erase_region_info *eraseregions; // the variable erase regions
/*
* Do not call via these pointers, use corresponding mtd_*()
* wrappers instead.
*/
int (*_erase) (struct mtd_info *mtd, struct erase_info *instr); // erase
int (*_point) (struct mtd_info *mtd, loff_t from, size_t len,
size_t *retlen, void **virt, resource_size_t *phys);
int (*_unpoint) (struct mtd_info *mtd, loff_t from, size_t len);
int (*_read) (struct mtd_info *mtd, loff_t from, size_t len, // read
size_t *retlen, u_char *buf);
int (*_write) (struct mtd_info *mtd, loff_t to, size_t len, // write
size_t *retlen, const u_char *buf);
int (*_panic_write) (struct mtd_info *mtd, loff_t to, size_t len,
size_t *retlen, const u_char *buf);
int (*_read_oob) (struct mtd_info *mtd, loff_t from,
struct mtd_oob_ops *ops);
int (*_write_oob) (struct mtd_info *mtd, loff_t to,
struct mtd_oob_ops *ops);
int (*_get_fact_prot_info) (struct mtd_info *mtd, size_t len,
size_t *retlen, struct otp_info *buf);
int (*_read_fact_prot_reg) (struct mtd_info *mtd, loff_t from,
size_t len, size_t *retlen, u_char *buf);
int (*_get_user_prot_info) (struct mtd_info *mtd, size_t len,
size_t *retlen, struct otp_info *buf);
int (*_read_user_prot_reg) (struct mtd_info *mtd, loff_t from,
size_t len, size_t *retlen, u_char *buf);
int (*_write_user_prot_reg) (struct mtd_info *mtd, loff_t to,
size_t len, size_t *retlen, u_char *buf);
int (*_lock_user_prot_reg) (struct mtd_info *mtd, loff_t from,
size_t len);
int (*_writev) (struct mtd_info *mtd, const struct kvec *vecs,
unsigned long count, loff_t to, size_t *retlen);
void (*_sync) (struct mtd_info *mtd);
int (*_lock) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
int (*_unlock) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
int (*_is_locked) (struct mtd_info *mtd, loff_t ofs, uint64_t len);
int (*_block_isreserved) (struct mtd_info *mtd, loff_t ofs);
int (*_block_isbad) (struct mtd_info *mtd, loff_t ofs);
int (*_block_markbad) (struct mtd_info *mtd, loff_t ofs);
int (*_max_bad_blocks) (struct mtd_info *mtd, loff_t ofs, size_t len);
int (*_suspend) (struct mtd_info *mtd);
void (*_resume) (struct mtd_info *mtd);
void (*_reboot) (struct mtd_info *mtd);
/*
* If the driver is something smart, like UBI, it may need to maintain
* its own reference counting. The below functions are only for driver.
*/
int (*_get_device) (struct mtd_info *mtd);
void (*_put_device) (struct mtd_info *mtd);
struct notifier_block reboot_notifier; /* default mode before reboot */
/* ECC status information */
struct mtd_ecc_stats ecc_stats;
/* Subpage shift (NAND) */
int subpage_sft;
void *priv;
struct module *owner;
struct device dev;
int usecount;
struct mtd_debug_info dbg;
struct nvmem_device *nvmem;
};
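The _read(), _write(), _read_oob(), _write_oob() and _erase() members of mtd_info are the main functions an MTD device driver has to provide; they form the interface between the MTD raw device layer and the flash driver layer. Linux already ships a set of mtd_info member functions that suits most flash devices.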
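The erasesize_shift and erasesize_mask fields in the structure exist so that hot paths can replace division by the erase size with a shift, falling back to real division when the erase size is not a power of two. The following userspace sketch shows the idea; `struct mtd_geom` and the `mtd_geom_*` functions are hypothetical stand-ins written for this example, similar in spirit to kernel helpers such as mtd_div_by_eb(), and are not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the derived geometry fields. */
struct mtd_geom {
    uint32_t erasesize;
    uint32_t writesize;
    unsigned int erasesize_shift;
    unsigned int writesize_shift;
    unsigned int erasesize_mask;
    unsigned int writesize_mask;
};

/* log2(v) when v is a power of two, otherwise 0. */
static unsigned int shift_of(uint32_t v)
{
    unsigned int s = 0;

    if (v == 0 || (v & (v - 1)) != 0)
        return 0;
    while ((v >> s) != 1u)
        s++;
    return s;
}

static void mtd_geom_init(struct mtd_geom *g, uint32_t erasesize,
                          uint32_t writesize)
{
    /* This sketch only models erasable devices with a valid write size. */
    assert(erasesize != 0 && writesize != 0);
    g->erasesize = erasesize;
    g->writesize = writesize;
    g->erasesize_shift = shift_of(erasesize);
    g->writesize_shift = shift_of(writesize);
    g->erasesize_mask = (1u << g->erasesize_shift) - 1;
    g->writesize_mask = (1u << g->writesize_shift) - 1;
}

/* Index of the eraseblock that contains byte offset offs. */
static uint64_t mtd_geom_eb_index(const struct mtd_geom *g, uint64_t offs)
{
    if (g->erasesize_shift)
        return offs >> g->erasesize_shift;
    return offs / g->erasesize;
}

/* Byte offset of offs inside its eraseblock. */
static uint32_t mtd_geom_eb_offset(const struct mtd_geom *g, uint64_t offs)
{
    if (g->erasesize_shift)
        return (uint32_t)(offs & g->erasesize_mask);
    return (uint32_t)(offs % g->erasesize);
}
```

For a 128 KiB eraseblock and a 2 KiB page this yields shifts of 17 and 11; for an erase size that is not a power of two the shift stays 0 and the helpers fall back to division.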
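2.3.2 mtd_part
MTD represents a partition with mtd_part, which embeds an mtd_info. Each partition is treated as an MTD raw device in its own right and is registered like one. Most of the data in the embedded mtd_info is taken from the partition's parent, mtd_part->parent. The parent (master) itself is not registered as an MTD raw device.
mtd_part is defined in drivers/mtd/mtdpart.c: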
/**
* struct mtd_part - our partition node structure
*
* @mtd: struct holding partition details
* @parent: parent mtd - flash device or another partition
* @offset: partition offset relative to the *flash device*
*/
struct mtd_part {
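struct mtd_info mtd; // partition information
struct mtd_info *parent; // the partition's parent: the flash device or another partition
uint64_t offset; // partition offset
struct list_head list; // list node linking all mtd_part structures together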
};
2.3.3 struct mtd_partition
MTD describes the layout of a partition with mtd_partition, defined in include/linux/mtd/partitions.h:
/*
* Partition definition structure:
*
* An array of struct partition is passed along with a MTD object to
* mtd_device_register() to create them.
*
* For each partition, these fields are available:
* name: string that will be used to label the partition's MTD device.
* types: some partitions can be containers using specific format to describe
* embedded subpartitions / volumes. E.g. many home routers use "firmware"
* partition that contains at least kernel and rootfs. In such case an
* extra parser is needed that will detect these dynamic partitions and
* report them to the MTD subsystem. If set this property stores an array
* of parser names to use when looking for subpartitions.
* size: the partition size; if defined as MTDPART_SIZ_FULL, the partition
* will extend to the end of the master MTD device.
* offset: absolute starting position within the master MTD device; if
* defined as MTDPART_OFS_APPEND, the partition will start where the
* previous one ended; if MTDPART_OFS_NXTBLK, at the next erase block;
* if MTDPART_OFS_RETAIN, consume as much as possible, leaving size
* after the end of partition.
* mask_flags: contains flags that have to be masked (removed) from the
* master MTD flag set for the corresponding MTD partition.
* For example, to force a read-only partition, simply adding
* MTD_WRITEABLE to the mask_flags will do the trick.
*
* Note: writeable partitions require their size and offset be
* erasesize aligned (e.g. use MTDPART_OFS_NEXTBLK).
*/
struct mtd_partition {
const char *name; /* identifier string (partition name) */
const char *const *types; /* names of parsers to use if any */
uint64_t size; /* partition size */
uint64_t offset; /* offset within the master MTD space */
uint32_t mask_flags; /* master MTD flags to mask out for this partition */
struct device_node *of_node;
};
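The special size and offset values in the comment above are resolved against the end of the previous partition when partitions are added. The sketch below models only that arithmetic in userspace. Everything prefixed `sk_`/`SK_` is hypothetical: the constants mirror the values of MTDPART_SIZ_FULL (0), MTDPART_OFS_APPEND (-1) and MTDPART_OFS_NXTBLK (-2), MTDPART_OFS_RETAIN is left out, and unlike the kernel (which may truncate or skip a partition) this sketch simply rejects a layout that does not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_SIZ_FULL   0ULL             /* models MTDPART_SIZ_FULL */
#define SK_OFS_APPEND ((uint64_t)-1)   /* models MTDPART_OFS_APPEND */
#define SK_OFS_NXTBLK ((uint64_t)-2)   /* models MTDPART_OFS_NXTBLK */

/* Hypothetical, reduced stand-in for struct mtd_partition. */
struct sk_partition {
    const char *name;
    uint64_t size;
    uint64_t offset;
};

/*
 * Resolve the special size/offset values in place.
 * Returns 0 on success, -1 if a partition does not fit in the device.
 */
static int sk_resolve(struct sk_partition *p, size_t n,
                      uint64_t dev_size, uint32_t erasesize)
{
    uint64_t cur = 0;   /* end of the previous partition */

    assert(erasesize != 0);
    for (size_t i = 0; i < n; i++) {
        if (p[i].offset == SK_OFS_APPEND) {
            p[i].offset = cur;
        } else if (p[i].offset == SK_OFS_NXTBLK) {
            uint64_t rem = cur % erasesize;
            /* round up to the next eraseblock boundary */
            p[i].offset = rem ? cur + (erasesize - rem) : cur;
        }
        if (p[i].offset >= dev_size)
            return -1;
        if (p[i].size == SK_SIZ_FULL)
            p[i].size = dev_size - p[i].offset;
        if (p[i].size > dev_size - p[i].offset)
            return -1;
        cur = p[i].offset + p[i].size;
    }
    return 0;
}
```

A table such as `{"boot", 0x40000, 0}`, `{"params", 0x20000, SK_OFS_APPEND}`, `{"kernel", 0x400000, SK_OFS_APPEND}`, `{"rootfs", SK_SIZ_FULL, SK_OFS_APPEND}` on an assumed 256 MiB device resolves to offsets 0, 0x40000, 0x60000 and 0x460000, with rootfs taking the rest of the device.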
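2.3.4 struct nand_chip
nand_chip is one of the more important data structures. MTD uses it to represent a chip inside a NAND flash device; it holds the chip's memory organization, read/write methods, ECC mode, hardware control hooks, and other low-level mechanisms. It is defined in include/linux/mtd/rawnand.h: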
/**
* struct nand_chip - NAND Private Flash Chip Data
* @base: Inherit from the generic NAND device
* @legacy: All legacy fields/hooks. If you develop a new driver,
* don't even try to use any of these fields/hooks, and if
* you're modifying an existing driver that is using those
* fields/hooks, you should consider reworking the driver
* avoid using them.
* @setup_read_retry: [FLASHSPECIFIC] flash (vendor) specific function for
* setting the read-retry mode. Mostly needed for MLC NAND.
* @ecc: [BOARDSPECIFIC] ECC control structure
* @buf_align: minimum buffer alignment required by a platform
* @oob_poi: "poison value buffer," used for laying out OOB data
* before writing
* @page_shift: [INTERN] number of address bits in a page (column
* address bits).
* @phys_erase_shift: [INTERN] number of address bits in a physical eraseblock
* @bbt_erase_shift: [INTERN] number of address bits in a bbt entry
* @chip_shift: [INTERN] number of address bits in one chip
* @options: [BOARDSPECIFIC] various chip options. They can partly
* be set to inform nand_scan about special functionality.
* See the defines for further explanation.
* @bbt_options: [INTERN] bad block specific options. All options used
* here must come from bbm.h. By default, these options
* will be copied to the appropriate nand_bbt_descr's.
* @badblockpos: [INTERN] position of the bad block marker in the oob
* area.
* @badblockbits: [INTERN] minimum number of set bits in a good block's
* bad block marker position; i.e., BBM == 11110111b is
* not bad when badblockbits == 7
* @onfi_timing_mode_default: [INTERN] default ONFI timing mode. This field is
* set to the actually used ONFI mode if the chip is
* ONFI compliant or deduced from the datasheet if
* the NAND chip is not ONFI compliant.
* @pagemask: [INTERN] page number mask = number of (pages / chip) - 1
* @data_buf: [INTERN] buffer for data, size is (page size + oobsize).
* @pagecache: Structure containing page cache related fields
* @pagecache.bitflips: Number of bitflips of the cached page
* @pagecache.page: Page number currently in the cache. -1 means no page is
* currently cached
* @subpagesize: [INTERN] holds the subpagesize
* @id: [INTERN] holds NAND ID
* @parameters: [INTERN] holds generic parameters under an easily
* readable form.
* @data_interface: [INTERN] NAND interface timing information
* @cur_cs: currently selected target. -1 means no target selected,
* otherwise we should always have cur_cs >= 0 &&
* cur_cs < nanddev_ntargets(). NAND Controller drivers
* should not modify this value, but they're allowed to
* read it.
* @read_retries: [INTERN] the number of read retry modes supported
* @lock: lock protecting the suspended field. Also used to
* serialize accesses to the NAND device.
* @suspended: set to 1 when the device is suspended, 0 when it's not.
* @bbt: [INTERN] bad block table pointer
* @bbt_td: [REPLACEABLE] bad block table descriptor for flash
* lookup.
* @bbt_md: [REPLACEABLE] bad block table mirror descriptor
* @badblock_pattern: [REPLACEABLE] bad block scan pattern used for initial
* bad block scan.
* @controller: [REPLACEABLE] a pointer to a hardware controller
* structure which is shared among multiple independent
* devices.
* @priv: [OPTIONAL] pointer to private chip data
* @manufacturer: [INTERN] Contains manufacturer information
* @manufacturer.desc: [INTERN] Contains manufacturer's description
* @manufacturer.priv: [INTERN] Contains manufacturer private information
*/
struct nand_chip {
struct nand_device base; // generic NAND device that nand_chip inherits from; it embeds the mtd_info
struct nand_legacy legacy; // legacy hardware access hooks
int (*setup_read_retry)(struct nand_chip *chip, int retry_mode);
unsigned int options; // chip-specific options, e.g. NAND_BUSWIDTH_16
unsigned int bbt_options;
int page_shift; // log2 of the page size; 9 for a chip with 512-byte pages
int phys_erase_shift; // log2 of the erase unit (usually one block); 14 for a chip that erases 16 KiB at a time
int bbt_erase_shift; // log2 of a bad block table entry's size; the BBT usually covers one block, so this normally equals phys_erase_shift
int chip_shift; // log2 of the chip capacity
int pagemask; // page mask: (chip capacity / page size) - 1
u8 *data_buf;
struct {
unsigned int bitflips;
int page;
} pagecache;
int subpagesize;
int onfi_timing_mode_default;
unsigned int badblockpos;
int badblockbits;
struct nand_id id; // ID bytes read from the NAND chip, including the manufacturer ID and device ID
struct nand_parameters parameters;
struct nand_data_interface data_interface;
int cur_cs; // currently selected target
int read_retries;
struct mutex lock;
unsigned int suspended : 1;
uint8_t *oob_poi;
struct nand_controller *controller; // NAND controller
struct nand_ecc_ctrl ecc; // ECC control structure; holds the ECC-related functions
unsigned long buf_align;
uint8_t *bbt;
struct nand_bbt_descr *bbt_td;
struct nand_bbt_descr *bbt_md;
struct nand_bbt_descr *badblock_pattern;
void *priv;
struct {
const struct nand_manufacturer *desc;
void *priv;
} manufacturer; // manufacturer information
};
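The ecc member of nand_chip handles ECC-related operations through functions such as read_page_raw and write_page_raw; it contains a large number of functions that perform ECC checking.
The read/write functions in the legacy member of nand_chip, such as read_buf and cmdfunc, depend on the specific NAND controller. They talk to the hardware directly, so we usually have to implement them ourselves for the SoC's NAND controller.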
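The shift and mask fields of nand_chip are what turn a byte offset into a chip number and a page number. Below is a small userspace sketch of that arithmetic, assuming all sizes are powers of two; `struct sk_chip_addr` and the `sk_*` functions are hypothetical names made up for this example rather than kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the address-related nand_chip fields. */
struct sk_chip_addr {
    int page_shift;        /* log2(page size) */
    int phys_erase_shift;  /* log2(eraseblock size) */
    int chip_shift;        /* log2(size of one chip) */
    int pagemask;          /* pages per chip - 1 */
};

/* log2 of a power-of-two value. */
static int sk_log2_pow2(uint64_t v)
{
    int s = 0;

    assert(v != 0 && (v & (v - 1)) == 0);
    while ((v >> s) != 1)
        s++;
    return s;
}

static void sk_chip_addr_init(struct sk_chip_addr *a, uint32_t page_size,
                              uint32_t block_size, uint64_t chip_size)
{
    a->page_shift = sk_log2_pow2(page_size);
    a->phys_erase_shift = sk_log2_pow2(block_size);
    a->chip_shift = sk_log2_pow2(chip_size);
    a->pagemask = (int)(chip_size >> a->page_shift) - 1;
}

/* Which chip (die) a byte offset falls into. */
static int sk_chip_index(const struct sk_chip_addr *a, uint64_t offs)
{
    return (int)(offs >> a->chip_shift);
}

/* Page number relative to the start of that chip. */
static int sk_page_in_chip(const struct sk_chip_addr *a, uint64_t offs)
{
    return (int)(offs >> a->page_shift) & a->pagemask;
}
```

With assumed values of 2 KiB pages, 128 KiB blocks and a 256 MiB chip, the shifts are 11, 17 and 28 and the page mask is 0x1FFFF; an offset of 256 MiB + 3 pages + 100 bytes then maps to chip 1, page 3.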
2.3.5 struct nand_legacy
nand_legacy holds the functions that are tied to the SoC NAND controller hardware:
/**
* struct nand_legacy - NAND chip legacy fields/hooks
* @IO_ADDR_R: address to read the 8 I/O lines of the flash device
* @IO_ADDR_W: address to write the 8 I/O lines of the flash device
* @select_chip: select/deselect a specific target/die
* @read_byte: read one byte from the chip
* @write_byte: write a single byte to the chip on the low 8 I/O lines
* @write_buf: write data from the buffer to the chip
* @read_buf: read data from the chip into the buffer
* @cmd_ctrl: hardware specific function for controlling ALE/CLE/nCE. Also used
* to write command and address
* @cmdfunc: hardware specific function for writing commands to the chip.
* @dev_ready: hardware specific function for accessing device ready/busy line.
* If set to NULL no access to ready/busy is available and the
* ready/busy information is read from the chip status register.
* @waitfunc: hardware specific function for wait on ready.
* @block_bad: check if a block is bad, using OOB markers
* @block_markbad: mark a block bad
* @set_features: set the NAND chip features
* @get_features: get the NAND chip features
* @chip_delay: chip dependent delay for transferring data from array to read
* regs (tR).
* @dummy_controller: dummy controller implementation for drivers that can
* only control a single chip
*
* If you look at this structure you're already wrong. These fields/hooks are
* all deprecated.
*/
struct nand_legacy {
void __iomem *IO_ADDR_R; // address for reading the 8 I/O lines; on the S3C2440, for example, the data register NFDATA
void __iomem *IO_ADDR_W; // address for writing the 8 I/O lines; on the S3C2440, for example, the data register NFDATA
void (*select_chip)(struct nand_chip *chip, int cs); // select/deselect the chip
u8 (*read_byte)(struct nand_chip *chip); // read one byte
void (*write_byte)(struct nand_chip *chip, u8 byte); // write one byte
void (*write_buf)(struct nand_chip *chip, const u8 *buf, int len); // write len bytes
void (*read_buf)(struct nand_chip *chip, u8 *buf, int len); // read len bytes
void (*cmd_ctrl)(struct nand_chip *chip, int dat, unsigned int ctrl); // hardware control function; writes a command or an address
void (*cmdfunc)(struct nand_chip *chip, unsigned command, int column, // send a command together with the column and page address
int page_addr);
int (*dev_ready)(struct nand_chip *chip); // get the NAND state: busy or ready
int (*waitfunc)(struct nand_chip *chip); // wait until the NAND is ready
int (*block_bad)(struct nand_chip *chip, loff_t ofs); // check whether a block is bad
int (*block_markbad)(struct nand_chip *chip, loff_t ofs); // mark a block bad
int (*set_features)(struct nand_chip *chip, int feature_addr,
u8 *subfeature_para);
int (*get_features)(struct nand_chip *chip, int feature_addr,
u8 *subfeature_para);
int chip_delay; // chip-dependent delay
struct nand_controller dummy_controller;
};
2.3.6 struct nand_ecc_ctrl
The read/write functions in nand_ecc_ctrl, such as read_page_raw and write_page_raw, are mainly used for ECC-related operations:
/**
* struct nand_ecc_ctrl - Control structure for ECC
* @mode: ECC mode
* @algo: ECC algorithm
* @steps: number of ECC steps per page
* @size: data bytes per ECC step
* @bytes: ECC bytes per step
* @strength: max number of correctible bits per ECC step
* @total: total number of ECC bytes per page
* @prepad: padding information for syndrome based ECC generators
* @postpad: padding information for syndrome based ECC generators
* @options: ECC specific options (see NAND_ECC_XXX flags defined above)
* @priv: pointer to private ECC control data
* @calc_buf: buffer for calculated ECC, size is oobsize.
* @code_buf: buffer for ECC read from flash, size is oobsize.
* @hwctl: function to control hardware ECC generator. Must only
* be provided if an hardware ECC is available
* @calculate: function for ECC calculation or readback from ECC hardware
* @correct: function for ECC correction, matching to ECC generator (sw/hw).
* Should return a positive number representing the number of
* corrected bitflips, -EBADMSG if the number of bitflips exceed
* ECC strength, or any other error code if the error is not
* directly related to correction.
* If -EBADMSG is returned the input buffers should be left
* untouched.
* @read_page_raw: function to read a raw page without ECC. This function
* should hide the specific layout used by the ECC
* controller and always return contiguous in-band and
* out-of-band data even if they're not stored
* contiguously on the NAND chip (e.g.
* NAND_ECC_HW_SYNDROME interleaves in-band and
* out-of-band data).
* @write_page_raw: function to write a raw page without ECC. This function
* should hide the specific layout used by the ECC
* controller and consider the passed data as contiguous
* in-band and out-of-band data. ECC controller is
* responsible for doing the appropriate transformations
* to adapt to its specific layout (e.g.
* NAND_ECC_HW_SYNDROME interleaves in-band and
* out-of-band data).
* @read_page: function to read a page according to the ECC generator
* requirements; returns maximum number of bitflips corrected in
* any single ECC step, -EIO hw error
* @read_subpage: function to read parts of the page covered by ECC;
* returns same as read_page()
* @write_subpage: function to write parts of the page covered by ECC.
* @write_page: function to write a page according to the ECC generator
* requirements.
* @write_oob_raw: function to write chip OOB data without ECC
* @read_oob_raw: function to read chip OOB data without ECC
* @read_oob: function to read chip OOB data
* @write_oob: function to write chip OOB data
*/
struct nand_ecc_ctrl {
nand_ecc_modes_t mode;
enum nand_ecc_algo algo;
int steps;
int size;
int bytes;
int total;
int strength;
int prepad;
int postpad;
unsigned int options;
void *priv;
u8 *calc_buf;
u8 *code_buf;
void (*hwctl)(struct nand_chip *chip, int mode);
int (*calculate)(struct nand_chip *chip, const uint8_t *dat,
uint8_t *ecc_code);
int (*correct)(struct nand_chip *chip, uint8_t *dat, uint8_t *read_ecc,
uint8_t *calc_ecc);
int (*read_page_raw)(struct nand_chip *chip, uint8_t *buf,
int oob_required, int page);
int (*write_page_raw)(struct nand_chip *chip, const uint8_t *buf,
int oob_required, int page);
int (*read_page)(struct nand_chip *chip, uint8_t *buf,
int oob_required, int page);
int (*read_subpage)(struct nand_chip *chip, uint32_t offs,
uint32_t len, uint8_t *buf, int page);
int (*write_subpage)(struct nand_chip *chip, uint32_t offset,
uint32_t data_len, const uint8_t *data_buf,
int oob_required, int page);
int (*write_page)(struct nand_chip *chip, const uint8_t *buf,
int oob_required, int page);
int (*write_oob_raw)(struct nand_chip *chip, int page);
int (*read_oob_raw)(struct nand_chip *chip, int page);
int (*read_oob)(struct nand_chip *chip, int page);
int (*write_oob)(struct nand_chip *chip, int page);
};
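The steps, size, bytes and total fields documented above are tied together by simple arithmetic: a page is split into steps of size data bytes, and each step consumes bytes ECC bytes in the OOB area. The sketch below checks such a layout in userspace. `struct sk_ecc` and `sk_ecc_layout()` are hypothetical names for this example; the consistency rules (the page must divide evenly into steps, and the ECC bytes must fit in the OOB) are my reading of what the generic NAND code enforces, not a copy of it.

```c
#include <assert.h>

/* Hypothetical subset of the ECC geometry fields. */
struct sk_ecc {
    int size;    /* data bytes per ECC step */
    int bytes;   /* ECC bytes per step */
    int steps;   /* ECC steps per page */
    int total;   /* ECC bytes per page */
};

/*
 * Derive steps and total from the page geometry.
 * Returns 0 on success, -1 if the layout is inconsistent.
 */
static int sk_ecc_layout(struct sk_ecc *e, int writesize, int oobsize,
                         int size, int bytes)
{
    if (size <= 0 || bytes < 0 || writesize <= 0)
        return -1;
    if (writesize % size != 0)      /* page must split into whole steps */
        return -1;
    e->size = size;
    e->bytes = bytes;
    e->steps = writesize / size;
    e->total = e->steps * bytes;
    if (e->total > oobsize)         /* ECC bytes must fit in the OOB area */
        return -1;
    return 0;
}
```

For an assumed 2048-byte page with 64 bytes of OOB, 512-byte steps and 3 ECC bytes per step, this gives 4 steps and 12 ECC bytes per page.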
2.3.7 struct nand_manufacturer
nand_manufacturer holds the manufacturer information and is defined in drivers/mtd/nand/raw/internals.h:
/*
* NAND Flash Manufacturer ID Codes
*/
#define NAND_MFR_AMD 0x01
#define NAND_MFR_ATO 0x9b
#define NAND_MFR_EON 0x92
#define NAND_MFR_ESMT 0xc8
#define NAND_MFR_FUJITSU 0x04
#define NAND_MFR_HYNIX 0xad
#define NAND_MFR_INTEL 0x89
#define NAND_MFR_MACRONIX 0xc2
#define NAND_MFR_MICRON 0x2c
#define NAND_MFR_NATIONAL 0x8f
#define NAND_MFR_RENESAS 0x07
#define NAND_MFR_SAMSUNG 0xec // Samsung
#define NAND_MFR_SANDISK 0x45
#define NAND_MFR_STMICRO 0x20
#define NAND_MFR_TOSHIBA 0x98
#define NAND_MFR_WINBOND 0xef
/**
* struct nand_manufacturer_ops - NAND Manufacturer operations
* @detect: detect the NAND memory organization and capabilities
* @init: initialize all vendor specific fields (like the ->read_retry()
* implementation) if any.
* @cleanup: the ->init() function may have allocated resources, ->cleanup()
* is here to let vendor specific code release those resources.
* @fixup_onfi_param_page: apply vendor specific fixups to the ONFI parameter
* page. This is called after the checksum is verified.
*/
struct nand_manufacturer_ops {
void (*detect)(struct nand_chip *chip);
int (*init)(struct nand_chip *chip);
void (*cleanup)(struct nand_chip *chip);
void (*fixup_onfi_param_page)(struct nand_chip *chip,
struct nand_onfi_params *p);
};
/**
* struct nand_manufacturer - NAND Flash Manufacturer structure
* @name: Manufacturer name
* @id: manufacturer ID code of device.
* @ops: manufacturer operations
*/
struct nand_manufacturer {
int id; // manufacturer ID
char *name; // manufacturer name
const struct nand_manufacturer_ops *ops; // manufacturer operations
};
2.3.8 struct nand_device
struct nand_device is defined in include/linux/mtd/nand.h:
/**
* struct nand_device - NAND device
* @mtd: MTD instance attached to the NAND device
* @memorg: memory layout
* @eccreq: ECC requirements
* @rowconv: position to row address converter
* @bbt: bad block table info
* @ops: NAND operations attached to the NAND device
*
* Generic NAND object. Specialized NAND layers (raw NAND, SPI NAND, OneNAND)
* should declare their own NAND object embedding a nand_device struct (that's
* how inheritance is done).
* struct_nand_device->memorg and struct_nand_device->eccreq should be filled
* at device detection time to reflect the NAND device
* capabilities/requirements. Once this is done nanddev_init() can be called.
* It will take care of converting NAND information into MTD ones, which means
* the specialized NAND layers should never manually tweak
* struct_nand_device->mtd except for the ->_read/write() hooks.
*/
struct nand_device {
struct mtd_info mtd;
struct nand_memory_organization memorg;
struct nand_ecc_req eccreq;
struct nand_row_converter rowconv;
struct nand_bbt bbt;
const struct nand_ops *ops;
};
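2.3.9 Structure relationship diagram
2.4 Core functions
If the MTD device is not partitioned (it is exposed as a single device), register and unregister it with these two functions:
int add_mtd_device(struct mtd_info *mtd);
int del_mtd_device(struct mtd_info *mtd);
If the MTD device has multiple partitions, register and unregister it with these two functions:
int add_mtd_partitions(struct mtd_info *master, const struct mtd_partition *parts, int nbparts);
int del_mtd_partitions(struct mtd_info *master);
3. MTD device registration
3.1 add_mtd_device
add_mtd_device is defined in drivers/mtd/mtdcore.c: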
/**
* add_mtd_device - register an MTD device
* @mtd: pointer to new MTD device info structure
*
* Add a device to the list of MTD devices present in the system, and
* notify each currently active MTD 'user' of its arrival. Returns
* zero on success or non-zero on failure.
*/
int add_mtd_device(struct mtd_info *mtd)
{
struct mtd_notifier *not;
int i, error;
/*
* May occur, for instance, on buggy drivers which call
* mtd_device_parse_register() multiple times on the same master MTD,
* especially with CONFIG_MTD_PARTITIONED_MASTER=y.
*/
if (WARN_ONCE(mtd->dev.type, "MTD already registered\n"))
return -EEXIST;
BUG_ON(mtd->writesize == 0);
/*
* MTD drivers should implement ->_{write,read}() or
* ->_{write,read}_oob(), but not both.
*/
if (WARN_ON((mtd->_write && mtd->_write_oob) || // sanity-check the function pointers
(mtd->_read && mtd->_read_oob)))
return -EINVAL;
if (WARN_ON((!mtd->erasesize || !mtd->_erase) &&
!(mtd->flags & MTD_NO_ERASE)))
return -EINVAL;
mutex_lock(&mtd_table_mutex); // take the mutex
i = idr_alloc(&mtd_idr, mtd, 0, 0, GFP_KERNEL); // allocate an index for the MTD device
if (i < 0) {
error = i;
goto fail_locked;
}
mtd->index = i;
mtd->usecount = 0;
/* default value if not set by driver */
if (mtd->bitflip_threshold == 0) // default the bitflip threshold to the ECC strength
mtd->bitflip_threshold = mtd->ecc_strength;
if (is_power_of_2(mtd->erasesize)) // compute the erase-size shift
mtd->erasesize_shift = ffs(mtd->erasesize) - 1;
else
mtd->erasesize_shift = 0;
if (is_power_of_2(mtd->writesize)) // compute the write-size shift
mtd->writesize_shift = ffs(mtd->writesize) - 1;
else
mtd->writesize_shift = 0;
mtd->erasesize_mask = (1 << mtd->erasesize_shift) - 1; // compute the erase-size mask
mtd->writesize_mask = (1 << mtd->writesize_shift) - 1; // compute the write-size mask
/* Some chips always power up locked. Unlock them now */
if ((mtd->flags & MTD_WRITEABLE) && (mtd->flags & MTD_POWERUP_LOCK)) { // some chips always power up locked, so unlock them now (most flash chips support locking, but drivers rarely use it)
error = mtd_unlock(mtd, 0, mtd->size);
if (error && error != -EOPNOTSUPP)
printk(KERN_WARNING
"%s: unlock failed, writes may not work\n",
mtd->name);
/* Ignore unlock failures? */
error = 0;
}
/* Caller should have set dev.parent to match the
* physical device, if appropriate.
*/
mtd->dev.type = &mtd_devtype; // set the device type
mtd->dev.class = &mtd_class; // set the device class; the mtd class appears under /sys/class
mtd->dev.devt = MTD_DEVT(i); // set the device number; the character major itself is registered in the mtdchar.c init function
dev_set_name(&mtd->dev, "mtd%d", i); // set the device name to mtd%d
dev_set_drvdata(&mtd->dev, mtd); // mtd->dev.driver_data = mtd;
of_node_get(mtd_get_of_node(mtd));
error = device_register(&mtd->dev); // register the MTD character device; this creates mtd%d under /sys/class/mtd, from which mdev automatically creates the /dev/mtd%d character device node
if (error)
goto fail_added;
/* Add the nvmem provider */
error = mtd_nvmem_add(mtd);
if (error)
goto fail_nvmem_add;
if (!IS_ERR_OR_NULL(dfs_dir_mtd)) {
mtd->dbg.dfs_dir = debugfs_create_dir(dev_name(&mtd->dev), dfs_dir_mtd);
if (IS_ERR_OR_NULL(mtd->dbg.dfs_dir)) {
pr_debug("mtd device %s won't show data in debugfs\n",
dev_name(&mtd->dev));
}
}
device_create(&mtd_class, mtd->dev.parent, MTD_DEVT(i) + 1, NULL, // create the read-only MTD character device (device_create calls device_register internally): creates mtd%dro under /sys/class/mtd, from which mdev creates the /dev/mtd%dro node
"mtd%dro", i);
pr_debug("mtd: Giving out device %d to %s\n", i, mtd->name);
/* No need to get a refcount on the module containing
the notifier, since we hold the mtd_table_mutex */
list_for_each_entry(not, &mtd_notifiers, list) // tell every registered MTD notifier hook that a device was added
not->add(mtd);
mutex_unlock(&mtd_table_mutex); // release the mutex
/* We _know_ we aren't being removed, because
our caller is still holding us here. So none
of this try_ nonsense, and no bitching about it
either. :) */
__module_get(THIS_MODULE);
return 0;
fail_nvmem_add:
device_unregister(&mtd->dev);
fail_added:
of_node_put(mtd_get_of_node(mtd));
idr_remove(&mtd_idr, i);
fail_locked:
mutex_unlock(&mtd_table_mutex);
return error;
}
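The shift and mask fields computed in add_mtd_device can be tried out in user space. The following is a minimal sketch, not kernel code: the helper names size_shift and size_mask are made up for this example, and the POSIX ffs() from <strings.h> stands in for the kernel's ffs().

```c
#include <stdint.h>
#include <strings.h>   /* POSIX ffs() */

/* Returns log2(size) when size is a power of two, otherwise 0,
 * mirroring how add_mtd_device() fills erasesize_shift/writesize_shift. */
unsigned int size_shift(uint32_t size)
{
    int is_pow2 = size != 0 && (size & (size - 1)) == 0;

    return is_pow2 ? (unsigned int)ffs((int)size) - 1 : 0;
}

/* Mask that extracts the offset inside one erase block or page.
 * It is 0 when the size is not a power of two. */
uint32_t size_mask(uint32_t size)
{
    return ((uint32_t)1 << size_shift(size)) - 1;
}
```

For a 128 KiB erase block (0x20000), size_shift gives 17 and size_mask gives 0x1FFFF, so offset >> 17 is the block number and offset & 0x1FFFF is the position inside the block. A size that is not a power of two yields shift 0 and mask 0, so callers cannot use the shortcut in that case.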
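The function performs the following steps:
(1) Validates the required fields and function pointers of the raw MTD device;
(2) Allocates a node for the raw MTD device in the mtd_idr tree and obtains the allocated ID:
i = idr_alloc(&mtd_idr, mtd, 0, 0, GFP_KERNEL); // allocate an ID; mtd_idr is a radix tree, and mtd is associated with the newly allocated ID
idr_alloc adds a node to the mtd_idr tree. The node has an ID that is unique within the tree and is associated with mtd, so the mtd can later be looked up by its ID.
The third and fourth arguments give the start and the end of the allowed ID range. Here the start is 0, and an end of 0 means no explicit upper limit, i.e. up to the largest ID the mtd_idr tree allows.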
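The global variable mtd_idr is defined in drivers/mtd/mtdcore.c:
static DEFINE_IDR(mtd_idr);
The IDR implementation is not covered here. Its main job is to bind an ID to a data structure; for details see the article "linux内核IDR机制详解(一)".
The ID is needed later when the character and block devices are registered. For example, the device number of the struct device embedded in the MTD device is set to MTD_DEVT(i):
#define MTD_DEVT(index) MKDEV(MTD_CHAR_MAJOR, (index)*2)
The major number is MTD_CHAR_MAJOR, i.e. 90, and the minor number is index*2;
(3) Sets the erasesize_shift, writesize_shift, erasesize_mask and writesize_mask fields of the raw MTD device;
(4) For chips that are flagged writable and lock the flash at power-up, calls the unlock interface (most flash chips support locking, but drivers rarely use it);
(5) Sets the class of the MTD device's struct device to mtd_class, and sets its device number, type, name and driver_data;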
mtd_class is defined as:
static struct class mtd_class = {
.name = "mtd",
.owner = THIS_MODULE,
.pm = MTD_CLS_PM_OPS,
};
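(6) Calls device_register to register the MTD character device named mtd%d;
(7) Calls device_create to create, initialize and register the MTD character device named mtd%dro;
(8) Runs the MTD subsystem's notifier mechanism: every registered notifier hook is told that an MTD device has been added (a matching callback exists for removal):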
list_for_each_entry(not, &mtd_notifiers, list)
not->add(mtd);
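list_for_each_entry takes three arguments, in order: pos, head and member. It is really a for loop that uses pos as the loop cursor, starts at the list head, and advances pos entry by entry (in the next direction) until it arrives back at head.
The list mtd_notifiers is defined as:
static LIST_HEAD(mtd_notifiers);
The loop walks this list. Each element not has type struct mtd_notifier, and not->add(mtd) is called on it. For the block-translation notifier, this callback is where the MTD block device named mtdblock%d gets registered.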
3.2 add_mtd_partitions
add_mtd_partitions is defined in drivers/mtd/mtdpart.c:
/*
* This function, given a master MTD object and a partition table, creates
* and registers slave MTD objects which are bound to the master according to
* the partition definitions.
*
* For historical reasons, this function's caller only registers the master
* if the MTD_PARTITIONED_MASTER config option is set.
*/
int add_mtd_partitions(struct mtd_info *master, // MTD device info
const struct mtd_partition *parts, // partition table
int nbparts) // number of partitions
{
struct mtd_part *slave;
uint64_t cur_offset = 0;
int i, ret;
printk(KERN_NOTICE "Creating %d MTD partitions on \"%s\":\n", nbparts, master->name);
for (i = 0; i < nbparts; i++) { // walk the partition table
slave = allocate_partition(master, parts + i, i, cur_offset); // allocate and fill an mtd_part
if (IS_ERR(slave)) {
ret = PTR_ERR(slave);
goto err_del_partitions;
}
mutex_lock(&mtd_partitions_mutex);
list_add(&slave->list, &mtd_partitions); // add slave to the mtd_partitions list
mutex_unlock(&mtd_partitions_mutex);
ret = add_mtd_device(&slave->mtd); // register an MTD device for each partition; through the notifier this also produces the /dev/mtdblock%d block device node
if (ret) {
mutex_lock(&mtd_partitions_mutex);
list_del(&slave->list);
mutex_unlock(&mtd_partitions_mutex);
free_partition(slave);
goto err_del_partitions;
}
mtd_add_partition_attrs(slave);
/* Look for subpartitions */
parse_mtd_partitions(&slave->mtd, parts[i].types, NULL);
cur_offset = slave->offset + slave->mtd.size;
}
return 0;
err_del_partitions:
del_mtd_partitions(master);
return ret;
}
3.2.1 allocate_partition
allocate_partition is defined in drivers/mtd/mtdpart.c:
static struct mtd_part *allocate_partition(struct mtd_info *parent,
const struct mtd_partition *part, int partno,
uint64_t cur_offset)
{
int wr_alignment = (parent->flags & MTD_NO_ERASE) ? parent->writesize :
parent->erasesize;
struct mtd_part *slave;
u32 remainder;
char *name;
u64 tmp;
/* allocate the partition structure */
slave = kzalloc(sizeof(*slave), GFP_KERNEL);
name = kstrdup(part->name, GFP_KERNEL);
if (!name || !slave) {
printk(KERN_ERR"memory allocation error while creating partitions for \"%s\"\n",
parent->name);
kfree(name);
kfree(slave);
return ERR_PTR(-ENOMEM);
}
/* set up the MTD object for this partition */
slave->mtd.type = parent->type;
slave->mtd.flags = parent->orig_flags & ~part->mask_flags;
slave->mtd.orig_flags = slave->mtd.flags;
slave->mtd.size = part->size;
slave->mtd.writesize = parent->writesize;
slave->mtd.writebufsize = parent->writebufsize;
slave->mtd.oobsize = parent->oobsize;
slave->mtd.oobavail = parent->oobavail;
slave->mtd.subpage_sft = parent->subpage_sft;
slave->mtd.pairing = parent->pairing;
slave->mtd.name = name;
slave->mtd.owner = parent->owner;
/* NOTE: Historically, we didn't arrange MTDs as a tree out of
* concern for showing the same data in multiple partitions.
* However, it is very useful to have the master node present,
* so the MTD_PARTITIONED_MASTER option allows that. The master
* will have device nodes etc only if this is set, so make the
* parent conditional on that option. Note, this is a way to
* distinguish between the master and the partition in sysfs.
*/
slave->mtd.dev.parent = IS_ENABLED(CONFIG_MTD_PARTITIONED_MASTER) || mtd_is_partition(parent) ?
&parent->dev :
parent->dev.parent;
slave->mtd.dev.of_node = part->of_node;
if (parent->_read)
slave->mtd._read = part_read;
if (parent->_write)
slave->mtd._write = part_write;
if (parent->_panic_write)
slave->mtd._panic_write = part_panic_write;
if (parent->_point && parent->_unpoint) {
slave->mtd._point = part_point;
slave->mtd._unpoint = part_unpoint;
}
if (parent->_read_oob)
slave->mtd._read_oob = part_read_oob;
if (parent->_write_oob)
slave->mtd._write_oob = part_write_oob;
if (parent->_read_user_prot_reg)
slave->mtd._read_user_prot_reg = part_read_user_prot_reg;
if (parent->_read_fact_prot_reg)
slave->mtd._read_fact_prot_reg = part_read_fact_prot_reg;
if (parent->_write_user_prot_reg)
slave->mtd._write_user_prot_reg = part_write_user_prot_reg;
if (parent->_lock_user_prot_reg)
slave->mtd._lock_user_prot_reg = part_lock_user_prot_reg;
if (parent->_get_user_prot_info)
slave->mtd._get_user_prot_info = part_get_user_prot_info;
if (parent->_get_fact_prot_info)
slave->mtd._get_fact_prot_info = part_get_fact_prot_info;
if (parent->_sync)
slave->mtd._sync = part_sync;
if (!partno && !parent->dev.class && parent->_suspend &&
parent->_resume) {
slave->mtd._suspend = part_suspend;
slave->mtd._resume = part_resume;
}
if (parent->_writev)
slave->mtd._writev = part_writev;
if (parent->_lock)
slave->mtd._lock = part_lock;
if (parent->_unlock)
slave->mtd._unlock = part_unlock;
if (parent->_is_locked)
slave->mtd._is_locked = part_is_locked;
if (parent->_block_isreserved)
slave->mtd._block_isreserved = part_block_isreserved;
if (parent->_block_isbad)
slave->mtd._block_isbad = part_block_isbad;
if (parent->_block_markbad)
slave->mtd._block_markbad = part_block_markbad;
if (parent->_max_bad_blocks)
slave->mtd._max_bad_blocks = part_max_bad_blocks;
if (parent->_get_device)
slave->mtd._get_device = part_get_device;
if (parent->_put_device)
slave->mtd._put_device = part_put_device;
slave->mtd._erase = part_erase;
slave->parent = parent;
slave->offset = part->offset;
if (slave->offset == MTDPART_OFS_APPEND)
slave->offset = cur_offset;
if (slave->offset == MTDPART_OFS_NXTBLK) {
tmp = cur_offset;
slave->offset = cur_offset;
remainder = do_div(tmp, wr_alignment);
if (remainder) {
slave->offset += wr_alignment - remainder;
printk(KERN_NOTICE "Moving partition %d: "
"0x%012llx -> 0x%012llx\n", partno,
(unsigned long long)cur_offset, (unsigned long long)slave->offset);
}
}
if (slave->offset == MTDPART_OFS_RETAIN) {
slave->offset = cur_offset;
if (parent->size - slave->offset >= slave->mtd.size) {
slave->mtd.size = parent->size - slave->offset
- slave->mtd.size;
} else {
printk(KERN_ERR "mtd partition \"%s\" doesn't have enough space: %#llx < %#llx, disabled\n",
part->name, parent->size - slave->offset,
slave->mtd.size);
/* register to preserve ordering */
goto out_register;
}
}
if (slave->mtd.size == MTDPART_SIZ_FULL)
slave->mtd.size = parent->size - slave->offset;
printk(KERN_NOTICE "0x%012llx-0x%012llx : \"%s\"\n", (unsigned long long)slave->offset,
(unsigned long long)(slave->offset + slave->mtd.size), slave->mtd.name);
/* let's do some sanity checks */
if (slave->offset >= parent->size) {
/* let's register it anyway to preserve ordering */
slave->offset = 0;
slave->mtd.size = 0;
/* Initialize ->erasesize to make add_mtd_device() happy. */
slave->mtd.erasesize = parent->erasesize;
printk(KERN_ERR"mtd: partition \"%s\" is out of reach -- disabled\n",
part->name);
goto out_register;
}
if (slave->offset + slave->mtd.size > parent->size) {
slave->mtd.size = parent->size - slave->offset;
printk(KERN_WARNING"mtd: partition \"%s\" extends beyond the end of device \"%s\" -- size truncated to %#llx\n",
part->name, parent->name, (unsigned long long)slave->mtd.size);
}
if (parent->numeraseregions > 1) {
/* Deal with variable erase size stuff */
int i, max = parent->numeraseregions;
u64 end = slave->offset + slave->mtd.size;
struct mtd_erase_region_info *regions = parent->eraseregions;
/* Find the first erase regions which is part of this
* partition. */
for (i = 0; i < max && regions[i].offset <= slave->offset; i++)
;
/* The loop searched for the region _behind_ the first one */
if (i > 0)
i--;
/* Pick biggest erasesize */
for (; i < max && regions[i].offset < end; i++) {
if (slave->mtd.erasesize < regions[i].erasesize) {
slave->mtd.erasesize = regions[i].erasesize;
}
}
BUG_ON(slave->mtd.erasesize == 0);
} else {
/* Single erase size */
slave->mtd.erasesize = parent->erasesize;
}
/*
* Slave erasesize might differ from the master one if the master
* exposes several regions with different erasesize. Adjust
* wr_alignment accordingly.
*/
if (!(slave->mtd.flags & MTD_NO_ERASE))
wr_alignment = slave->mtd.erasesize;
tmp = part_absolute_offset(parent) + slave->offset;
remainder = do_div(tmp, wr_alignment);
if ((slave->mtd.flags & MTD_WRITEABLE) && remainder) {
/* Doesn't start on a boundary of major erase size */
/* FIXME: Let it be writable if it is on a boundary of
* _minor_ erase size though */
slave->mtd.flags &= ~MTD_WRITEABLE;
printk(KERN_WARNING"mtd: partition \"%s\" doesn't start on an erase/write block boundary -- force read-only\n",
part->name);
}
tmp = part_absolute_offset(parent) + slave->mtd.size;
remainder = do_div(tmp, wr_alignment);
if ((slave->mtd.flags & MTD_WRITEABLE) && remainder) {
slave->mtd.flags &= ~MTD_WRITEABLE;
printk(KERN_WARNING"mtd: partition \"%s\" doesn't end on an erase/write block -- force read-only\n",
part->name);
}
mtd_set_ooblayout(&slave->mtd, &part_ooblayout_ops);
slave->mtd.ecc_step_size = parent->ecc_step_size;
slave->mtd.ecc_strength = parent->ecc_strength;
slave->mtd.bitflip_threshold = parent->bitflip_threshold;
if (parent->_block_isbad) {
uint64_t offs = 0;
while (offs < slave->mtd.size) {
if (mtd_block_isreserved(parent, offs + slave->offset))
slave->mtd.ecc_stats.bbtblocks++;
else if (mtd_block_isbad(parent, offs + slave->offset))
slave->mtd.ecc_stats.badblocks++;
offs += slave->mtd.erasesize;
}
}
out_register:
return slave;
}
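The way allocate_partition turns the special offsets MTDPART_OFS_APPEND and MTDPART_OFS_NXTBLK into absolute offsets can be sketched in user space. This is an illustration only: resolve_part_offset is a made-up helper, OFS_APPEND and OFS_NXTBLK are local stand-ins that assume the values -1 and -2 from include/linux/mtd/partitions.h, and the MTDPART_OFS_RETAIN case is left out.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's special offset markers
 * (assumed values; compared as 64-bit unsigned, like slave->offset). */
#define OFS_NXTBLK ((uint64_t)-2)
#define OFS_APPEND ((uint64_t)-1)

/* Resolve a requested partition offset into an absolute one.
 * cur_offset is where the previous partition ended; alignment is the
 * erase block size (or the write size on MTD_NO_ERASE devices). */
uint64_t resolve_part_offset(uint64_t requested, uint64_t cur_offset,
                             uint32_t alignment)
{
    if (requested == OFS_APPEND)
        return cur_offset;          /* start right after the previous partition */

    if (requested == OFS_NXTBLK) {
        uint64_t remainder = cur_offset % alignment;

        /* round up to the next block boundary unless already aligned */
        return remainder ? cur_offset + (alignment - remainder) : cur_offset;
    }

    return requested;               /* explicit offset, used as given */
}
```

With a 128 KiB erase block, a previous partition ending at 0x41000 and a request of OFS_NXTBLK, the new partition starts at 0x60000, which corresponds to the "Moving partition" message printed by the kernel code above.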
3.2.2 mtd_partitions
The list mtd_partitions is defined in drivers/mtd/mtdpart.c:
static LIST_HEAD(mtd_partitions);
3.3 mtd_device_register
The macro mtd_device_register is defined in include/linux/mtd/mtd.h:
#define mtd_device_register(master, parts, nr_parts) \
mtd_device_parse_register(master, NULL, NULL, parts, nr_parts)
The function mtd_device_parse_register is defined in drivers/mtd/mtdcore.c:
/**
* mtd_device_parse_register - parse partitions and register an MTD device.
*
* @mtd: the MTD device to register
* @types: the list of MTD partition probes to try, see
* 'parse_mtd_partitions()' for more information
* @parser_data: MTD partition parser-specific data
* @parts: fallback partition information to register, if parsing fails;
* only valid if %nr_parts > %0
* @nr_parts: the number of partitions in parts, if zero then the full
* MTD device is registered if no partition info is found
*
* This function aggregates MTD partitions parsing (done by
* 'parse_mtd_partitions()') and MTD device and partitions registering. It
* basically follows the most common pattern found in many MTD drivers:
*
* * If the MTD_PARTITIONED_MASTER option is set, then the device as a whole is
* registered first.
* * Then It tries to probe partitions on MTD device @mtd using parsers
* specified in @types (if @types is %NULL, then the default list of parsers
* is used, see 'parse_mtd_partitions()' for more information). If none are
* found this functions tries to fallback to information specified in
* @parts/@nr_parts.
* * If no partitions were found this function just registers the MTD device
* @mtd and exits.
*
* Returns zero in case of success and a negative error code in case of failure.
*/
int mtd_device_parse_register(struct mtd_info *mtd, const char * const *types,
struct mtd_part_parser_data *parser_data,
const struct mtd_partition *parts, // partition table
int nr_parts) // number of partitions
{
int ret;
mtd_set_dev_defaults(mtd);
if (IS_ENABLED(CONFIG_MTD_PARTITIONED_MASTER)) { // register the whole Nand Flash as one MTD device as well
ret = add_mtd_device(mtd); // register the MTD device
if (ret)
return ret;
}
/* Prefer parsed partitions over driver-provided fallback */
ret = parse_mtd_partitions(mtd, types, parser_data);
if (ret > 0)
ret = 0;
else if (nr_parts) // no parsed partitions: fall back to the driver-provided partition table
ret = add_mtd_partitions(mtd, parts, nr_parts);
else if (!device_is_registered(&mtd->dev))
ret = add_mtd_device(mtd);
else
ret = 0;
if (ret)
goto out;
/*
* FIXME: some drivers unfortunately call this function more than once.
* So we have to check if we've already assigned the reboot notifier.
*
* Generally, we can make multiple calls work for most cases, but it
* does cause problems with parse_mtd_partitions() above (e.g.,
* cmdlineparts will register partitions more than once).
*/
WARN_ONCE(mtd->_reboot && mtd->reboot_notifier.notifier_call,
"MTD already registered\n");
if (mtd->_reboot && !mtd->reboot_notifier.notifier_call) {
mtd->reboot_notifier.notifier_call = mtd_reboot_notifier;
register_reboot_notifier(&mtd->reboot_notifier);
}
out:
if (ret && device_is_registered(&mtd->dev))
del_mtd_device(mtd); // unregister the MTD device
return ret;
}
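The order in which mtd_device_parse_register chooses between parsed partitions, the driver's fallback table and the bare device is easy to misread, so here is a small user-space sketch of just that decision. The enum and the function choose_reg_path are hypothetical names invented for this illustration; parsed stands for the return value of parse_mtd_partitions.

```c
#include <stdbool.h>

enum reg_path {
    REG_PARSED_PARTITIONS,   /* a parser found partitions; nothing more to add */
    REG_FALLBACK_PARTITIONS, /* use the driver-provided parts/nr_parts */
    REG_WHOLE_DEVICE,        /* no partitions at all: register mtd itself */
    REG_NOTHING_TO_DO        /* master already registered and no partitions */
};

/* Mirrors the if / else-if chain after parse_mtd_partitions(). */
enum reg_path choose_reg_path(int parsed, int nr_parts, bool master_registered)
{
    if (parsed > 0)
        return REG_PARSED_PARTITIONS;
    if (nr_parts)
        return REG_FALLBACK_PARTITIONS;
    if (!master_registered)
        return REG_WHOLE_DEVICE;
    return REG_NOTHING_TO_DO;
}
```

Note that a parser result always wins over the table passed by the driver, and that the bare device is only registered here when CONFIG_MTD_PARTITIONED_MASTER has not already done so.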
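4. mtdblock.c
We introduced mtdblock.c earlier; this file implements the MTD block device interface. Let's go straight to drivers/mtd/mtdblock.c and walk through the source.
4.1 Module entry function
The entry function of the MTD block device module is: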
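static struct mtd_blktrans_ops mtdblock_tr = { // describes the MTD block device and its operations
.name = "mtdblock",
.major = MTD_BLOCK_MAJOR, // MTD block device major number, 31
.part_bits = 0, // number of minor-number bits reserved for partitions: 0 means no partitions, 1 allows 2 minors per disk, 2 allows 4, ...
.blksize = 512, // sector size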
.open = mtdblock_open,
.flush = mtdblock_flush,
.release = mtdblock_release,
.readsect = mtdblock_readsect,
.writesect = mtdblock_writesect,
.add_mtd = mtdblock_add_mtd,
.remove_dev = mtdblock_remove_dev,
.owner = THIS_MODULE,
};
static int __init init_mtdblock(void)
{
return register_mtd_blktrans(&mtdblock_tr);
}
4.2 register_mtd_blktrans
Next is register_mtd_blktrans, located in drivers/mtd/mtd_blkdevs.c:
int register_mtd_blktrans(struct mtd_blktrans_ops *tr)
{
struct mtd_info *mtd;
int ret;
/* Register the notifier if/when the first device type is
registered, to prevent the link/init ordering from fucking
us over. */
if (!blktrans_notifier.list.next) // next is still NULL, so the notifier has not been registered yet
register_mtd_user(&blktrans_notifier); // add blktrans_notifier to the mtd_notifiers list
mutex_lock(&mtd_table_mutex);
ret = register_blkdev(tr->major, tr->name); // register the block device major number; MTD_BLOCK_MAJOR is defined as 31
if (ret < 0) {
printk(KERN_WARNING "Unable to register %s block device on major %d: %d\n",
tr->name, tr->major, ret);
mutex_unlock(&mtd_table_mutex);
return ret;
}
if (ret)
tr->major = ret;
tr->blkshift = ffs(tr->blksize) - 1;
INIT_LIST_HEAD(&tr->devs);
list_add(&tr->list, &blktrans_majors); // add tr to the blktrans_majors list
mtd_for_each_device(mtd)
if (mtd->type != MTD_ABSENT)
tr->add_mtd(tr, mtd);
mutex_unlock(&mtd_table_mutex);
return 0;
}
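The function does four things:
- Calls register_mtd_user, which adds blktrans_notifier to the mtd_notifiers list, then walks the global mtd_idr and calls blktrans_notify_add(mtd) for each mtd found;
- Calls register_blkdev to register the block device with major number 31 and name mtdblock;
- Adds mtdblock_tr to the blktrans_majors list, which is defined as static LIST_HEAD(blktrans_majors);
- Walks the global mtd_idr again and calls mtdblock_add_mtd(mtdblock_tr, mtd) for each mtd found.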
4.2.1 mtd_notifier
mtd_notifier is defined in include/linux/mtd/mtd.h:
struct mtd_notifier {
void (*add)(struct mtd_info *mtd);
void (*remove)(struct mtd_info *mtd);
struct list_head list;
};
4.2.2 blktrans_notifier
Here we look at register_mtd_user(&blktrans_notifier). The variable blktrans_notifier is defined in drivers/mtd/mtd_blkdevs.c:
static struct mtd_notifier blktrans_notifier = {
.add = blktrans_notify_add,
.remove = blktrans_notify_remove,
};
4.2.3 register_mtd_user
register_mtd_user adds new->list to the mtd_notifiers list:
/**
* register_mtd_user - register a 'user' of MTD devices.
* @new: pointer to notifier info structure
*
* Registers a pair of callbacks function to be called upon addition
* or removal of MTD devices. Causes the 'add' callback to be immediately
* invoked for each MTD device currently present in the system.
*/
void register_mtd_user (struct mtd_notifier *new)
{
struct mtd_info *mtd;
mutex_lock(&mtd_table_mutex); // take the mutex
list_add(&new->list, &mtd_notifiers); // add to the list
__module_get(THIS_MODULE);
mtd_for_each_device(mtd) // walk mtd_idr to get each mtd
new->add(mtd); // for blktrans_notifier this ends up in blktrans_notify_add(mtd)
mutex_unlock(&mtd_table_mutex); // release the mutex
}
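The interplay between register_mtd_user and the notifier loop at the end of add_mtd_device can be modelled in a few lines of user-space C. This is a simplified sketch under stated assumptions: struct dev, struct notifier, register_user, add_device and count_add are all hypothetical stand-ins, a fixed array replaces mtd_idr, a singly linked list replaces mtd_notifiers, and locking is omitted.

```c
#include <stddef.h>

#define MAX_DEVS 8

struct dev { int index; };              /* stand-in for struct mtd_info */

struct notifier {                       /* stand-in for struct mtd_notifier */
    void (*add)(struct dev *d);
    struct notifier *next;
};

static struct notifier *notifiers;      /* plays the role of mtd_notifiers */
static struct dev *devices[MAX_DEVS];   /* plays the role of mtd_idr */
static int ndevices;

/* Like register_mtd_user(): remember the notifier, then replay
 * every device that already exists. */
void register_user(struct notifier *n)
{
    n->next = notifiers;
    notifiers = n;
    for (int i = 0; i < ndevices; i++)
        n->add(devices[i]);
}

/* Like the tail of add_mtd_device(): record the device, then tell
 * every registered notifier. Returns -1 when the table is full. */
int add_device(struct dev *d)
{
    if (ndevices == MAX_DEVS)
        return -1;
    d->index = ndevices;
    devices[ndevices++] = d;
    for (struct notifier *n = notifiers; n; n = n->next)
        n->add(d);
    return 0;
}

/* Sample callback that only counts how often it was invoked. */
int add_calls;
void count_add(struct dev *d)
{
    (void)d;
    add_calls++;
}
```

Whichever side comes first, each device reaches each notifier exactly once: devices that exist before the notifier is registered are replayed by register_user, and later devices are announced by add_device.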
4.2.4 mtd_for_each_device
The mtd_for_each_device macro is defined in drivers/mtd/mtdcore.h:
#define mtd_for_each_device(mtd) \
for ((mtd) = __mtd_next_device(0); \
(mtd) != NULL; \
(mtd) = __mtd_next_device(mtd->index + 1))
__mtd_next_device is defined in drivers/mtd/mtdcore.c:
struct mtd_info *__mtd_next_device(int i)
{
return idr_get_next(&mtd_idr, &i);
}
This simply walks every node of the mtd_idr radix tree and yields the mtd associated with each node.
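The iteration pattern of mtd_for_each_device, "give me the entry with the smallest ID that is at least i", can be shown with a sparse array in user space. This is only a sketch of the pattern: table, next_device, for_each_device and count_devices are names invented here, and a plain array replaces the IDR.

```c
#include <stddef.h>

#define TABLE_SIZE 16

struct dev { int index; };       /* stand-in for struct mtd_info */

struct dev *table[TABLE_SIZE];   /* sparse: NULL means "ID not in use" */

/* Like __mtd_next_device(i): return the entry with the smallest
 * ID >= i, or NULL when there is none. */
struct dev *next_device(int i)
{
    for (; i >= 0 && i < TABLE_SIZE; i++)
        if (table[i])
            return table[i];
    return NULL;
}

/* Same shape as mtd_for_each_device(): resume after the current index. */
#define for_each_device(d) \
    for ((d) = next_device(0); (d) != NULL; (d) = next_device((d)->index + 1))

/* Count the devices currently present in the table. */
int count_devices(void)
{
    struct dev *d;
    int n = 0;

    for_each_device(d)
        n++;
    return n;
}
```

Because each step restarts the search at index + 1, gaps left by removed devices are skipped without any special handling.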
4.2.5 blktrans_notify_add
Control then enters blktrans_notify_add(), the add callback of blktrans_notifier:
static void blktrans_notify_add(struct mtd_info *mtd)
{
struct mtd_blktrans_ops *tr;
if (mtd->type == MTD_ABSENT)
return;
list_for_each_entry(tr, &blktrans_majors, list) // walk the blktrans_majors list
tr->add_mtd(tr, mtd); // call the add_mtd callback of each mtd_blktrans_ops
}
The entry function of the MTD block driver adds mtdblock_tr to the blktrans_majors list, so when this loop walks blktrans_majors the tr it finds is mtdblock_tr, and it calls mtdblock_tr.add_mtd(mtdblock_tr, mtd).
The add_mtd callback of mtdblock_tr is mtdblock_add_mtd.
4.2.6 mtdblock_add_mtd
static void mtdblock_add_mtd(struct mtd_blktrans_ops *tr, struct mtd_info *mtd)
{
struct mtdblk_dev *dev = kzalloc(sizeof(*dev), GFP_KERNEL);
if (!dev)
return;
dev->mbd.mtd = mtd; // the underlying raw MTD device
dev->mbd.devnum = mtd->index; // device number, used to derive the first minor number
dev->mbd.size = mtd->size >> 9; // total number of 512-byte sectors
dev->mbd.tr = tr;
if (!(mtd->flags & MTD_WRITEABLE))
dev->mbd.readonly = 1;
if (add_mtd_blktrans_dev(&dev->mbd))
kfree(dev);
}
mtdblock_add_mtd does the following:
- allocates an mtdblk_dev structure, dev;
- initializes the members of dev;
- calls add_mtd_blktrans_dev(&dev->mbd);
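The two size conversions involved, mtd->size >> 9 in mtdblock_add_mtd and the value later passed to set_capacity in add_mtd_blktrans_dev, can be checked with simple arithmetic. The helpers below are hypothetical and exist only for this worked example; the 256 MiB device size used in the note is an assumption, not a value taken from the driver.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9   /* 512-byte sectors */

/* dev->mbd.size as computed in mtdblock_add_mtd(): bytes -> 512-byte blocks. */
uint64_t mtd_bytes_to_blocks(uint64_t mtd_size_bytes)
{
    return mtd_size_bytes >> SECTOR_SHIFT;
}

/* Value handed to set_capacity(): blocks of blksize bytes -> 512-byte sectors. */
uint64_t blocks_to_sectors(uint64_t blocks, uint32_t blksize)
{
    return (blocks * blksize) >> SECTOR_SHIFT;
}
```

For a 256 MiB flash, mtd_bytes_to_blocks gives 524288, and since mtdblock_tr.blksize is 512, blocks_to_sectors returns the same number. The second conversion only changes the result for translation layers whose blksize is not 512.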
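The mtdblk_dev structure describes one MTD block device and, through its embedded mtd_blktrans_dev, refers to the raw MTD device. mtdblk_dev is defined in drivers/mtd/mtdblock.c, and mtd_blktrans_dev in include/linux/mtd/blktrans.h: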
struct mtdblk_dev {
struct mtd_blktrans_dev mbd;
int count;
struct mutex cache_mutex;
unsigned char *cache_data;
unsigned long cache_offset;
unsigned int cache_size;
enum { STATE_EMPTY, STATE_CLEAN, STATE_DIRTY } cache_state;
};
struct mtd_blktrans_dev {
struct mtd_blktrans_ops *tr; // MTD block device information and operations
struct list_head list;
struct mtd_info *mtd; // raw MTD device
struct mutex lock;
int devnum; // used to derive the first minor number (devnum << tr->part_bits, a shift of 0 for mtdblock); when partition bits are in use, the first minor denotes the whole disk and the minors following it denote its partitions
bool bg_stop;
unsigned long size; // number of sectors
int readonly;
int open;
struct kref ref;
struct gendisk *disk; // disk device
struct attribute_group *disk_attributes;
struct request_queue *rq; // request queue
struct list_head rq_list;
struct blk_mq_tag_set *tag_set; // tag set
spinlock_t queue_lock;
void *priv;
fmode_t file_mode;
};
4.2.7 add_mtd_blktrans_dev
add_mtd_blktrans_dev is defined in drivers/mtd/mtd_blkdevs.c:
int add_mtd_blktrans_dev(struct mtd_blktrans_dev *new)
{
struct mtd_blktrans_ops *tr = new->tr;
struct mtd_blktrans_dev *d;
int last_devnum = -1;
struct gendisk *gd;
int ret;
if (mutex_trylock(&mtd_table_mutex)) {
mutex_unlock(&mtd_table_mutex);
BUG();
}
mutex_lock(&blktrans_ref_mutex);
list_for_each_entry(d, &tr->devs, list) { // tr->devs is a list of mtd_blktrans_dev entries, kept sorted by devnum
if (new->devnum == -1) { // no devnum requested: pick the first free one, counting up from 0
/* Use first free number */
if (d->devnum != last_devnum+1) {
/* Found a free devnum. Plug it in here */
new->devnum = last_devnum+1; // the new devnum
list_add_tail(&new->list, &d->list); // insert new just before d, which keeps the list sorted
goto added;
}
} else if (d->devnum == new->devnum) { // the requested devnum is already taken
/* Required number taken */
mutex_unlock(&blktrans_ref_mutex);
return -EBUSY;
} else if (d->devnum > new->devnum) {
/* Required number was free */
list_add_tail(&new->list, &d->list);
goto added;
}
last_devnum = d->devnum; // remember the devnum of the entry just visited
}
ret = -EBUSY;
if (new->devnum == -1)
new->devnum = last_devnum+1;
/* Check that the device and any partitions will get valid
* minor numbers and that the disk naming code below can cope
* with this number. */
if (new->devnum > (MINORMASK >> tr->part_bits) ||
(tr->part_bits && new->devnum >= 27 * 26)) {
mutex_unlock(&blktrans_ref_mutex);
goto error1;
}
list_add_tail(&new->list, &tr->devs);
added:
mutex_unlock(&blktrans_ref_mutex);
mutex_init(&new->lock);
kref_init(&new->ref);
if (!tr->writesect)
new->readonly = 1;
/* Create gendisk */
ret = -ENOMEM;
gd = alloc_disk(1 << tr->part_bits); // allocate a gendisk with room for 1 << part_bits minors
if (!gd)
goto error2;
new->disk = gd;
gd->private_data = new; // private data
gd->major = tr->major; // major number
gd->first_minor = (new->devnum) << tr->part_bits; // first minor number
gd->fops = &mtd_block_ops; // block device operations
if (tr->part_bits) // 0 for mtdblock, so this branch is skipped
if (new->devnum < 26)
snprintf(gd->disk_name, sizeof(gd->disk_name),
"%s%c", tr->name, 'a' + new->devnum);
else
snprintf(gd->disk_name, sizeof(gd->disk_name),
"%s%c%c", tr->name,
'a' - 1 + new->devnum / 26,
'a' + new->devnum % 26);
else // set the disk name, which becomes /dev/mtdblock%d
snprintf(gd->disk_name, sizeof(gd->disk_name),
"%s%d", tr->name, new->devnum);
set_capacity(gd, ((u64)new->size * tr->blksize) >> 9); // set the capacity, in 512-byte sectors
/* Create the request queue */
spin_lock_init(&new->queue_lock);
INIT_LIST_HEAD(&new->rq_list);
new->tag_set = kzalloc(sizeof(*new->tag_set), GFP_KERNEL);
if (!new->tag_set)
goto error3;
new->rq = blk_mq_init_sq_queue(new->tag_set, &mtd_mq_ops, 2,
BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_BLOCKING); // create the request queue and install mtd_mq_ops as the blk-mq callbacks of the block driver
if (IS_ERR(new->rq)) {
ret = PTR_ERR(new->rq);
new->rq = NULL;
goto error4;
}
if (tr->flush)
blk_queue_write_cache(new->rq, true, false);
new->rq->queuedata = new;
blk_queue_logical_block_size(new->rq, tr->blksize);
blk_queue_flag_set(QUEUE_FLAG_NONROT, new->rq);
blk_queue_flag_clear(QUEUE_FLAG_ADD_RANDOM, new->rq);
if (tr->discard) {
blk_queue_flag_set(QUEUE_FLAG_DISCARD, new->rq);
blk_queue_max_discard_sectors(new->rq, UINT_MAX);
}
gd->queue = new->rq; // attach the request queue
if (new->readonly)
set_disk_ro(gd, 1);
device_add_disk(&new->mtd->dev, gd, NULL); // register the gendisk with the kernel
if (new->disk_attributes) {
ret = sysfs_create_group(&disk_to_dev(gd)->kobj,
new->disk_attributes);
WARN_ON(ret);
}
return 0;
error4:
kfree(new->tag_set);
error3:
put_disk(new->disk);
error2:
list_del(&new->list);
error1:
return ret;
}
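This function shows that no matter how many MTD block devices are registered, they all share major number 31 and differ only in their minor numbers. The major number identifies a particular driver; the minor number identifies the individual device served by that driver.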
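Two pieces of add_mtd_blktrans_dev are worth isolating: picking the first free devnum from the sorted device list, and building the disk name. The user-space sketch below reproduces both under simplifying assumptions: the sorted list is a plain int array, and first_free_devnum and blktrans_disk_name are hypothetical helper names, not kernel functions.

```c
#include <stdio.h>
#include <stddef.h>

/* Pick the first unused devnum from an array sorted in ascending
 * order, as the "Use first free number" branch does for tr->devs. */
int first_free_devnum(const int *used, size_t n)
{
    int last = -1;

    for (size_t i = 0; i < n; i++) {
        if (used[i] != last + 1)
            break;              /* gap found right after 'last' */
        last = used[i];
    }
    return last + 1;
}

/* Build the gendisk name the way add_mtd_blktrans_dev() does:
 * a number when part_bits is 0, otherwise one or two letters. */
void blktrans_disk_name(char *buf, size_t len, const char *name,
                        int part_bits, int devnum)
{
    if (!part_bits)
        snprintf(buf, len, "%s%d", name, devnum);
    else if (devnum < 26)
        snprintf(buf, len, "%s%c", name, 'a' + devnum);
    else
        snprintf(buf, len, "%s%c%c", name,
                 'a' - 1 + devnum / 26, 'a' + devnum % 26);
}
```

With used devnums {0, 1, 3} the first free number is 2. Since mtdblock_tr has part_bits set to 0, its disks are named mtdblock0, mtdblock1 and so on; the letter form only applies to translation layers that reserve partition bits.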
4.2.8 mtd_block_ops
Here we look at mtd_block_ops, the operation set of the MTD block device, defined in drivers/mtd/mtd_blkdevs.c.
static const struct block_device_operations mtd_block_ops = {
.owner = THIS_MODULE,
.open = blktrans_open,
.release = blktrans_release,
.ioctl = blktrans_ioctl,
.getgeo = blktrans_getgeo,
};
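The meaning of these function pointers:
- open: called when an MTD block device is opened;
- release: called when an MTD block device is closed;
- getgeo: returns the drive's geometry, which is filled into an hd_geometry structure;
- ioctl: called for special operations on the MTD block device;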
4.2.9 blktrans_open
static int blktrans_open(struct block_device *bdev, fmode_t mode)
{
struct mtd_blktrans_dev *dev = blktrans_dev_get(bdev->bd_disk);
int ret = 0;
if (!dev)
return -ERESTARTSYS; /* FIXME: busy loop! -arnd*/
mutex_lock(&mtd_table_mutex);
mutex_lock(&dev->lock);
if (dev->open)
goto unlock;
kref_get(&dev->ref);
__module_get(dev->tr->owner);
if (!dev->mtd)
goto unlock;
if (dev->tr->open) {
ret = dev->tr->open(dev); // calls the open callback of mtd_blktrans_ops
if (ret)
goto error_put;
}
ret = __get_mtd_device(dev->mtd);
if (ret)
goto error_release;
dev->file_mode = mode;
unlock:
dev->open++;
mutex_unlock(&dev->lock);
mutex_unlock(&mtd_table_mutex);
blktrans_dev_put(dev);
return ret;
error_release:
if (dev->tr->release)
dev->tr->release(dev);
error_put:
module_put(dev->tr->owner);
kref_put(&dev->ref, blktrans_dev_release);
mutex_unlock(&dev->lock);
mutex_unlock(&mtd_table_mutex);
blktrans_dev_put(dev);
return ret;
}
4.2.10 blktrans_ioctl
static int blktrans_ioctl(struct block_device *bdev, fmode_t mode,
unsigned int cmd, unsigned long arg)
{
struct mtd_blktrans_dev *dev = blktrans_dev_get(bdev->bd_disk);
int ret = -ENXIO;
if (!dev)
return ret;
mutex_lock(&dev->lock);
if (!dev->mtd)
goto unlock;
switch (cmd) {
case BLKFLSBUF:
ret = dev->tr->flush ? dev->tr->flush(dev) : 0;
break;
default:
ret = -ENOTTY;
}
unlock:
mutex_unlock(&dev->lock);
blktrans_dev_put(dev);
return ret;
}
4.2.11 mtd_mq_ops
Here we look at the blk-mq operation set of the MTD block driver, defined in drivers/mtd/mtd_blkdevs.c.
static const struct blk_mq_ops mtd_mq_ops = {
.queue_rq = mtd_queue_rq,
};
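As the analysis in the previous section showed, queue_rq is called when a request is dispatched to the block device driver. Its job is to move data between the disk and memory, for example writing data from memory to the disk or reading data from the disk into memory.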
static blk_status_t mtd_queue_rq(struct blk_mq_hw_ctx *hctx,
const struct blk_mq_queue_data *bd)
{
struct mtd_blktrans_dev *dev;
dev = hctx->queue->queuedata;
if (!dev) {
blk_mq_start_request(bd->rq);
return BLK_STS_IOERR;
}
spin_lock_irq(&dev->queue_lock);
list_add_tail(&bd->rq->queuelist, &dev->rq_list);
mtd_blktrans_work(dev); // not examined in detail here: reads end up in mtdblock_tr.readsect and writes in mtdblock_tr.writesect
spin_unlock_irq(&dev->queue_lock);
return BLK_STS_OK;
}
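4.3 MTD block device flow
The execution flow of register_mtd_blktrans is as follows.
In the MTD block device entry function:
- blktrans_notifier is added to the mtd_notifiers list;
- at this point the mtd_idr tree contains only its root node, so the first loop (the one in register_mtd_user) has nothing to iterate over and its body does not run;
- the block device major number is registered: major number 31, block device name mtdblock;
- the second loop (mtd_for_each_device in register_mtd_blktrans) is reached next and, for the same reason, is not entered either.
Later, in add_mtd_device(mtd):
- a node is allocated for the raw MTD device;
- the erasesize_shift, writesize_shift, erasesize_mask and writesize_mask fields of the raw MTD device are set;
- the class of the MTD device's struct device is set to mtd_class, together with its device number, type, name and driver_data, and device_register is called to register the MTD character device named mtd%d;
- device_create is called to create, initialize and register the MTD character device named mtd%dro;
- the mtd_notifiers list is walked; when blktrans_notifier is found, its add callback is invoked with mtd, which:
  - allocates a gendisk structure and sets its members:
    - private_data;
    - the major number major (MTD_BLOCK_MAJOR, i.e. 31);
    - the first minor number first_minor (it increases as more MTD devices are registered);
    - the disk name disk_name, set to mtdblock%d, which appears as a file under /dev;
    - the block device operation set fops;
  - initializes the request queue;
  - finally registers the gendisk.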
For example, after the development board boots and the Nand Flash driver is loaded, we can see the following:
[root@zy:/]# ls /sys/class/mtd/ -l
total 0
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd0 -> ../../devices/virtual/mtd/mtd0
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd0ro -> ../../devices/virtual/mtd/mtd0ro
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd1 -> ../../devices/virtual/mtd/mtd1
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd1ro -> ../../devices/virtual/mtd/mtd1ro
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd2 -> ../../devices/virtual/mtd/mtd2
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd2ro -> ../../devices/virtual/mtd/mtd2ro
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd3 -> ../../devices/virtual/mtd/mtd3
lrwxrwxrwx 1 0 0 0 Jan 1 01:19 mtd3ro -> ../../devices/virtual/mtd/mtd3ro
[root@zy:/]# ls -l /dev/mtd*
crw-rw---- 1 0 0 90, 0 Jan 1 00:00 /dev/mtd0
crw-rw---- 1 0 0 90, 1 Jan 1 00:00 /dev/mtd0ro
crw-rw---- 1 0 0 90, 2 Jan 1 00:00 /dev/mtd1
crw-rw---- 1 0 0 90, 3 Jan 1 00:00 /dev/mtd1ro
crw-rw---- 1 0 0 90, 4 Jan 1 00:00 /dev/mtd2
crw-rw---- 1 0 0 90, 5 Jan 1 00:00 /dev/mtd2ro
crw-rw---- 1 0 0 90, 6 Jan 1 00:00 /dev/mtd3
crw-rw---- 1 0 0 90, 7 Jan 1 00:00 /dev/mtd3ro
brw-rw---- 1 0 0 31, 0 Jan 1 00:00 /dev/mtdblock0
brw-rw---- 1 0 0 31, 1 Jan 1 00:00 /dev/mtdblock1
brw-rw---- 1 0 0 31, 2 Jan 1 00:00 /dev/mtdblock2
brw-rw---- 1 0 0 31, 3 Jan 1 00:00 /dev/mtdblock3
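The major and minor numbers in this listing follow directly from MTD_DEVT(index), MTD_DEVT(index) + 1 and first_minor = devnum << part_bits. The sketch below recomputes them in user space. It assumes Linux with glibc, where makedev(), major() and minor() come from <sys/sysmacros.h>; mtd_char_devt and mtd_block_devt are hypothetical helper names, and the two major numbers are redefined locally with the values quoted in the text.

```c
#include <sys/types.h>
#include <sys/sysmacros.h>   /* makedev(), major(), minor() on Linux/glibc */

#define MTD_CHAR_MAJOR  90   /* value quoted in the text */
#define MTD_BLOCK_MAJOR 31   /* value quoted in the text */

/* Device number of /dev/mtdN (read_only == 0) or /dev/mtdNro (read_only != 0). */
dev_t mtd_char_devt(int index, int read_only)
{
    return makedev(MTD_CHAR_MAJOR, index * 2 + (read_only ? 1 : 0));
}

/* Device number of /dev/mtdblockN; part_bits is 0 for mtdblock,
 * so the minor number equals the MTD index. */
dev_t mtd_block_devt(int index)
{
    return makedev(MTD_BLOCK_MAJOR, index);
}
```

For index 3 this gives 90,6 for mtd3, 90,7 for mtd3ro and 31,3 for mtdblock3, which matches the ls output.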
5. mtdchar.c
We introduced mtdchar.c earlier; this file implements the MTD character device interface. Let's go straight to drivers/mtd/mtdchar.c and walk through the source.
5.1 Module entry function
static const struct file_operations mtd_fops = { // character device operations
.owner = THIS_MODULE,
.llseek = mtdchar_lseek,
.read = mtdchar_read,
.write = mtdchar_write,
.unlocked_ioctl = mtdchar_unlocked_ioctl,
#ifdef CONFIG_COMPAT
.compat_ioctl = mtdchar_compat_ioctl,
#endif
.open = mtdchar_open,
.release = mtdchar_close,
.mmap = mtdchar_mmap,
#ifndef CONFIG_MMU
.get_unmapped_area = mtdchar_get_unmapped_area,
.mmap_capabilities = mtdchar_mmap_capabilities,
#endif
};
int __init init_mtdchar(void)
{
int ret;
ret = __register_chrdev(MTD_CHAR_MAJOR, 0, 1 << MINORBITS, // MTD character device major number 90; MINORBITS = 20
"mtd", &mtd_fops); // the device number range is named "mtd"; this is not the name of the /dev node
if (ret < 0) {
pr_err("Can't allocate major number %d for MTD\n",
MTD_CHAR_MAJOR);
return ret;
}
return ret;
}
5.2 __register_chrdev
Next is __register_chrdev, located in fs/char_dev.c:
/**
* __register_chrdev() - create and register a cdev occupying a range of minors
* @major: major device number or 0 for dynamic allocation
* @baseminor: first of the requested range of minor numbers
* @count: the number of minor numbers required
* @name: name of this range of devices
* @fops: file operations associated with this devices
*
* If @major == 0 this functions will dynamically allocate a major and return
* its number.
*
* If @major > 0 this function will attempt to reserve a device with the given
* major number and will return zero on success.
*
* Returns a -ve errno on failure.
*
* The name of this device has nothing to do with the name of the device in
* /dev. It only helps to keep track of the different owners of devices. If
* your module name has only one type of devices it's ok to use e.g. the name
* of the module here.
*/
int __register_chrdev(unsigned int major, unsigned int baseminor,
unsigned int count, const char *name,
const struct file_operations *fops)
{
struct char_device_struct *cd;
struct cdev *cdev;
int err = -ENOMEM;
cd = __register_chrdev_region(major, baseminor, count, name); // statically reserve a range of character device numbers
if (IS_ERR(cd))
return PTR_ERR(cd);
cdev = cdev_alloc(); // dynamically allocate a character device (cdev)
if (!cdev)
goto out2;
cdev->owner = fops->owner; // initialize the character device
cdev->ops = fops;
kobject_set_name(&cdev->kobj, "%s", name);
err = cdev_add(cdev, MKDEV(cd->major, baseminor), count); // register the character device with the system
if (err)
goto out;
cd->cdev = cdev;
return major ? 0 : cd->major;
out:
kobject_put(&cdev->kobj);
out2:
kfree(__unregister_chrdev_region(cd->major, baseminor, count));
return err;
}
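So the module entry function mainly does the following:
- reserves the character device numbers: major number 90, with 1<<20 minor numbers;
- dynamically allocates the character device;
- registers the character device;
It does not, however, create the class or the device entries under it. That part is done in add_mtd_device:
- device_register and device_create are called with mtd_class, which creates the mtd%d and mtd%dro entries under /sys/class/mtd; mdev scans these to create the nodes under /dev;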

