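The earlier articles "Virtual File System (VFS): Based on Linux 3.10" and "The UBIFS File System" analysed each layer of the file system stack on its own, but never followed a request all the way down to the flash. This article traces how a read issued by UBIFS is actually carried out on Linux.
NAND devices
Linux classifies raw NAND (as opposed to eMMC or USB sticks) as an MTD device. The code for this kind of device usually lives under drivers/mtd/nand.
NAND driver loading
Figure 1.1 NAND device initialization
On a system that uses NAND as its storage device, drivers/mtd/nand usually contains the driver each vendor wrote for its own NAND controller. For example, the HiSilicon 3516 platform has a hinfc610 folder in that directory, which holds the NAND driver for that chip. The device is registered in much the same way as the drivers for I2C and similar devices.
Here the code flow given in Figure 1.1 is used to analyse the initialization of the NAND driver shown there (the code is very clearly structured). The probe method is the callback that is normally invoked when a device is registered, so the analysis starts from that function.
ambarella_nand_get_resource first acquires the NAND-related resources. This differs little from earlier Linux versions, except that a few of the resources now come from parsing the device tree. The device tree was explained in detail in "Linux System Startup: Based on the Linux 3.10 Kernel", so it is skipped here.
ambarella_nand_init_chip(nand_info, pdev->dev.of_node) initializes information about the chip, including whether ECC is done in software or hardware, the number of ECC bits, the flash size, and whether the NAND is write-protected. The part of most interest is the set of NAND read/write methods: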
chip->chip_delay = 0;
chip->controller = &nand_info->controller;
chip->read_byte = amb_nand_read_byte;
chip->read_word = amb_nand_read_word;
chip->write_buf = amb_nand_write_buf;
chip->read_buf = amb_nand_read_buf;
chip->select_chip = amb_nand_select_chip;
chip->cmd_ctrl = amb_nand_cmd_ctrl;
chip->dev_ready = amb_nand_dev_ready;
chip->waitfunc = amb_nand_waitfunc;
chip->cmdfunc = amb_nand_cmdfunc;
At the end of the function the chip pointer is stored in nand_info->mtd.priv, which makes chip easy to find later.
nand_info->mtd.priv = chip;
It is worth looking at how a read is implemented here, even though this belongs to the MTD section that follows.
static u16 amb_nand_read_word(struct mtd_info *mtd)
{
struct ambarella_nand_info *nand_info;
u16 *data;
nand_info = (struct ambarella_nand_info *)mtd->priv;
data = (u16 *)(nand_info->dmabuf + nand_info->dma_bufpos);
nand_info->dma_bufpos += 2;
return *data;
}
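Reading one word is very simple: take the 16-bit value located at buf (the base address) plus pos (the offset), then advance the offset by two. In assembly this comes down to a single 16-bit load (ldrh on ARM, for example). Note that the cast of mtd->priv to struct ambarella_nand_info * only works if the nand_chip stored in mtd.priv sits at the start of ambarella_nand_info, which is presumably the case in this driver.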
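The cursor-style read described above can be modelled in plain user-space C. This is only a sketch: struct bounce_buf and bounce_read_word are hypothetical names standing in for nand_info->dmabuf, nand_info->dma_bufpos and amb_nand_read_word, and memcpy is used instead of a pointer cast so that the sketch stays valid even when the cursor is at an odd offset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's bounce buffer state. */
struct bounce_buf {
	uint8_t data[64];	/* plays the role of nand_info->dmabuf */
	size_t  pos;		/* plays the role of nand_info->dma_bufpos */
};

/* Return the 16-bit value at the cursor and advance the cursor by 2. */
static uint16_t bounce_read_word(struct bounce_buf *b)
{
	uint16_t v;

	assert(b->pos + sizeof(v) <= sizeof(b->data));
	/* memcpy avoids an unaligned 16-bit access when pos is odd. */
	memcpy(&v, b->data + b->pos, sizeof(v));
	b->pos += sizeof(v);
	return v;
}
```

Two consecutive calls return the first and second 16-bit words of the buffer (in host byte order) and leave the cursor at 4.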
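nand_scan_ident is the first phase of nand_scan(): it probes the chip by reading its ID, which checks that a usable NAND device is actually present, and sets up the basic MTD fields accordingly.
nand_scan_tail takes a struct mtd_info pointer as its argument. It is the second phase of nand_scan(): it fills in the function pointers that the driver left unset and then scans for the bad block table.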
chip->write_page = nand_write_page;
chip->onfi_set_features = nand_onfi_set_features;
chip->onfi_get_features = nand_onfi_get_features;
chip->ecc.read_page = nand_read_page_hwecc;
chip->ecc.write_page = nand_write_page_hwecc;
chip->ecc.read_page_raw = nand_read_page_raw;
chip->ecc.write_page_raw = nand_write_page_raw;
chip->ecc.read_oob = nand_read_oob_std;
chip->ecc.write_oob = nand_write_oob_std;
chip->ecc.read_subpage = nand_read_subpage;
chip->ecc.write_subpage = nand_write_subpage_hwecc;
mtd->type = MTD_NANDFLASH;
mtd->flags = (chip->options & NAND_ROM) ? MTD_CAP_ROM : MTD_CAP_NANDFLASH;
mtd->_erase = nand_erase;
mtd->_point = NULL;
mtd->_unpoint = NULL;
mtd->_read = nand_read;
mtd->_write = nand_write;
mtd->_panic_write = panic_nand_write;
mtd->_read_oob = nand_read_oob;
mtd->_write_oob = nand_write_oob;
mtd->_sync = nand_sync;
mtd->_lock = NULL;
mtd->_unlock = NULL;
mtd->_suspend = nand_suspend;
mtd->_resume = nand_resume;
mtd->_block_isbad = nand_block_isbad;
mtd->_block_markbad = nand_block_markbad;
mtd->writebufsize = mtd->writesize;
/* propagate ecc info to mtd_info */
mtd->ecclayout = chip->ecc.layout;
mtd->ecc_strength = chip->ecc.strength;
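Finally it executes return chip->scan_bbt(mtd); to build the bad block table and return the result. scan_bbt actually points to nand_default_bbt.
mtd_device_parse_register parses the device tree and creates the MTD partitions. The static device tree does not actually contain this partition information; U-Boot adds it by modifying the device tree in memory.
Figure 1.2 Example MTD partitions
NAND read/write flow
Next, look at the details behind mtd->_read = nand_read;.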
<drivers/mtd/nand/nand_base.c>
static int nand_read(struct mtd_info *mtd, loff_t from, size_t len,
		     size_t *retlen, uint8_t *buf)
{
	struct mtd_oob_ops ops;
	int ret;

	nand_get_device(mtd, FL_READING);
	ops.len = len;
	ops.datbuf = buf;
	ops.oobbuf = NULL;
	ops.mode = MTD_OPS_PLACE_OOB;
	ret = nand_do_read_ops(mtd, from, &ops);
	*retlen = ops.retlen;
	nand_release_device(mtd);
	return ret;
}
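To make the explanation clearer, first look at Figure 1.3, which shows where nand_read sits in a complete read. A single read issued by the file system may need several calls to ubi_io_read before it completes, which is what the circular arrow stands for. The lowest-level function, amb_nand_read_buf, is the flash read method that the Ambarella SDK provides for Ambarella chips. For other vendors the function is named differently.
Figure 1.3 Read flow on raw NAND
Once the kernel has been loaded, the system mostly reads and writes memory rather than flash. What follows analyses the NAND operations on the Ambarella platform. The overall framework of the read flow can be taken directly from Figure 1.3; the actual details of reading NAND are harder than the framework suggests, and they are also the most interesting part. amb_nand_cmdfunc is a NAND command dispatcher, loosely comparable to an HTTP 302 redirect. Because the file declares that it is released under the GNU license, the function is shown here in full: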
<drivers/mtd/nand/ambarella_nand.c>
static void amb_nand_cmdfunc(struct mtd_info *mtd, unsigned command,
int column, int page_addr)
{
struct ambarella_nand_info *nand_info;
nand_info = (struct ambarella_nand_info *)mtd->priv;
nand_info->err_code = 0;
switch(command) {
case NAND_CMD_RESET:
nand_amb_reset(nand_info);
break;
case NAND_CMD_READID:
nand_info->dma_bufpos = 0;
nand_amb_read_id(nand_info);
break;
case NAND_CMD_STATUS:
nand_info->dma_bufpos = 0;
nand_amb_read_status(nand_info);
break;
case NAND_CMD_ERASE1:
nand_amb_erase(nand_info, page_addr);
break;
case NAND_CMD_ERASE2:
break;
case NAND_CMD_READOOB:
nand_info->dma_bufpos = column;
if (nand_info->ecc_bits > 1) {
u8 area = nand_info->soft_ecc ? MAIN_ONLY : MAIN_ECC;
nand_info->dma_bufpos = mtd->writesize;
nand_amb_read_data(nand_info, page_addr,
nand_info->dmaaddr, area);
} else {
nand_amb_read_data(nand_info, page_addr,
nand_info->dmaaddr, SPARE_ONLY);
}
break;
case NAND_CMD_READ0:
{
u8 area = nand_info->soft_ecc ? MAIN_ONLY : MAIN_ECC;
nand_info->dma_bufpos = column;
nand_amb_read_data(nand_info, page_addr, nand_info->dmaaddr, area);
if (nand_info->ecc_bits == 1)
nand_amb_read_data(nand_info, page_addr,
nand_info->dmaaddr + mtd->writesize, SPARE_ONLY);
break;
}
case NAND_CMD_SEQIN:
nand_info->dma_bufpos = column;
nand_info->seqin_column = column;
nand_info->seqin_page_addr = page_addr;
break;
case NAND_CMD_PAGEPROG:
{
u32 mn_area, sp_area, offset;
mn_area = nand_info->soft_ecc ? MAIN_ONLY : MAIN_ECC;
sp_area = nand_amb_is_hw_bch(nand_info) ? SPARE_ECC : SPARE_ONLY;
offset = (nand_info->ecc_bits > 1) ? 0 : mtd->writesize;
if (nand_info->seqin_column < mtd->writesize) {
nand_amb_write_data(nand_info,
nand_info->seqin_page_addr,
nand_info->dmaaddr, mn_area);
if (nand_info->soft_ecc && nand_info->ecc_bits == 1) {
nand_amb_write_data(nand_info,
nand_info->seqin_page_addr,
nand_info->dmaaddr + mtd->writesize,
sp_area);
}
} else {
nand_amb_write_data(nand_info,
nand_info->seqin_page_addr,
nand_info->dmaaddr + offset,
sp_area);
}
break;
}
default:
dev_err(nand_info->dev, "%s: 0x%x, %d, %d\n",
__func__, command, column, page_addr);
BUG();
break;
}
}
A single switch statement dispatches the commands. Only the command that appears in Figure 1.3 matters here:
case NAND_CMD_READ0:
{
u8 area = nand_info->soft_ecc ? MAIN_ONLY : MAIN_ECC;
nand_info->dma_bufpos = column;
nand_amb_read_data(nand_info, page_addr, nand_info->dmaaddr, area);
if (nand_info->ecc_bits == 1)
nand_amb_read_data(nand_info, page_addr,
nand_info->dmaaddr + mtd->writesize, SPARE_ONLY);
break;
}
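The READ0 case only makes sense together with the read_buf hook that runs after it: the command pulls a whole page into the bounce buffer and positions the cursor, and the later read_buf call copies bytes out from that cursor. The following user-space model sketches that two-stage pattern. All names (toy_nand, toy_cmd_read0, toy_read_buf) and the tiny page size are hypothetical, a memcpy stands in for the DMA transfer, and the behaviour of amb_nand_read_buf is assumed by analogy with amb_nand_read_word rather than taken from its source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_BYTES 16u	/* toy page size; real pages are 2 KiB or more */
#define NUM_PAGES       4u

/* Hypothetical model of the state the driver keeps in nand_info. */
struct toy_nand {
	uint8_t flash[NUM_PAGES][PAGE_SIZE_BYTES];	/* the "chip" */
	uint8_t dmabuf[PAGE_SIZE_BYTES];		/* bounce buffer */
	size_t  bufpos;					/* read cursor */
};

/* Stage 1, the READ0 command: fetch the whole page, then set the cursor. */
static void toy_cmd_read0(struct toy_nand *n, unsigned column, unsigned page)
{
	assert(page < NUM_PAGES && column <= PAGE_SIZE_BYTES);
	memcpy(n->dmabuf, n->flash[page], PAGE_SIZE_BYTES); /* stands in for DMA */
	n->bufpos = column;
}

/* Stage 2, the read_buf hook: copy out of the bounce buffer at the cursor. */
static void toy_read_buf(struct toy_nand *n, uint8_t *buf, size_t len)
{
	assert(n->bufpos + len <= PAGE_SIZE_BYTES);
	memcpy(buf, n->dmabuf + n->bufpos, len);
	n->bufpos += len;
}
```

Calling toy_cmd_read0(&n, 3, 2) followed by toy_read_buf(&n, out, 4) yields bytes 3 to 6 of page 2 and leaves the cursor at 7.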
The core of this command is nand_amb_read_data. It is likewise under the GNU license, so it is reproduced here as well:
int nand_amb_read_data(struct ambarella_nand_info *nand_info,
u32 page_addr, dma_addr_t buf_dma, u8 area)
{
int errorCode = 0;
u32 addr_hi;
u32 addr;
u32 len;
u64 addr64;
u8 ecc = 0;
addr64 = (u64)(page_addr * nand_info->mtd.writesize);
addr_hi = (u32)(addr64 >> 32);
addr = (u32)addr64;
switch (area) {
case MAIN_ONLY:
ecc = EC_MDSD;
len = nand_info->mtd.writesize;
break;
case MAIN_ECC:
ecc = EC_MESD;
len = nand_info->mtd.writesize;
break;
case SPARE_ONLY:
ecc = EC_MDSD;
len = nand_info->mtd.oobsize;
break;
case SPARE_ECC:
ecc = EC_MDSE;
len = nand_info->mtd.oobsize;
break;
default:
dev_err(nand_info->dev, "%s: Wrong area.\n", __func__);
errorCode = -EINVAL;
goto nand_amb_read_page_exit;
break;
}
nand_info->slen = 0;
if (nand_info->ecc_bits > 1) {
/* when use BCH, the EG and EC should be 0 */
ecc = 0;
len = nand_info->mtd.writesize;
nand_info->slen = nand_info->mtd.oobsize;
nand_info->spare_buf_phys = buf_dma + len;
}
nand_info->cmd = NAND_AMB_CMD_READ;
nand_info->addr_hi = addr_hi;
nand_info->addr = addr;
nand_info->buf_phys = buf_dma;
nand_info->len = len;
nand_info->area = area;
nand_info->ecc = ecc;
errorCode = nand_amb_request(nand_info);
nand_amb_read_page_exit:
return errorCode;
}
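Most of this function just fills in members of nand_info, and only at the very end does it call nand_amb_request. That makes nand_amb_request the important part: by then everything needed for the flash read has been stored in nand_info. nand_amb_request is rather long, but fortunately it is also under the GNU license.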
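One detail of the address calculation in nand_amb_read_data deserves a closer look. Both page_addr and mtd.writesize are 32-bit values, so in (u64)(page_addr * writesize) the product is computed in 32 bits and only then widened; addr_hi therefore always ends up 0. For chips smaller than 4 GiB this makes no difference. The sketch below contrasts the two orders of operations; the function names are hypothetical, and it assumes unsigned int is 32 bits wide, as on the usual ARM targets.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the driver: multiply in 32 bits, then widen the result. */
static uint64_t page_to_addr_narrow(uint32_t page, uint32_t writesize)
{
	assert(writesize != 0);
	return (uint64_t)(page * writesize);
}

/* Widen first, then multiply: the product is computed in 64 bits. */
static uint64_t page_to_addr_wide(uint32_t page, uint32_t writesize)
{
	assert(writesize != 0);
	return (uint64_t)page * writesize;
}
```

With 2048-byte pages, page 0x200000 starts exactly at 4 GiB: the wide version returns 0x100000000, while the narrow version wraps around to 0.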
static int nand_amb_request(struct ambarella_nand_info *nand_info)
{
int errorCode = 0;
u32 cmd;
u32 nand_ctr_reg = 0;
u32 nand_cmd_reg = 0;
u32 fio_ctr_reg = 0;
long timeout;
cmd = nand_info->cmd;
switch (cmd) {
case NAND_AMB_CMD_READ:
nand_ctr_reg |= NAND_CTR_A(nand_info->addr_hi);
if (nand_amb_is_hw_bch(nand_info)) { /* ECC checked in hardware */
/* Setup FIO DMA Control Register */
nand_amb_enable_bch(nand_info);
/* in dual space mode,enable the SE bit */
nand_ctr_reg |= NAND_CTR_SE;
/* Clean Flash_IO_ecc_rpt_status Register */
amba_writel(nand_info->regbase + FIO_ECC_RPT_STA_OFFSET, 0x0);
} else if (nand_amb_is_sw_bch(nand_info)) { /* ECC checked in software */
/* Setup FIO DMA Control Register */
nand_amb_enable_dsm(nand_info);
/* in dual space mode,enable the SE bit */
nand_ctr_reg |= NAND_CTR_SE;
} else {
if (nand_info->area == MAIN_ECC)
nand_ctr_reg |= (NAND_CTR_SE);
else if (nand_info->area == SPARE_ONLY ||
nand_info->area == SPARE_ECC)
nand_ctr_reg |= (NAND_CTR_SE | NAND_CTR_SA);
fio_ctr_reg = amba_readl(nand_info->regbase + FIO_CTR_OFFSET);
if (nand_info->area == SPARE_ONLY ||
nand_info->area == SPARE_ECC ||
nand_info->area == MAIN_ECC)
fio_ctr_reg |= (FIO_CTR_RS);
switch (nand_info->ecc) {
case EC_MDSE:
nand_ctr_reg |= NAND_CTR_EC_SPARE;
fio_ctr_reg |= FIO_CTR_CO;
break;
case EC_MESD:
nand_ctr_reg |= NAND_CTR_EC_MAIN;
fio_ctr_reg |= FIO_CTR_CO;
break;
case EC_MESE:
nand_ctr_reg |= (NAND_CTR_EC_MAIN | NAND_CTR_EC_SPARE);
fio_ctr_reg |= FIO_CTR_CO;
break;
case EC_MDSD:
default:
break;
}
amba_writel(nand_info->regbase + FIO_CTR_OFFSET,
fio_ctr_reg);
}
amba_writel(nand_info->regbase + FLASH_CTR_OFFSET, nand_ctr_reg);
nand_amb_setup_dma_devmem(nand_info);
break;
default:
dev_warn(nand_info->dev,
"%s: wrong command %d!\n", __func__, cmd);
errorCode = -EINVAL;
goto nand_amb_request_done;
break;
}
if (cmd == NAND_AMB_CMD_READ || cmd == NAND_AMB_CMD_PROGRAM) {
timeout = wait_event_timeout(nand_info->wq,
atomic_read(&nand_info->irq_flag) == 0x0, 1 * HZ);
if (timeout <= 0) {
errorCode = -EBUSY;
dev_err(nand_info->dev, "%s: cmd=0x%x timeout 0x%08x\n",
__func__, cmd, atomic_read(&nand_info->irq_flag));
} else {
dev_dbg(nand_info->dev, "%ld jiffies left.\n", timeout);
}
if (nand_info->dma_status & (DMA_CHANX_STA_OE | DMA_CHANX_STA_ME |
DMA_CHANX_STA_BE | DMA_CHANX_STA_RWE |
DMA_CHANX_STA_AE)) {
dev_err(nand_info->dev,
"%s: Errors happend in DMA transaction %d!\n",
__func__, nand_info->dma_status);
errorCode = -EIO;
goto nand_amb_request_done;
}
if (nand_amb_is_hw_bch(nand_info)) {
if (cmd == NAND_AMB_CMD_READ) {
if (nand_info->fio_ecc_sta & FIO_ECC_RPT_FAIL) {
int ret = 0;
/* Workaround for page never used, BCH will be failed */
if (nand_info->area == MAIN_ECC || nand_info->area == SPARE_ECC)
ret = nand_bch_spare_cmp(nand_info);
if (ret < 0) {
nand_info->mtd.ecc_stats.failed++;
dev_err(nand_info->dev,
"BCH corrected failed (0x%08x), addr is 0x[%x]!\n",
nand_info->fio_ecc_sta, nand_info->addr);
}
} else if (nand_info->fio_ecc_sta & FIO_ECC_RPT_ERR) {
nand_info->mtd.ecc_stats.corrected++;
/* once bitflip and data corrected happened, BCH will keep on
* to report bitflip in following read operations, even though
* there is no bitflip happened really. So this is a workaround
* to get it back. */
nand_amb_corrected_recovery(nand_info);
}
} else if (cmd == NAND_AMB_CMD_PROGRAM) {
if (nand_info->fio_ecc_sta & FIO_ECC_RPT_FAIL) {
dev_err(nand_info->dev,
"BCH program program failed (0x%08x)!\n",
nand_info->fio_ecc_sta);
}
}
}
if ((nand_info->fio_dma_sta & FIO_DMASTA_RE)
|| (nand_info->fio_dma_sta & FIO_DMASTA_AE)
|| !(nand_info->fio_dma_sta & FIO_DMASTA_DN)) {
u32 block_addr;
block_addr = nand_info->addr /
nand_info->mtd.erasesize *
nand_info->mtd.erasesize;
dev_err(nand_info->dev,
"%s: dma_status=0x%08x, cmd=0x%x, addr_hi=0x%x, "
"addr=0x%x, dst=0x%x, buf=0x%x, "
"len=0x%x, area=0x%x, ecc=0x%x, "
"block addr=0x%x!\n",
__func__,
nand_info->fio_dma_sta,
cmd,
nand_info->addr_hi,
nand_info->addr,
nand_info->dst,
nand_info->buf_phys,
nand_info->len,
nand_info->area,
nand_info->ecc,
block_addr);
errorCode = -EIO;
goto nand_amb_request_done;
}
}
nand_amb_request_done:
atomic_set(&nand_info->irq_flag, 0x7);
nand_info->dma_status = 0;
/* Avoid to flush previous error info */
if (nand_info->err_code == 0)
nand_info->err_code = errorCode;
if ((nand_info->nand_wp) &&
(cmd == NAND_AMB_CMD_ERASE || cmd == NAND_AMB_CMD_COPYBACK ||
cmd == NAND_AMB_CMD_PROGRAM || cmd == NAND_AMB_CMD_READSTATUS)) {
nand_ctr_reg |= NAND_CTR_WP;
amba_writel(nand_info->regbase + FLASH_CTR_OFFSET, nand_ctr_reg);
}
if ((cmd == NAND_AMB_CMD_READ || cmd == NAND_AMB_CMD_PROGRAM)
&& nand_amb_is_hw_bch(nand_info))
nand_amb_disable_bch(nand_info);
fio_unlock(SELECT_FIO_FL);
nand_amb_request_exit:
return errorCode;
}
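What remains is reading and writing the NAND controller registers. The point to note is how the read is done: each read uses DMA to fetch a whole page at a time rather than reading byte by byte.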
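Because the hardware always transfers whole pages, a byte range requested by the upper layers first has to be cut into per-page pieces: a page index, a column inside that page, and a byte count. The sketch below shows that arithmetic in isolation. It follows the general idea used by the generic NAND read path, but struct page_chunk and split_into_pages are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One page-level transfer produced by splitting a byte range. */
struct page_chunk {
	uint32_t page;	/* page index on the device */
	uint32_t col;	/* byte offset inside that page */
	uint32_t bytes;	/* number of bytes wanted from that page */
};

/*
 * Split [from, from + len) into per-page chunks. writesize must be a
 * power of two. Returns the number of chunks written to out (at most max).
 */
static size_t split_into_pages(uint64_t from, size_t len, uint32_t writesize,
			       struct page_chunk *out, size_t max)
{
	size_t n = 0;

	assert(writesize != 0 && (writesize & (writesize - 1)) == 0);
	while (len > 0 && n < max) {
		uint32_t col = (uint32_t)(from & (writesize - 1));
		uint32_t room = writesize - col;	/* bytes left in this page */
		uint32_t bytes = len < room ? (uint32_t)len : room;

		out[n].page = (uint32_t)(from / writesize);
		out[n].col = col;
		out[n].bytes = bytes;
		n++;
		from += bytes;
		len -= bytes;
	}
	return n;
}
```

For example, reading 3000 bytes from offset 3000 with 2048-byte pages gives two chunks: 1096 bytes from column 952 of page 1, then 1904 bytes from the start of page 2. Each chunk still costs one full-page DMA transfer.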
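The MTD subsystem
MTD (Memory Technology Device) is the Linux subsystem used to access memory devices such as ROM and flash. Linux exposes it as a type of device file for accessing flash, and the layer provides an abstraction between the chip-specific flash drivers below it and the upper layers (such as file systems) above it.
The code of the MTD layer is under drivers/mtd. The Makefile in that directory reads as follows:
Figure 2.1 The MTD Makefile
If MTD is selected in make menuconfig (as M or Y), it looks like the following figure:
Figure 2.2 MTD build configuration options
The information shown there comes from the various Kconfig files.
The line obj-$(CONFIG_MTD) += mtd.o means that the files in this directory are built into the mtd.o object, which is used later when the kernel image is linked.
The mtd-y line lists the objects that are always compiled into mtd.o once MTD is enabled. mtdcore.o relies on make's implicit rule and is generated from mtdcore.c, and the same goes for the others. This line represents the MTD abstraction layer (which some further divide into an MTD raw layer and an MTD device layer).
The obj-y line names directories; the build descends into each of them to look for targets and compile them. The contents of this line can be regarded as the chip-level drivers. Drivers that follow CFI (Common Flash Interface, a standard NOR flash interface initiated by Intel) are in the chips directory, and nand is the directory holding the NAND flash drivers. The maps subdirectory holds the mapping drivers for specific flash layouts.
Finally, since UBIFS is involved, the UBI entry on the last line is also worth a look.
The read flow chart in "The UBIFS File System" shows mtd_read as one step of the read path. The hooks behind it were already met in the previous section, mtd->_read = nand_read; mtd->_write = nand_write;, but were not expanded there.
Each of the partitions shown in Figure 1.2 corresponds to one struct mtd_info.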
struct mtd_info {
/*
 * Possible values of type:
 *   #define MTD_ABSENT        0
 *   #define MTD_RAM           1
 *   #define MTD_ROM           2
 *   #define MTD_NORFLASH      3
 *   #define MTD_NANDFLASH     4
 *   #define MTD_DATAFLASH     6
 *   #define MTD_UBIVOLUME     7
 *   #define MTD_MLCNANDFLASH  8
 */
u_char type;
/*
 * Possible flags:
 *   #define MTD_WRITEABLE      0x400   (device is writeable)
 *   #define MTD_BIT_WRITEABLE  0x800   (single bits can be flipped)
 *   #define MTD_NO_ERASE       0x1000  (no erase necessary)
 *   #define MTD_POWERUP_LOCK   0x2000  (always locked after reset)
 *   #define MTD_CAP_ROM        0
 *   #define MTD_CAP_RAM        (MTD_WRITEABLE | MTD_BIT_WRITEABLE | MTD_NO_ERASE)
 *   #define MTD_CAP_NORFLASH   (MTD_WRITEABLE | MTD_BIT_WRITEABLE)
 *   #define MTD_CAP_NANDFLASH  (MTD_WRITEABLE)
 */
uint32_t flags;
uint64_t size; // Total size of the MTD
/* "Major" erase size for the device. Naïve users may take this
* to be the only erase size available, or may use the more detailed
* information below if they desire
*/
uint32_t erasesize;
/* Minimal writable flash unit size. In case of NOR flash it is 1 (even
* though individual bits can be cleared), in case of NAND flash it is
* one NAND page (or half, or one-fourths of it), in case of ECC-ed NOR
* it is of ECC block size, etc. It is illegal to have writesize = 0.
* Any driver registering a struct mtd_info must ensure a writesize of
* 1 or larger.
*/
uint32_t writesize;
/*
* Size of the write buffer used by the MTD. MTD devices having a write
* buffer can write multiple writesize chunks at a time. E.g. while
* writing 4 * writesize bytes to a device with 2 * writesize bytes
* buffer the MTD driver can (but doesn't have to) do 2 writesize
* operations, but not 4. Currently, all NANDs have writebufsize
* equivalent to writesize (NAND page size). Some NOR flashes do have
* writebufsize greater than writesize.
*/
uint32_t writebufsize;
uint32_t oobsize; // Amount of OOB data per block (e.g. 16)
uint32_t oobavail; // Available OOB bytes per block
/*
* If erasesize is a power of 2 then the shift is stored in
* erasesize_shift otherwise erasesize_shift is zero. Ditto writesize.
*/
unsigned int erasesize_shift;
unsigned int writesize_shift;
/* Masks based on erasesize_shift and writesize_shift */
unsigned int erasesize_mask;
unsigned int writesize_mask;
int (*_erase) (struct mtd_info *mtd, struct erase_info *instr);
int (*_point) (struct mtd_info *mtd, loff_t from, size_t len,
size_t *retlen, void **virt, resource_size_t *phys);
int (*_unpoint) (struct mtd_info *mtd, loff_t from, size_t len);
unsigned long (*_get_unmapped_area) (struct mtd_info *mtd,
unsigned long len,
unsigned long offset,
unsigned long flags);
int (*_read) (struct mtd_info *mtd, loff_t from, size_t len,
size_t *retlen, u_char *buf);
int (*_write) (struct mtd_info *mtd, loff_t to, size_t len,
size_t *retlen, const u_char *buf);
int (*_panic_write) (struct mtd_info *mtd, loff_t to, size_t len,
size_t *retlen, const u_char *buf);
int (*_read_oob) (struct mtd_info *mtd, loff_t from,
struct mtd_oob_ops *ops);
int (*_write_oob) (struct mtd_info *mtd, loff_t to,
struct mtd_oob_ops *ops);
int (*_get_fact_prot_info) (struct mtd_info *mtd, struct otp_info *buf,
size_t len);
};
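The erasesize_shift, writesize_shift and the matching mask fields exist so that dividing by or taking the remainder of the erase size or page size can be done with a shift and an AND when the size is a power of two. The sketch below shows how such a shift and mask can be derived and used. The helper names are hypothetical and this is not the kernel's own code, which uses its own bit helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Return log2(size) if size is a power of two, otherwise 0. */
static unsigned int size_to_shift(uint32_t size)
{
	unsigned int shift = 0;

	if (size == 0 || (size & (size - 1)) != 0)
		return 0;	/* not a power of two: no shift is stored */
	while ((size >> shift) != 1)
		shift++;
	return shift;
}

/* Mask covering the low `shift` bits, i.e. the offset inside one unit. */
static unsigned int shift_to_mask(unsigned int shift)
{
	assert(shift < 32);
	return (1u << shift) - 1;
}
```

For a 128 KiB erase block the shift is 17 and the mask is 0x1FFFF, so offset 0x2345F lies in block 0x2345F >> 17 = 1 at byte 0x2345F & 0x1FFFF = 0x345F inside that block.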
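Now return to the mtd_device_parse_register function from Figure 1.1 in the previous section. It is called here as follows:
ppdata.of_node = pdev->dev.of_node; /* the device tree node carries the MTD partition information */
mtd_device_parse_register(mtd, NULL, &ppdata, NULL, 0);
The function first calls parse_mtd_partitions to parse the device tree and obtain the MTD partition information.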
int mtd_device_parse_register(struct mtd_info *mtd, const char * const *types,
struct mtd_part_parser_data *parser_data,
const struct mtd_partition *parts,
int nr_parts)
{
int err;
struct mtd_partition *real_parts;
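/* the parsed partition information is stored in real_parts;
 * in the call above, types is NULL and parser_data is &ppdata */
err = parse_mtd_partitions(mtd, types, &real_parts, parser_data);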
if (err > 0) {
err = add_mtd_partitions(mtd, real_parts, err);
kfree(real_parts);
} else if (err == 0) {
err = add_mtd_device(mtd);
if (err == 1)
err = -ENODEV;
}
return err;
}
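The main job of parse_mtd_partitions is to find the parser function. The parsers are chained together on part_parsers and matched by name; when no names are given explicitly, the defaults are cmdlinepart and ofpart. Parsers are registered with register_mtd_parser. As the two default names suggest, cmdlinepart parses the MTD partition information from the kernel command line, while ofpart parses it from the device tree. The device tree path is the one taken here, so output similar to the following may appear: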
16 ofpart partitions found on MTD device amba_nand
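The name-based lookup can be illustrated with a small registry. This is a simplified sketch, not the kernel implementation: toy_parser, toy_register_parser, toy_parse_partitions and the two example hooks are hypothetical, the parse hook takes no arguments, and the hard-coded return values (0 for the command line, 16 for the device tree) merely mimic the situation described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified parser descriptor: a name plus a parse hook. */
struct toy_parser {
	const char *name;
	int (*parse)(void);		/* returns number of partitions found */
	struct toy_parser *next;
};

static struct toy_parser *toy_parsers;	/* head of the registered list */

static void toy_register_parser(struct toy_parser *p)
{
	p->next = toy_parsers;
	toy_parsers = p;
}

static struct toy_parser *toy_find_parser(const char *name)
{
	struct toy_parser *p;

	for (p = toy_parsers; p; p = p->next)
		if (strcmp(p->name, name) == 0)
			return p;
	return NULL;
}

/* Try each requested name in order; the first parser that finds partitions wins. */
static int toy_parse_partitions(const char *const *types, const char **used)
{
	static const char *const default_types[] = { "cmdlinepart", "ofpart", NULL };

	assert(used != NULL);
	if (!types)
		types = default_types;	/* no explicit names: use the defaults */
	for (; *types; types++) {
		struct toy_parser *p = toy_find_parser(*types);
		int n = p ? p->parse() : 0;

		if (n > 0) {
			*used = p->name;
			return n;
		}
	}
	return 0;
}

/* Example hooks: nothing on the command line, 16 partitions in the device tree. */
static int toy_cmdline_parse(void) { return 0; }
static int toy_of_parse(void) { return 16; }
```

With both parsers registered and types left NULL, cmdlinepart is tried first and finds nothing, then ofpart reports 16 partitions, which corresponds to the log line above.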
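A return value greater than 0 means that MTD partitions were found and need to be created, in which case add_mtd_partitions is called to add the partitions to the MTD device. The following is what line 633 printed on my embedded platform: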
Creating 16 MTD partitions on "amba_nand":
<drivers/mtd/mtdpart.c>
625 int add_mtd_partitions(struct mtd_info *master,
626 const struct mtd_partition *parts,
627 int nbparts)
628 {
629 struct mtd_part *slave;
630 uint64_t cur_offset = 0;
631 int i;
632
633 printk(KERN_NOTICE "Creating %d MTD partitions on \"%s\":\n", nbparts, master->name);
634
635 for (i = 0; i < nbparts; i++) {
636 slave = allocate_partition(master, parts + i, i, cur_offset);
637 if (IS_ERR(slave))
638 return PTR_ERR(slave);
639
640 mutex_lock(&mtd_partitions_mutex);
641 list_add(&slave->list, &mtd_partitions);
642 mutex_unlock(&mtd_partitions_mutex);
643
644 add_mtd_device(&slave->mtd);
645
646 cur_offset = slave->offset + slave->mtd.size;
647 }
648
649 return 0;
650 }
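The core work of this function is done by allocate_partition and add_mtd_device. Function names in Linux are very apt, and here the names alone tell the story: first an MTD partition is created from the partition information parsed out of the device tree, and then the newly created partition is registered.
Because the different MTD partitions all live on the same flash, a new partition created by allocate_partition can inherit some of its parameters from the master argument, which is why code like the following appears: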
slave->mtd.type = master->type;
slave->mtd.flags = master->flags & ~part->mask_flags;
slave->mtd.size = part->size;
slave->mtd.writesize = master->writesize;
slave->mtd.writebufsize = master->writebufsize;
slave->mtd.oobsize = master->oobsize;
slave->mtd.oobavail = master->oobavail;
slave->mtd.subpage_sft = master->subpage_sft;
slave->mtd.name = name;
slave->mtd.owner = master->owner;
slave->mtd.backing_dev_info = master->backing_dev_info;
/* NOTE: we don't arrange MTDs as a tree; it'd be error-prone
* to have the same data be in two different partitions.
*/
slave->mtd.dev.parent = master->dev.parent;
if (master->_get_unmapped_area)
slave->mtd._get_unmapped_area = part_get_unmapped_area;
if (master->_read_oob)
slave->mtd._read_oob = part_read_oob;
if (master->_write_oob)
…
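Besides the explicit inheritance of the parent's parameters shown above, there is also implicit inheritance, as in the following code: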
slave->mtd._read = part_read;
slave->mtd._write = part_write;
slave->mtd._erase = part_erase;
These methods are really wrappers around the parent's methods, as the key line of the read in part_read shows:
part->master->_read(part->master, from + part->offset, len, retlen, buf);
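The wrapping idea can be sketched outside the kernel: a partition has no storage of its own, it only remembers its master and its offset, and its read method forwards to the master with the offset added. The types and functions below (toy_mtd, toy_master_read, toy_part_read) are hypothetical and much simpler than the real struct mtd_info and struct mtd_part.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct toy_mtd;
typedef int (*toy_read_fn)(struct toy_mtd *m, uint64_t from, size_t len,
			   uint8_t *buf);

struct toy_mtd {
	uint64_t size;
	toy_read_fn read;
	uint8_t *storage;	/* only used by the master */
	struct toy_mtd *master;	/* only used by a partition */
	uint64_t offset;	/* partition start inside the master */
};

/* The master reads straight from its backing storage. */
static int toy_master_read(struct toy_mtd *m, uint64_t from, size_t len,
			   uint8_t *buf)
{
	if (from + len > m->size)
		return -1;
	memcpy(buf, m->storage + from, len);
	return 0;
}

/* The partition checks its own bounds, then forwards with its offset added. */
static int toy_part_read(struct toy_mtd *p, uint64_t from, size_t len,
			 uint8_t *buf)
{
	assert(p->master && p->master->read);
	if (from + len > p->size)
		return -1;
	return p->master->read(p->master, from + p->offset, len, buf);
}
```

A partition of size 8 at offset 16 that is asked for 4 bytes at its own offset 2 ends up reading bytes 18 to 21 of the master, while a request that runs past the end of the partition is rejected before it reaches the master.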
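Next come some sanity checks, including whether the new partition's offset lies beyond the flash capacity and whether the offset plus the partition size exceeds the flash capacity. Finally, the flags of the new partition's mtd are adjusted according to the flags that were passed in.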
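The two capacity checks can be sketched as follows. The policy shown here, disabling a partition that starts past the end of the device and truncating one that overruns it, is how the checks are assumed to resolve; toy_part and toy_clamp_partition are hypothetical names and the return codes are an invention of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_part {
	uint64_t offset;	/* start of the partition inside the master */
	uint64_t size;		/* requested size in bytes */
};

/*
 * Clamp a requested partition against the master size.
 * Returns 0 if unchanged, 1 if truncated, -1 if disabled.
 */
static int toy_clamp_partition(struct toy_part *p, uint64_t master_size)
{
	assert(p != NULL);
	if (p->offset >= master_size) {
		/* starts past the end of the flash: keep it, but empty */
		p->offset = 0;
		p->size = 0;
		return -1;
	}
	if (p->size > master_size - p->offset) {
		/* runs past the end of the flash: cut it short */
		p->size = master_size - p->offset;
		return 1;
	}
	return 0;
}
```

On a 128 MiB device, a 2 MiB partition requested at offset 0x7F00000 is cut down to the remaining 1 MiB, and a partition requested at offset 0x9000000 is disabled.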
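Line 641 links the newly created partition onto the mtd_partitions list; every MTD partition is chained onto this list.
Line 644 adds a device of the MTD partition type. Its argument is the new partition created by allocate_partition. add_mtd_device first sets the bdi (backing dev info) according to the MTD type (RAM, ROM, ...); this information describes what kinds of access the device permits.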
if (!mtd->backing_dev_info) {
switch (mtd->type) {
case MTD_RAM:
mtd->backing_dev_info = &mtd_bdi_rw_mappable;
break;
case MTD_ROM:
mtd->backing_dev_info = &mtd_bdi_ro_mappable;
break;
default:
mtd->backing_dev_info = &mtd_bdi_unmappable;
break;
}
}
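Next an ID has to be allocated from an idr (integer ID management, a set of small integer IDs). This number is used to index the MTD partition, so it is stored in the index field of mtd.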
mtd->index = i;
Then the device information is set, including the device type, the class the device belongs to, the device number and the device name:
mtd->dev.type = &mtd_devtype;
mtd->dev.class = &mtd_class;
mtd->dev.devt = MTD_DEVT(i);
dev_set_name(&mtd->dev, "mtd%d", i);
dev_set_drvdata(&mtd->dev, mtd);
After that the device still has to be registered, which follows the standard interface:
if (device_register(&mtd->dev) != 0)
goto fail_added;
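Once it is registered, a second device is created in the same class, the read-only node mtd%dro, whose device number is one higher: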
device_create(&mtd_class, mtd->dev.parent,
MTD_DEVT(i) + 1,
NULL, "mtd%dro", i);
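The numbering scheme behind MTD_DEVT(i) and MTD_DEVT(i) + 1 can be made concrete. The sketch below assumes the usual kernel conventions, namely that MTD_DEVT(index) is MKDEV(MTD_CHAR_MAJOR, index * 2), that the MTD character major is 90, and that a dev_t keeps the minor number in its low 20 bits. The toy_ prefixed names are local to this example.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MINORBITS 20u	/* assumed width of the minor field in a dev_t */
#define TOY_MTD_MAJOR 90u	/* assumed MTD character device major */

static uint32_t toy_mkdev(uint32_t major, uint32_t minor)
{
	assert(minor < (1u << TOY_MINORBITS));
	return (major << TOY_MINORBITS) | minor;
}

/* Device number of mtdN: even minors are the read/write nodes. */
static uint32_t toy_mtd_devt(unsigned int index)
{
	return toy_mkdev(TOY_MTD_MAJOR, index * 2);
}

static uint32_t toy_major(uint32_t dev) { return dev >> TOY_MINORBITS; }
static uint32_t toy_minor(uint32_t dev) { return dev & ((1u << TOY_MINORBITS) - 1); }
```

Under these assumptions mtd3 gets minor 6 and mtd3ro gets minor 7, both with major 90, which is why every partition occupies two consecutive minors.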
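Finally every registered MTD user is notified that the new device has been added. Notification lists of this kind are a standard pattern in the kernel, and the networking code uses the same approach in many places that need asynchronous notification.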
list_for_each_entry(not, &mtd_notifiers, list)
not->add(mtd);
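The loop over mtd_notifiers is a plain observer list: each interested party registers a structure holding an add callback, and every callback is invoked when a device appears. A minimal user-space sketch of the pattern follows. The names toy_dev, toy_notifier, toy_register_user and toy_announce are hypothetical, and the example listener merely counts what it sees.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dev {
	int index;
};

/* One registered listener: a callback plus the list link. */
struct toy_notifier {
	void (*add)(struct toy_dev *dev);
	struct toy_notifier *next;
};

static struct toy_notifier *toy_notifiers;	/* head of the listener list */

static void toy_register_user(struct toy_notifier *n)
{
	assert(n != NULL && n->add != NULL);
	n->next = toy_notifiers;
	toy_notifiers = n;
}

/* Tell every listener that a new device has been added. */
static void toy_announce(struct toy_dev *dev)
{
	struct toy_notifier *n;

	for (n = toy_notifiers; n; n = n->next)
		n->add(dev);
}

/* Example listener: remember how many devices were seen and the last index. */
static int toy_seen_count;
static int toy_last_index = -1;

static void toy_count_add(struct toy_dev *dev)
{
	toy_seen_count++;
	toy_last_index = dev->index;
}
```

After registering a notifier whose add hook is toy_count_add, announcing a device with index 5 runs the callback once and records that index.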