Linux: ubi rootfs 故障案例 (1)

1. 前言

限于作者能力水平,本文可能存在谬误,因此而给读者带来的损失,作者不做任何承诺。

2. ubi rootfs 故障现场

问题故障内核日志如下:

Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 4.19.94-g1194fe2-dirty (bill@bill-virtual-machine) (gcc version 5.3.1 20160113 (Linaro GCC 5.3-2016.02)) #21 PREEMPT Tue Jun 4 10:18:44 CST 2024
[    0.000000] CPU: ARMv7 Processor [413fc082] revision 2 (ARMv7), cr=10c5387d
......
[    0.000000] Kernel command line: console=ttyO0,115200n8 root=ubi0:rootfs rw ubi.mtd=NAND.rootfs,2048 rootfstype=ubifs rootwait=1
......
[    1.713970] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xda
[    1.720358] nand: Micron MT29F2G08AAD
[    1.724091] nand: 256 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64
[    1.731736] nand: using OMAP_ECC_BCH8_CODE_HW ECC scheme
[    1.737188] 11 fixed-partitions partitions found on MTD device omap2-nand.0
[    1.744196] Creating 11 MTD partitions on "omap2-nand.0":
[    1.749624] 0x000000000000-0x000000020000 : "NAND.SPL"
[    1.755917] 0x000000020000-0x000000040000 : "NAND.SPL.backup1"
[    1.762654] 0x000000040000-0x000000060000 : "NAND.SPL.backup2"
[    1.769446] 0x000000060000-0x000000080000 : "NAND.SPL.backup3"
[    1.776214] 0x000000080000-0x0000000c0000 : "NAND.u-boot-spl-os"
[    1.783272] 0x0000000c0000-0x0000001c0000 : "NAND.u-boot"
[    1.790358] 0x0000001c0000-0x0000001e0000 : "NAND.u-boot-env"
[    1.797050] 0x0000001e0000-0x000000200000 : "NAND.u-boot-env.backup1"
[    1.804438] 0x000000200000-0x000000a00000 : "NAND.kernel"
[    1.818114] 0x000000a00000-0x00000e000000 : "NAND.rootfs"
[    2.024110] 0x00000e000000-0x000010000000 : "NAND.userdata"
......
[    2.162435] ubi0: attaching mtd9
[    2.166572] ubi0 error: validate_ec_hdr: bad VID header offset 512, expected 2048
[    2.174146] ubi0 error: validate_ec_hdr: bad EC header
[    2.179304] Erase counter header dump:
[    2.183118]  magic          0x55424923
[    2.186881]  version        1
[    2.189856]  ec             0
[    2.192829]  vid_hdr_offset 512
[    2.195994]  data_offset    2048
[    2.199232]  image_seq      2007489760
[    2.203004]  hdr_crc        0xbe9cfce9
[    2.206763] erase counter header hexdump:
[    2.210810] CPU: 0 PID: 1 Comm: swapper Not tainted 4.19.94-g1194fe2-dirty #21
[    2.218072] Hardware name: Generic AM33XX (Flattened Device Tree)
[    2.224199] Backtrace: 
[    2.226668] [<c010bfe4>] (dump_backtrace) from [<c010c2b4>] (show_stack+0x18/0x1c)
[    2.234283]  r7:00000000 r6:00000000 r5:cf04c000 r4:cf675c00
[    2.239970] [<c010c29c>] (show_stack) from [<c09531b4>] (dump_stack+0x24/0x28)
[    2.247237] [<c0953190>] (dump_stack) from [<c064da18>] (validate_ec_hdr+0xa0/0xe4)
[    2.254942] [<c064d978>] (validate_ec_hdr) from [<c064e600>] (ubi_io_read_ec_hdr+0x1b4/0x204)
[    2.263546]  r7:cf04c000 r6:55424923 r5:cf675c00 r4:00000000
[    2.269233] [<c064e44c>] (ubi_io_read_ec_hdr) from [<c0653a40>] (ubi_attach+0x1b8/0x1464)
[    2.277464]  r10:cf76f000 r9:00000000 r8:00000000 r7:cf675c00 r6:cf04c000 r5:cf734a00
[    2.285336]  r4:cf736240
[    2.287884] [<c0653888>] (ubi_attach) from [<c0647f78>] (ubi_attach_mtd_dev+0x42c/0xbc4)
[    2.296023]  r10:00020000 r9:cf721400 r8:c0e03048 r7:00000000 r6:cf721400 r5:cf04c000
[    2.303893]  r4:0000103f
[    2.306445] [<c0647b4c>] (ubi_attach_mtd_dev) from [<c0d26378>] (ubi_init+0x184/0x22c)
[    2.314410]  r10:c0e370cc r9:c0c2a8e8 r8:c0c2a8bc r7:c0e85e64 r6:cf721400 r5:c0e85e68
[    2.322270]  r4:00000000
[    2.324830] [<c0d261f4>] (ubi_init) from [<c01026ac>] (do_one_initcall+0x5c/0x1a4)
[    2.332435]  r10:00000008 r9:c0e03048 r8:00000000 r7:c0d261f4 r6:ffffe000 r5:c0e4f140
[    2.340306]  r4:c0e4f140
[    2.342859] [<c0102650>] (do_one_initcall) from [<c0d00f34>] (kernel_init_freeable+0x13c/0x1d4)
[    2.351610]  r9:c0d00620 r8:000000f8 r7:c0d3e834 r6:c0d51d64 r5:c0e4f140 r4:c0e4f140
[    2.359404] [<c0d00df8>] (kernel_init_freeable) from [<c09689d8>] (kernel_init+0x10/0x118)
[    2.367716]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c09689c8
[    2.375587]  r4:00000000
[    2.378132] [<c09689c8>] (kernel_init) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[    2.385743] Exception stack(0xcf051fb0 to 0xcf051ff8)
[    2.390816] 1fa0:                                     00000000 00000000 00000000 00000000
[    2.399040] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    2.407263] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    2.413914]  r5:c09689c8 r4:00000000
[    2.417506] ubi0 error: ubi_io_read_ec_hdr: validation failed for PEB 0
[    2.424265] ubi0 error: ubi_attach_mtd_dev: failed to attach mtd9, error -22
[    2.431373] UBI error: cannot attach mtd9
[    2.436450] input: volume_keys@0 as /devices/platform/volume_keys@0/input/input0
[    2.444593] omap_rtc 44e3e000.rtc: setting system clock to 2000-01-01 00:00:00 UTC (946684800)
[    2.453894] ALSA device list:
[    2.456888]   #0: crt_audio_bus
[    2.460684] VFS: Cannot open root device "ubi0:rootfs" or unknown-block(0,0): error -19
[    2.468832] Please append a correct "root=" boot option; here are the available partitions:
[    2.477272] 0100           65536 ram0 
[    2.477276]  (driver?)
[    2.483432] 0101           65536 ram1 
[    2.483435]  (driver?)
[    2.489561] 0102           65536 ram2 
[    2.489563]  (driver?)
[    2.495704] 0103           65536 ram3 
[    2.495706]  (driver?)
[    2.501830] 0104           65536 ram4 
[    2.501832]  (driver?)
[    2.507970] 0105           65536 ram5 
[    2.507972]  (driver?)
[    2.514109] 0106           65536 ram6 
[    2.514111]  (driver?)
[    2.520236] 0107           65536 ram7 
[    2.520238]  (driver?)
[    2.526375] 0108           65536 ram8 
[    2.526378]  (driver?)
[    2.532502] 0109           65536 ram9 
[    2.532504]  (driver?)
[    2.538642] 010a           65536 ram10 
[    2.538645]  (driver?)
[    2.544868] 010b           65536 ram11 
[    2.544870]  (driver?)
[    2.551083] 010c           65536 ram12 
[    2.551085]  (driver?)
[    2.557309] 010d           65536 ram13 
[    2.557311]  (driver?)
[    2.563533] 010e           65536 ram14 
[    2.563536]  (driver?)
[    2.569748] 010f           65536 ram15 
[    2.569751]  (driver?)
[    2.575982] 1f00             128 mtdblock0 
[    2.575984]  (driver?)
[    2.582546] 1f01             128 mtdblock1 
[    2.582548]  (driver?)
[    2.589122] 1f02             128 mtdblock2 
[    2.589125]  (driver?)
[    2.595704] 1f03             128 mtdblock3 
[    2.595706]  (driver?)
[    2.602266] 1f04             256 mtdblock4 
[    2.602268]  (driver?)
[    2.608841] 1f05            1024 mtdblock5 
[    2.608844]  (driver?)
[    2.615416] 1f06             128 mtdblock6 
[    2.615418]  (driver?)
[    2.621980] 1f07             128 mtdblock7 
[    2.621983]  (driver?)
[    2.628556] 1f08            8192 mtdblock8 
[    2.628559]  (driver?)
[    2.635130] 1f09          219136 mtdblock9 
[    2.635133]  (driver?)
[    2.641695] 1f0a           32768 mtdblock10 
[    2.641697]  (driver?)
[    2.648358] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    2.656667] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---

3. 故障分析与解决

内核日志信息:

[    2.162435] ubi0: attaching mtd9
[    2.166572] ubi0 error: validate_ec_hdr: bad VID header offset 512, expected 2048
[    2.174146] ubi0 error: validate_ec_hdr: bad EC header

结合前面的 NAND 分区日志分析,可以知道,mtd9 对应分区 "NAND.rootfs",所以实在挂载 rootfs 过程中出错了。通过内核导出的出错时的调用栈信息,定位到出错代码路径如下(内核版本为 4.19.94):

kernel_init()
	kernel_init_freeable()
		do_basic_setup()
			do_initcalls()
				...
				do_one_initcall()
					ubi_init()
/* drivers/mtd/ubi/build.c */
static int __init ubi_init(void)
{
	...
	/* Attach MTD devices */
	/* mtd_dev_param[] 和 mtd_devs 的设置过程,见后文的 ubi_mtd_param_parse() 分析 */
	for (i = 0; i < mtd_devs; i++) {
		struct mtd_dev_param *p = &mtd_dev_param[i];
		struct mtd_info *mtd;
  
  		...
  		mtd = open_mtd_device(p->name);
  		...

		mutex_lock(&ubi_devices_mutex);
		err = ubi_attach_mtd_dev(mtd, p->ubi_num,
				p->vid_hdr_offs, p->max_beb_per1024);
		mutex_unlock(&ubi_devices_mutex);
		...
	}
	...
}

int ubi_attach_mtd_dev(struct mtd_info *mtd, int ubi_num,
		int vid_hdr_offset, int max_beb_per1024)
{
	struct ubi_device *ubi;
	...

	...
	
	ubi = kzalloc(sizeof(struct ubi_device), GFP_KERNEL);
	
	...
	
	ubi->mtd = mtd;
	ubi->ubi_num = ubi_num;
	ubi->vid_hdr_offset = vid_hdr_offset;
	ubi->autoresize_vol_id = -1;

	...

	err = io_init(ubi, max_beb_per1024);

	...

	err = ubi_attach(ubi, 0);
	...

	/* Make device "available" before it becomes accessible via sysfs */
	ubi_devices[ubi_num] = ubi;

	...
}

static int io_init(struct ubi_device *ubi, int max_beb_per1024)
{
	...
	ubi->peb_size   = ubi->mtd->erasesize;
	ubi->peb_count  = mtd_div_by_eb(ubi->mtd->size, ubi->mtd);
	ubi->flash_size = ubi->mtd->size;
	...
	ubi->leb_size = ubi->peb_size - ubi->leb_start; /* (2) */
	...
}

/* drivers/mtd/ubi/attach.c */
int ubi_attach(struct ubi_device *ubi, int force_scan)
{
	...
	err = scan_all(ubi, ai, 0);
	...
}

static int scan_all(struct ubi_device *ubi, struct ubi_attach_info *ai,
		int start)
{
	...
	for (pnum = start; pnum < ubi->peb_count; pnum++) {
		...
		err = scan_peb(ubi, ai, pnum, false);
		...
	}
	...
}

static int scan_peb(struct ubi_device *ubi, struct ubi_attach_info *ai,
		int pnum, bool fast)
{
	...
	err = ubi_io_read_ec_hdr(ubi, pnum, ech, 0);
	...
}

/* drivers/mtd/ubi/io.c */
int ubi_io_read_ec_hdr(struct ubi_device *ubi, int pnum,
         struct ubi_ec_hdr *ec_hdr, int verbose)
{
	...
	/* And of course validate what has just been read from the media */
	err = validate_ec_hdr(ubi, ec_hdr);
	...
}	

static int validate_ec_hdr(const struct ubi_device *ubi,
		const struct ubi_ec_hdr *ec_hdr)		
{
	...
	int vid_hdr_offset, leb_start;
	
	...
	vid_hdr_offset = be32_to_cpu(ec_hdr->vid_hdr_offset);
	...

	/* (1) */
	if (vid_hdr_offset != ubi->vid_hdr_offset) {
		ubi_err(ubi, "bad VID header offset %d, expected %d",
			vid_hdr_offset, ubi->vid_hdr_offset);
		goto bad;
	}
	
	...
bad:
	ubi_err(ubi, "bad EC header");
	ubi_dump_ec_hdr(ec_hdr);
	dump_stack();
	return 1;
}

问题出在 validate_ec_hdr() 函数位置 (1) 处,由于 vid_hdr_offsetubi->vid_hdr_offset 不相等导致。vid_hdr_offset 来自 ec_hdr->vid_hdr_offset;从上面分析的代码分析,进一步得知 ec_hdr->vid_hdr_offset 来自于 ubi_io_read_ec_hdr() 从设备读取的信息,这个信息是后文提到的 ubinize 工具将 rootfs.ubifs 打包到 rootfs.ubi 镜像是插入的 UBI 卷管理信息。这里暂时不细表,留待后文分析。另外一个信息 ubi->vid_hdr_offset 来自内核命令行参数 ubi.mtd=NAND.rootfs,2048,其赋值的代码流程如下:

start_kernel()
	after_dashes = parse_args("Booting kernel",
			static_command_line, __start___param,
			__stop___param - __start___param,
			-1, -1, NULL, &unknown_bootoption);
		...
		ubi_mtd_param_parse()
/* drivers/mtd/ubi/build.c */
/* 
 * 本文中用来解析 ubi.mtd=NAND.rootfs,2048
 * @val: "NAND.rootfs,2048"
 */
static int ubi_mtd_param_parse(const char *val, const struct kernel_param *kp)
{
	...
	p = &mtd_dev_param[mtd_devs];
 	strcpy(&p->name[0], tokens[0]); /* @p->name: "NAND.rootfs" */

	token = tokens[1];
	if (token) {
		p->vid_hdr_offs = bytes_str_to_int(token); /* @p->vid_hdr_offs: 2048 */
		...
	}

	...
	
	mtd_devs += 1;
	return 0;
}

/*
 * 定义解析 ubi.mtd=XXX 的接口 ubi_mtd_param_parse(), 
 * 如用来解析 ubi.mtd=NAND.rootfs,2048 。
 */
module_param_call(mtd, ubi_mtd_param_parse, NULL, NULL, 0400);

ubi->vid_hdr_offset 的设置过程在 ubi_init() 调用之前完成。说完了 ubi->vid_hdr_offset 的设置过程,继续看在 UBI 根文件系统 rootfs.ubi 的构建过程中,对 ec_hdr->vid_hdr_offset 等卷(volume)管理信息的填充过程。本文的UBI 根文件系统镜像 rootfs.ubi通过 buildroot 工具构建,其过程简单来讲,就是先通过 mkfs.ubifs 生成一个 rootfs.ubifs 文件,然后再通过工具 ubinizerootfs.ubifs 打包成 UBI 根文件系统镜像 rootfs.ubi

                mkfs.ubifs               ubinize
根文件系统目录树 ----------> rootfs.ubifs --------> rootfs.ubi

mkfs.ubifs 的构建的 rootfs.ubifs 文件,可以理解为根文件系统目录树的打包,是一个文件系统镜像;而 ubinize 工具将 rootfs.ubifs 打包为 UBI 根文件系统镜像 rootfs.ubi 文件时,增加了包含 struct ubi_ec_hdr 头部信息等 UBI 卷管理信息rootfs.ubifs 无法直接作为烧录进设备分区的镜像,只有包含了 UBI 卷管理信息rootfs.ubi 才能烧录进设备分区,作为系统的根文件系统来启动。前面内核日志报错信息:

[    2.166572] ubi0 error: validate_ec_hdr: bad VID header offset 512, expected 2048

问题的根本原因在于:内核命令行参数 ubi.mtd=NAND.rootfs,2048 中的 2048rootfs.ubifs 打包头部信息 struct ubi_ec_hdr::vid_hdr_offset512(ubinize 工具给的默认值) 不匹配造成的。ubinize 工具的 -O 选项可以指定 struct ubi_ec_hdr::vid_hdr_offset 值。从 buildroot 工具 ubinize 打包过程文件 fs/ubifs/ubi.mk 片段:

...
UBI_UBINIZE_OPTS += $(call qstrip,$(BR2_TARGET_ROOTFS_UBI_OPTS))
...
define ROOTFS_UBI_CMD
        sed 's;BR2_ROOTFS_UBIFS_PATH;$@fs;' \
                $(UBINIZE_CONFIG_FILE_PATH) > $(BUILD_DIR)/ubinize.cfg
        $(HOST_DIR)/usr/sbin/ubinize -o $@ $(UBI_UBINIZE_OPTS) $(BUILD_DIR)/ubinize.cfg
        rm $(BUILD_DIR)/ubinize.cfg
endef

得知可以通过 BR2_TARGET_ROOTFS_UBI_OPTS 配置给 ubinize 工具传递参数,运行 make menuconfig ,按如下修改 buildroot 配置 BR2_TARGET_ROOTFS_UBI_OPTS,给 ubinize 工具传递 -O 1024 参数,修改 打包头部信息 struct ubi_ec_hdr::vid_hdr_offset 值为 1024
在这里插入图片描述
重新编译生成 rootfs.ubi,烧录并启动运行,看问题是否解决了:

[    2.162404] ubi0: attaching mtd9
[    2.817587] ubi0: scanning is finished
[    2.841285] ubi0: volume 0 ("rootfs") re-sized from 83 to 1668 LEBs
[    2.848341] ubi0: attached mtd9 (name "NAND.rootfs", size 214 MiB)
[    2.854613] ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[    2.861516] ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 512
[    2.868259] ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
[    2.875260] ubi0: good PEBs: 1711, bad PEBs: 1, corrupted PEBs: 0
[    2.881377] ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
[    2.888641] ubi0: max/mean erase counter: 1/0, WL threshold: 4096, image sequence number: 425578287
[    2.897735] ubi0: available PEBs: 0, total reserved PEBs: 1711, PEBs reserved for bad PEB handling: 39
[    2.907100] ubi0: background thread "ubi_bgt0d" started, PID 65
......
[    2.965365] UBIFS error (ubi0:0 pid 1): ubifs_read_superblock: LEB size mismatch: 129024 in superblock, 126976 real
[    2.992980] UBIFS error (ubi0:0 pid 1): ubifs_read_superblock: bad superblock, error 1
[    3.000933]  magic          0x6101831
[    3.004623]  crc            0x3375ce2d
[    3.008386]  node_type      6 (superblock node)
[    3.012931]  group_type     0 (no node group)
[    3.017314]  sqnum          1
[    3.020289]  len            4096
[    3.023537]  key_hash       0 (R5)
[    3.026950]  key_fmt        0 (simple)
[    3.030709]  flags          0x0
[    3.033870]  big_lpt        0
[    3.036845]  space_fixup    0
[    3.039818]  min_io_size    2048
[    3.043063]  leb_size       129024
[    3.046474]  leb_cnt        1668
[    3.049709]  max_leb_cnt    2048
[    3.052956]  max_bud_bytes  8388608
[    3.056454]  log_lebs       5
[    3.059429]  lpt_lebs       2
[    3.062403]  orph_lebs      1
[    3.065387]  jhead_cnt      1
[    3.068362]  fanout         8
[    3.071335]  lsave_cnt      256
[    3.074495]  default_compr  0
[    3.077470]  rp_size        0
[    3.080443]  rp_uid         0
[    3.083427]  rp_gid         0
[    3.086402]  fmt_version    4
[    3.089378]  time_gran      1000000000
[    3.093151]  UUID           9FC73AE3-A0BA-41C8-8615-2B75694BB8CD
[    3.204172] UBIFS error (ubi0:0 pid 1): ubifs_read_superblock: LEB size mismatch: 129024 in superblock, 126976 real
[    3.222987] UBIFS error (ubi0:0 pid 1): ubifs_read_superblock: bad superblock, error 1
[    3.230940]  magic          0x6101831
[    3.234647]  crc            0x3375ce2d
[    3.238408]  node_type      6 (superblock node)
[    3.242993]  group_type     0 (no node group)
[    3.247364]  sqnum          1
[    3.250338]  len            4096
[    3.253591]  key_hash       0 (R5)
[    3.257003]  key_fmt        0 (simple)
[    3.260762]  flags          0x0
[    3.263923]  big_lpt        0
[    3.266897]  space_fixup    0
[    3.269872]  min_io_size    2048
[    3.273117]  leb_size       129024
[    3.276528]  leb_cnt        1668
[    3.279765]  max_leb_cnt    2048
[    3.283011]  max_bud_bytes  8388608
[    3.286509]  log_lebs       5
[    3.289484]  lpt_lebs       2
[    3.292458]  orph_lebs      1
[    3.295441]  jhead_cnt      1
[    3.298417]  fanout         8
[    3.301390]  lsave_cnt      256
[    3.304548]  default_compr  0
[    3.307522]  rp_size        0
[    3.310496]  rp_uid         0
[    3.313479]  rp_gid         0
[    3.316454]  fmt_version    4
[    3.319429]  time_gran      1000000000
[    3.323199]  UUID           9FC73AE3-A0BA-41C8-8615-2B75694BB8CD
[    3.433193] List of all partitions:
[    3.436713] 0100           65536 ram0 
[    3.436716]  (driver?)
[    3.442843] 0101           65536 ram1 
[    3.442845]  (driver?)
[    3.463000] 0102           65536 ram2 
[    3.463004]  (driver?)
[    3.469132] 0103           65536 ram3 
[    3.469135]  (driver?)
[    3.492961] 0104           65536 ram4 
[    3.492964]  (driver?)
[    3.499093] 0105           65536 ram5 
[    3.499095]  (driver?)
[    3.512959] 0106           65536 ram6 
[    3.512961]  (driver?)
[    3.519087] 0107           65536 ram7 
[    3.519090]  (driver?)
[    3.542960] 0108           65536 ram8 
[    3.542962]  (driver?)
[    3.549089] 0109           65536 ram9 
[    3.549091]  (driver?)
[    3.562956] 010a           65536 ram10 
[    3.562959]  (driver?)
[    3.569173] 010b           65536 ram11 
[    3.569175]  (driver?)
[    3.582961] 010c           65536 ram12 
[    3.582964]  (driver?)
[    3.589178] 010d           65536 ram13 
[    3.589180]  (driver?)
[    3.612959] 010e           65536 ram14 
[    3.612961]  (driver?)
[    3.619175] 010f           65536 ram15 
[    3.619177]  (driver?)
[    3.632970] 1f00             128 mtdblock0 
[    3.632974]  (driver?)
[    3.639537] 1f01             128 mtdblock1 
[    3.639540]  (driver?)
[    3.662959] 1f02             128 mtdblock2 
[    3.662962]  (driver?)
[    3.669524] 1f03             128 mtdblock3 
[    3.669527]  (driver?)
[    3.682960] 1f04             256 mtdblock4 
[    3.682963]  (driver?)
[    3.689526] 1f05            1024 mtdblock5 
[    3.689529]  (driver?)
[    3.712959] 1f06             128 mtdblock6 
[    3.712962]  (driver?)
[    3.719526] 1f07             128 mtdblock7 
[    3.719528]  (driver?)
[    3.732960] 1f08            8192 mtdblock8 
[    3.732963]  (driver?)
[    3.739526] 1f09          219136 mtdblock9 
[    3.739529]  (driver?)
[    3.762959] 1f0a           32768 mtdblock10 
[    3.762962]  (driver?)
[    3.769608] No filesystem could mount root, tried: 
[    3.769610]  ubifs
[    3.782955] 
[    3.786470] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    3.794779] ---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0) ]---

从内核日志看到,之前的问题没有了,但又有了新的问题。根据内核日志,分析下代码流程(对代码细节不感兴趣的读者,可以直接跳过):

/* 1. 解析内核命令行参数: root=ubi0:rootfs rw rootfstype=ubifs rootwait=1 */
start_kernel()
	after_dashes = parse_args("Booting kernel",
			static_command_line, __start___param,
			__stop___param - __start___param,
			-1, -1, NULL, &unknown_bootoption);
		...
		unknown_bootoption()
			obsolete_checksetup(param)
				p->setup_func(line + n) = root_dev_setup(), rootwait_setup(), fs_names_setup()

/* init/do_mounts.c */
int root_mountflags = MS_RDONLY | MS_SILENT;

static int __init readwrite(char *str)
{
	if (*str)
		return 0;
	root_mountflags &= ~MS_RDONLY; /* @root_mountflags: MS_SILENT */
	return 1;
}

__setup("rw", readwrite);

static int __init root_dev_setup(char *line)
{
	/* @saved_root_name: "ubi0:rootfs" */
	strlcpy(saved_root_name, line, sizeof(saved_root_name));
	return 1;
}

__setup("root=", root_dev_setup);

static int __init rootwait_setup(char *str)
{
	if (*str)
		return 0;
	root_wait = 1; /* @root_wait: 1 */
	return 1;
}

__setup("rootwait", rootwait_setup);

static char * __initdata root_fs_names;
static int __init fs_names_setup(char *str)
{
	root_fs_names = str; /* @root_fs_names: "ubifs" */
	return 1;
}

__setup("rootfstype=", fs_names_setup);
start_kernel()
	rest_init()
		pid = kernel_thread(kernel_init, NULL, CLONE_FS); /* 启动初始化线程 */

kernel_init()
	kernel_init_freeable()
		prepare_namespace()

/* init/do_mounts.c */
void __init prepare_namespace(void)
{
	if (saved_root_name[0]) { /* @saved_root_name: "ubi0:rootfs" */
		root_device_name = saved_root_name;
		if (!strncmp(root_device_name, "mtd", 3) ||
		    !strncmp(root_device_name, "ubi", 3)) {
			mount_block_root(root_device_name, root_mountflags); /* 挂载 rootfs */
			goto out;
		}
	}
	
	...

out:
	devtmpfs_mount("dev");
	ksys_mount(".", "/", NULL, MS_MOVE, NULL);
	ksys_chroot(".");
}

/* 挂载 rootfs */
void __init mount_block_root(char *name, int flags)
{
	...
	for (p = fs_names; *p; p += strlen(p)+1) {
		int err = do_mount_root(name, p, flags, root_mount_data);
		switch (err) {
		case 0:
			goto out; /* 成功挂载 rootfs */
		case -EACCES:
		case -EINVAL:
			continue;
		}
	}
	...

out:
	...
}

/* 
 * @name : "ubi0:rootfs" 
 * @fs   : "ubifs"
 * @flags: MS_SILENT
 * @data : NULL
 */
static int __init do_mount_root(char *name, char *fs, int flags, void *data)
{
	struct super_block *s;
 	int err = ksys_mount(name, "/root", fs, flags, data);
	...
}

/* fs/namespace.c */
int ksys_mount(char __user *dev_name, char __user *dir_name, char __user *type,
		unsigned long flags, void __user *data)
{
	...
	ret = do_mount(kernel_dev, dir_name, kernel_type, flags, options);
	...
}

long do_mount(const char *dev_name, const char __user *dir_name,
		const char *type_page, unsigned long flags, void *data_page)
{
	...
	if (flags & MS_REMOUNT)
		retval = do_remount(...);
	else if (flags & MS_BIND)
		retval = do_loopback(...);
	else if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE))
		retval = do_change_type(...);
	else if (flags & MS_MOVE)
		retval = do_move_mount(...);
	else
		retval = do_new_mount(&path, type_page, sb_flags, mnt_flags,
				dev_name, data_page);
	...
}

static int do_new_mount(struct path *path, const char *fstype, int sb_flags,
		int mnt_flags, const char *name, void *data)
{
	struct file_system_type *type;
	struct vfsmount *mnt;
 
	...
	type = get_fs_type(fstype); /* @type: &ubifs_fs_type */
	...

	mnt = vfs_kern_mount(type, sb_flags, name, data);
	...

	put_filesystem(type);
	...
}

struct vfsmount *
vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void *data)
{
	struct mount *mnt;
	struct dentry *root;

	...
	
	mnt = alloc_vfsmnt(name);
	...

	root = mount_fs(type, flags, name, data);

	mnt->mnt.mnt_root = root;
	mnt->mnt.mnt_sb = root->d_sb;
	mnt->mnt_mountpoint = mnt->mnt.mnt_root;
	mnt->mnt_parent = mnt;
	lock_mount_hash();
	list_add_tail(&mnt->mnt_instance, &root->d_sb->s_mounts);
	unlock_mount_hash();
	return &mnt->mnt;
}

/* fs/super.c */
struct dentry *
mount_fs(struct file_system_type *type, int flags, const char *name, void *data)
{
	struct dentry *root;
	struct super_block *sb;
	...

	...
	root = type->mount(type, flags, name, data); /* ubifs_mount() */
	...
	sb = root->d_sb;
	...

	return root;
	...
}
/* fs/ubifs/super.c */

static struct dentry *ubifs_mount(struct file_system_type *fs_type, int flags,
		const char *name, void *data)
{
	struct ubi_volume_desc *ubi;
	struct ubifs_info *c;
	struct super_block *sb;
 	...

	/*
	 * Get UBI device number and volume ID. Mount it read-only so far
	 * because this might be a new mount point, and UBI allows only one
	 * read-write user at a time.
	 */
	ubi = open_ubi(name, UBI_READONLY);
	...

	c = alloc_ubifs_info(ubi);
	...

	sb = sget(fs_type, sb_test, sb_set, flags, c);
	...

	if (sb->s_root) {
		...
	} else {
		err = ubifs_fill_super(sb, data, flags & SB_SILENT ? 1 : 0);
		...
	}

	/* 'fill_super()' opens ubi again so we must close it here */
	ubi_close_volume(ubi);

	return dget(sb->s_root);
	
	...
}

static struct ubi_volume_desc *open_ubi(const char *name, int mode)
{
	struct ubi_volume_desc *ubi;
	...

	...

	/* First, try to open using the device node path method */
	ubi = ubi_open_volume_path(name, mode);
	
	...
}

struct ubi_volume_desc *ubi_open_volume_path(const char *pathname, int mode)
{
	...
	if (vol_id >= 0 && ubi_num >= 0)
		return ubi_open_volume(ubi_num, vol_id, mode);
	...
}

struct ubi_volume_desc *ubi_open_volume(int ubi_num, int vol_id, int mode)
{
	...
	/*
	 * First of all, we have to get the UBI device to prevent its removal.
	 */
	ubi = ubi_get_device(ubi_num);
	...

	desc = kmalloc(sizeof(struct ubi_volume_desc), GFP_KERNEL);
	...

	spin_lock(&ubi->volumes_lock);
	vol = ubi->volumes[vol_id]; /* (3) */
	...
	spin_unlock(&ubi->volumes_lock);
	
	desc->vol = vol;
	...

	return desc;
}

struct ubi_device *ubi_get_device(int ubi_num)
{
	struct ubi_device *ubi;
	
	...
	ubi = ubi_devices[ubi_num]; /* ubi_devices[] 在前一问题代码分析中的 ubi_attach_mtd_dev() 构建 */
	...

	return ubi;
}

static struct ubifs_info *alloc_ubifs_info(struct ubi_volume_desc *ubi)
{
	struct ubifs_info *c;

	c = kzalloc(sizeof(struct ubifs_info), GFP_KERNEL);
	if (c) {
		...
		ubi_get_volume_info(ubi, &c->vi);
		...
	}
}

void ubi_get_volume_info(struct ubi_volume_desc *desc,
			struct ubi_volume_info *vi)
{
	ubi_do_get_volume_info(desc->vol->ubi, desc->vol, vi);
}

void ubi_do_get_volume_info(struct ubi_device *ubi, struct ubi_volume *vol,
			struct ubi_volume_info *vi)
{
	...
	vi->usable_leb_size = vol->usable_leb_size;
	...
}

static int ubifs_fill_super(struct super_block *sb, void *data, int silent)
{
	...
	err = mount_ubifs(c);
	...
}

static int mount_ubifs(struct ubifs_info *c)
{
	...
	err = init_constants_early(c);
	...
	err = ubifs_read_superblock(c);
	...
}

static int init_constants_early(struct ubifs_info *c)
{
	...
	c->leb_size = c->vi.usable_leb_size;
	...
}

int ubifs_read_superblock(struct ubifs_info *c)
{
	...
	struct ubifs_sb_node *sup;

	...
	sup = ubifs_read_sb_node(c); /* 从写入到设备的 rootfs 镜像读取 superblock 信息 */
	
	...

	err = validate_sb(c, sup); /* 验证 rootfs 构建的 superblock 的合法性 */
	...
}

static int validate_sb(struct ubifs_info *c, struct ubifs_sb_node *sup)
{
	...

	if (le32_to_cpu(sup->leb_size) != c->leb_size) { /* (4) 内核报错日志 */
		ubifs_err(c, "LEB size mismatch: %d in superblock, %d real",
			le32_to_cpu(sup->leb_size), c->leb_size);
		goto failed;
	}

	...

failed:
	ubifs_err(c, "bad superblock, error %d", err);
	ubifs_dump_node(c, sup);
	return -EINVAL;
}

这个路径虽然很长,但并不复杂,在代码流程最后 validate_sb() 函数中的 (4) 处,内核报错。和前面的问题类似,又是一个 rootfs 镜像 rootfs.ubi 的参数 和 硬件参数 不匹配的问题,即 buildroot 中对应参数的配置问题。从前面的 open_ubi()alloc_ubifs_info() 代码流程分析得知,参数 c->leb_size 反应 NAND 硬件参数,而 sup->leb_size 参数值来自 rootfs.ubi 。从 buildrootfs/ubifs/ubifs.mk 的片段:

# -e 参数指定 LEB(Logical Erase Block) 的大小
UBIFS_OPTS := -e $(BR2_TARGET_ROOTFS_UBIFS_LEBSIZE) -c $(BR2_TARGET_ROOTFS_UBIFS_MAXLEBCNT) -m $(BR2_TARGET_ROOTFS_UBIFS_MINIOSIZE)

...

define ROOTFS_UBIFS_CMD
        $(HOST_DIR)/usr/sbin/mkfs.ubifs -d $(TARGET_DIR) $(UBIFS_OPTS) -o $@
endef

我们得知,ubinize-e 选项用配置项 BR2_TARGET_ROOTFS_UBIFS_LEBSIZE 修改 LEB(Logical Erase Block) 值,按日志提示:

[    2.965365] UBIFS error (ubi0:0 pid 1): ubifs_read_superblock: LEB size mismatch: 129024 in superblock, 126976 real

该值应该由 129024(0x1f800) 修改为 126976(0x1f000)
在这里插入图片描述重新编译,烧录运行,终于可以进入登录提示处了:

[    2.162228] ubi0: attaching mtd9
[    2.817396] ubi0: scanning is finished
[    2.841103] ubi0: volume 0 ("rootfs") re-sized from 83 to 1668 LEBs
[    2.848156] ubi0: attached mtd9 (name "NAND.rootfs", size 214 MiB)
[    2.854435] ubi0: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes
[    2.861339] ubi0: min./max. I/O unit sizes: 2048/2048, sub-page size 512
[    2.868081] ubi0: VID header offset: 2048 (aligned 2048), data offset: 4096
[    2.875082] ubi0: good PEBs: 1711, bad PEBs: 1, corrupted PEBs: 0
[    2.881199] ubi0: user volume: 1, internal volumes: 1, max. volumes count: 128
[    2.888463] ubi0: max/mean erase counter: 1/0, WL threshold: 4096, image sequence number: 1890895802
[    2.897644] ubi0: available PEBs: 0, total reserved PEBs: 1711, PEBs reserved for bad PEB handling: 39
[    2.907010] ubi0: background thread "ubi_bgt0d" started, PID 65
......
[    2.972922] UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 66
[    3.083296] UBIFS (ubi0:0): UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[    3.090747] UBIFS (ubi0:0): LEB size: 126976 bytes (124 KiB), min./max. I/O unit sizes: 2048 bytes/2048 bytes
[    3.122798] UBIFS (ubi0:0): FS size: 210399232 bytes (200 MiB, 1657 LEBs), journal size 9023488 bytes (8 MiB, 72 LEBs)
[    3.142800] UBIFS (ubi0:0): reserved for root: 0 bytes (0 KiB)
[    3.148663] UBIFS (ubi0:0): media format: w4/r0 (latest is w5/r0), UUID 7A19D54A-3848-4AFB-8DDF-4E4A6B04D4FC, small LPT model
[    3.184943] VFS: Mounted root (ubifs filesystem) on device 0:14.
[    3.192113] devtmpfs: mounted
...
[    3.555581] omap2-nand 8000000.nand: uncorrectable bit-flips found
[    3.572853] ubi0 warning: ubi_io_read: error -74 (ECC error) while reading 61 bytes from PEB 82:6144, read only 61 bytes, retry
...
Welcome
(none) login: 

虽然从日志 uncorrectable bit-flips found 可以看出,仍然还存在 ECC 报错的问题,但前面的两个问题都已经修正,而且也成功的进入系统登录界面。对于 ECC 报错的问题,在另一篇博文 Linux: ubi rootfs 故障案例 (2) 展开。

4. 参考资料

[1] https://bootlin.com/blog/creating-flashing-ubi-ubifs-images/
[2] UBI FAQ and HOWTO
[3] UBI - Unsorted Block Images

  • 15
    点赞
  • 18
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
这段信息是来自一个嵌入式系统的内核日志。它提供了关于 UBI(Unsorted Block Images)的一些信息。UBI是用于在闪存设备上管理逻辑块的软件层。 具体解读如下: - `ubi2: attaching mtd10`:将 MTD(Memory Technology Device)设备 mtd10 附加到 UBI 上。 - `ubi2: scanning is finished`:扫描 mtd10 完成。 - `ubi2: attached mtd10 (name "misc1", size 8 MiB)`:成功将 mtd10(名称为 "misc1")附加到 UBI,大小为 8 MiB。 - `ubi2: PEB size: 131072 bytes (128 KiB), LEB size: 126976 bytes`:UBI 的物理块大小(PEB)为 131072 字节(128 KiB),逻辑块大小(LEB)为 126976 字节。 - `ubi2: min./max. I/O unit sizes: 2048/2048, sub-page size 2048`:UBI 的最小和最大 I/O 单位大小为 2048 字节,子页面大小为 2048 字节。 - `ubi2: VID header offset: 2048 (aligned 2048), data offset: 4096`:VID(Volume IDentifier)头偏移量为 2048 字节,数据偏移量为 4096 字节。 - `ubi2: good PEBs: 64, bad PEBs: 0, corrupted PEBs: 0`:好的 PEB(Physical Erase Block)数量为 64,坏的 PEB 数量为 0,损坏的 PEB 数量为 0。 - `ubi2: user volume: 1, internal volumes: 1, max. volumes count: 128`:用户卷数量为 1,内部卷数量为 1,最大卷数量为 128。 - `ubi2: max/mean erase counter: 8/3, WL threshold: 4096, image sequence number: 1725884149`:最大/平均擦除计数器为 8/3,WL(Wear-Leveling)阈值为 4096,镜像序列号为 1725884149。 这些信息提供了有关 UBI 和相关设备的详细配置和状态信息。它们对于系统调试和问题排查可能是有用的。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值