Analyzing Linux System Design Together: 02 A Brief Look at Makefiles (Part 2)

Article link: https://blog.csdn.net/weixin_44873133/article/details/108892886

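With learning material everywhere, knowledge has become fragmented and systematic treatments are rare. As a result, many people study hard every day until they impress themselves, yet gain little and may even give up. My goal is to filter out the noise, organize the knowledge into a system, and get to the heart of each problem quickly, so that you no longer have to search for a needle in a haystack.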


4 Custom sections (continued from the previous post)

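Tips: I have already introduced custom sections in my article 《基于自定义段技术构建指令系统》 ("Building a Command System with Custom Sections"). Low-level programming differs from application programming in a few ways, though, so this post focuses on those differences and on why the kernel defines so many custom sections.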

The code below comes from the lds (linker script) file of Linux 2.6. It contains many custom sections; we will use .text.head and .arch.info.init as examples.

SECTIONS
{
 . = (0xc0000000) + 0x00008000;

 .text.head : {
  _stext = .;
  _sinittext = .;
  *(.text.head)
 }

 .init : { /* Init code and data		*/
   *(.init.text)
  _einittext = .;
  __proc_info_begin = .;
   *(.proc.info.init)
  __proc_info_end = .;
  __arch_info_begin = .;
   *(.arch.info.init)
  __arch_info_end = .;

4.1 Defining a section

4.1.1 Defining a section in C

The code for defining a custom section in C is shown below. I will not explain it in detail here (see the relevant chapter of 《基于自定义段技术构建指令系统》).

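One difference is worth noting: here the section name is unrestricted (for example, it may start with "."). In low-level programming we write the lds script ourselves rather than using an automatically generated one, so the rules of application programming no longer apply. (Starting to like low-level programming yet?)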

/*
 * Set of macros to define architecture features.  This is built into
 * a table by the linker.
 */
#define MACHINE_START(_type,_name)			\
static const struct machine_desc __mach_desc_##_type	\
 __used		/* tells the compiler to keep this object even if nothing appears to reference it */	\
 __attribute__((__section__(".arch.info.init"))) = {	/* places the object in the .arch.info.init section */ \
	.nr		= MACH_TYPE_##_type,		\
	.name		= _name,

#define MACHINE_END				\
};
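
To see the macro pair at work outside the kernel, here is a minimal user-space sketch. It is not kernel code: the section name arch_info_init, the two board numbers, and the helpers machine_count and find_machine are all made up for illustration. It also assumes GCC or Clang with GNU ld on an ELF platform, where the linker provides __start_<name> and __stop_<name> symbols for any section whose name is a valid C identifier. Kernel section names such as .arch.info.init start with a dot, so they get no such symbols and rely on the ones defined in the lds instead.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct machine_desc. */
struct machine_desc {
    unsigned int nr;    /* machine type number */
    const char  *name;  /* human-readable board name */
};

/*
 * "arch_info_init" is a hypothetical section name chosen so that GNU ld
 * generates __start_arch_info_init / __stop_arch_info_init automatically.
 * The explicit aligned() keeps the compiler from padding between entries.
 */
#define MACHINE_START(_type, _name)                                \
static const struct machine_desc __mach_desc_##_type               \
    __attribute__((used, section("arch_info_init"),                \
                   aligned(__alignof__(struct machine_desc)))) = { \
        .nr   = MACH_TYPE_##_type,                                 \
        .name = _name,

#define MACHINE_END \
};

/* Hypothetical machine numbers, for illustration only. */
#define MACH_TYPE_BOARD_A 101
#define MACH_TYPE_BOARD_B 202

MACHINE_START(BOARD_A, "Hypothetical board A")
MACHINE_END

MACHINE_START(BOARD_B, "Hypothetical board B")
MACHINE_END

/* Section boundaries supplied by the linker (assumption: GNU ld, ELF). */
extern const struct machine_desc __start_arch_info_init[];
extern const struct machine_desc __stop_arch_info_init[];

/* Number of descriptors the linker collected into the section. */
size_t machine_count(void)
{
    assert(__stop_arch_info_init >= __start_arch_info_init);
    return (size_t)(__stop_arch_info_init - __start_arch_info_init);
}

/* Linear search through the linker-built table; NULL if nr is unknown. */
const struct machine_desc *find_machine(unsigned int nr)
{
    const struct machine_desc *m;

    for (m = __start_arch_info_init; m < __stop_arch_info_init; m++)
        if (m->nr == nr)
            return m;
    return NULL;
}
```

Calling machine_count() from main should report the two boards registered above, and each additional MACHINE_START/MACHINE_END pair in any source file extends the table without touching the lookup code.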

4.1.2 Defining a section in assembly

.text.head is defined as follows:

	.section ".text.head", "ax" @ define a section named .text.head; its flags mark it allocatable and executable
	.type	stext, %function  @ declare stext as a function symbol
ENTRY(stext)
	msr	cpsr_c, #PSR_F_BIT | PSR_I_BIT | SVC_MODE @ ensure svc mode

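Tips: The .section directive lets you define a custom section. Its format is:
.section section_name [, "flags"[, %type[, flag_specific_arguments]]]

Each section starts at its .section directive and runs until the next section directive or the end of the file. When the flags are omitted, the section gets default flags.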

The flags are defined as follows:

flag	meaning
a	allocatable section
w	writable section
x	executable section
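
In an ELF object file these three letters end up as bits in the section header's flags field. The sketch below shows that mapping for the three letters in the table only. The enum names and the function section_flags_from_string are my own; the numeric values match SHF_WRITE, SHF_ALLOC and SHF_EXECINSTR from the ELF specification. The real assembler accepts more letters than this.

```c
#include <assert.h>
#include <stddef.h>

/* Local, illustrative names; the values match the ELF SHF_* constants. */
enum {
    SEC_WRITE = 0x1,  /* 'w': SHF_WRITE     */
    SEC_ALLOC = 0x2,  /* 'a': SHF_ALLOC     */
    SEC_EXEC  = 0x4   /* 'x': SHF_EXECINSTR */
};

/*
 * Convert a flags string such as "ax" into a bitmask.
 * Returns -1 for any letter outside the three covered here.
 */
int section_flags_from_string(const char *flags)
{
    int mask = 0;

    assert(flags != NULL);
    for (; *flags != '\0'; flags++) {
        switch (*flags) {
        case 'a': mask |= SEC_ALLOC; break;
        case 'w': mask |= SEC_WRITE; break;
        case 'x': mask |= SEC_EXEC;  break;
        default:  return -1;
        }
    }
    return mask;
}
```

With this mapping, the "ax" used for .text.head above becomes SEC_ALLOC | SEC_EXEC: the section occupies memory at run time and contains instructions, but is not writable.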

4.2 Referencing a section

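Pay close attention to how C and assembly differ when referencing a section: C treats the section's start and end symbols as variables, whereas assembly treats them as addresses. The sample code is rather long; you can ignore its logic.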

4.2.1 Referencing a section from C

The start and end markers of the section are defined in the lds:

  __tagtable_begin = .; /* start symbol of the section */
   *(.taglist.init)
  __tagtable_end = .;   /* end symbol of the section */

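The C code below references them. From the code we can tell that C treats __tagtable_begin/end as variables, not addresses: the linker only assigns each symbol an address, and C sees the symbol as an object stored at that address, which is why the code has to take its address with &.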

/*
 * Scan the tag table for this tag, and call its parse function.
 * The tag table is built by the linker from all the __tagtable
 * declarations.
 */
static int __init parse_tag(const struct tag *tag)
{
	extern struct tagtable __tagtable_begin, __tagtable_end; /* declared here as variables of struct type */
	struct tagtable *t;

	for (t = &__tagtable_begin; t < &__tagtable_end; t++) /* the & operator further shows that __tagtable_begin/end are treated as variables, not addresses */
		if (tag->hdr.tag == t->tag) {
			t->parse(tag);
			break;
		}

	return t < &__tagtable_end;
}
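
The same pattern can be tried in user space. The sketch below makes several assumptions: it is not the kernel's code, the section name taglist_init, the tag numbers, and the two parse functions are invented for illustration, and it relies on GNU ld generating __start_taglist_init and __stop_taglist_init on an ELF platform. It mirrors the point made above: the boundary symbols are declared as struct objects, so the loop must take their addresses with &.

```c
#include <assert.h>
#include <stddef.h>

struct tagtable {
    unsigned int tag;                  /* tag identifier */
    int (*parse)(unsigned int value);  /* handler for that tag */
};

/* Register a handler by placing an entry in the hypothetical taglist_init section. */
#define __tagtable(_tag, _fn)                                      \
static const struct tagtable __tagtable_##_fn                      \
    __attribute__((used, section("taglist_init"),                  \
                   aligned(__alignof__(struct tagtable)))) = { _tag, _fn }

/* Illustrative state updated by the handlers. */
unsigned int last_mem_size;
unsigned int last_cmdline_len;

static int parse_mem(unsigned int value)     { last_mem_size = value;    return 0; }
static int parse_cmdline(unsigned int value) { last_cmdline_len = value; return 0; }

__tagtable(0x1001, parse_mem);      /* hypothetical tag numbers */
__tagtable(0x1002, parse_cmdline);

/* Declared as objects, exactly like the kernel snippet: C sees "variables". */
extern const struct tagtable __start_taglist_init, __stop_taglist_init;

/* Returns 1 if a handler for tag was found and called, 0 otherwise. */
int parse_tag(unsigned int tag, unsigned int value)
{
    const struct tagtable *t;

    for (t = &__start_taglist_init; t < &__stop_taglist_init; t++)
        if (t->tag == tag) {
            assert(t->parse != NULL);
            t->parse(value);
            break;
        }

    return t < &__stop_taglist_init;
}
```

Declaring the boundaries as arrays (extern const struct tagtable __start_taglist_init[];) would work as well and removes the need for &; the object form is kept here to match the 2.6 code.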

4.2.2 Referencing a section from assembly

The start and end markers of the section are defined in the lds:

  __arch_info_begin = .;
   *(.arch.info.init)
  __arch_info_end = .;

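The assembly code below references the section start and end markers defined in the lds. Several places in the code confirm that assembly treats __arch_info_begin/end as addresses, not variables. Why is that?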

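Because in assembly language, apart from registers, which can be thought of as variables, every memory-related operation works on addresses. So every memory-related symbol referenced in assembly is treated as an address. It is very important to understand this.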

/*
 * Look in include/asm-arm/procinfo.h and arch/arm/kernel/arch.[ch] for
 * more information about the __proc_info and __arch_info structures.
 */
	.long	__proc_info_begin
	.long	__proc_info_end
3:	.long	.
	.long	__arch_info_begin /* the start and end symbols are referenced here; each .long stores the symbol's address */
	.long	__arch_info_end

/*
 * Lookup machine architecture in the linker-build list of architectures.
 * Note that we can't use the absolute addresses for the __arch_info
 * lists since we aren't running with the MMU on (and therefore, we are
 * not in the correct address space).  We have to calculate the offset.
 *
 *  r1 = machine architecture number
 * Returns:
 *  r3, r4, r6 corrupted
 *  r5 = mach_info pointer in physical address space (so r5 is a pointer to mach_info: it holds the address of a mach_info-type variable)
 */
	.type	__lookup_machine_type, %function
__lookup_machine_type:
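	adr	r3, 3b             @ adr pseudo-instruction: it is PC-relative and the MMU is off, so r3 gets the physical (actual run-time) address of label 3
	ldmia	r3, {r4, r5, r6} @ ldmia (load multiple) is equivalent to: r4 <- [r3] = "." (the virtual address of the line at label 3)
	                        @ r5 <- [r3+4] = __arch_info_begin (a virtual address; key point: this shows __arch_info_begin is an address, not a variable), r6 <- [r3+4*2] = __arch_info_end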
	sub	r3, r3, r4			@ get offset between virt&phys
	add	r5, r5, r3			@ convert virt addresses to
	add	r6, r6, r3			@ physical address space
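1:	ldr	r3, [r5, #MACHINFO_TYPE]	@ get machine type; register-relative addressing (further evidence that __arch_info_begin is an address, not a variable, and indirectly why assembly treats it as one); MACHINFO_TYPE is the offset of the type field (nr) within struct machine_desc
	teq	r3, r1				@ matches loader number? test whether r3 equals r1
	beq	2f				@ found: if equal, jump to label 2; r5 now points to the matching machine_desc
	add	r5, r5, #SIZEOF_MACHINE_DESC	@ next machine_desc: otherwise advance r5 by the size of one machine_desc
	cmp	r5, r6              @ check whether r5 has reached __arch_info_end
	blo	1b                  @ jump back to label 1 while r5 is lower than r6 (unsigned)
	mov	r5, #0				@ unknown machine: otherwise clear r5 and leave the loop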
2:	mov	pc, lr
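
For readers more comfortable with C, the routine above can be restated roughly as follows. This is only a sketch of the logic, not how the kernel does it (the real code must be assembly because it runs before the MMU is on). The function names, the sample table, and the example addresses are hypothetical; the address arithmetic uses 32-bit wrap-around, as on ARM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct machine_desc. */
struct machine_desc {
    unsigned int nr;    /* machine type number (the field at MACHINFO_TYPE) */
    const char  *name;
};

/*
 * Mirror of "sub r3, r3, r4" followed by "add r5, r5, r3":
 * anchor_phys is where label 3 really sits in memory (from adr),
 * anchor_virt is its link-time address (the ".long ." word).
 * Their difference converts any other link-time address.
 */
uint32_t virt_to_phys(uint32_t anchor_phys, uint32_t anchor_virt, uint32_t virt)
{
    uint32_t offset = anchor_phys - anchor_virt;  /* wraps modulo 2^32 */
    return virt + offset;
}

/*
 * Mirror of the loop at label 1: walk [begin, end) and return the entry
 * whose nr matches, or NULL (r5 = 0) for an unknown machine.
 * Unlike the assembly, this version also copes with an empty table.
 */
const struct machine_desc *lookup_machine_type(const struct machine_desc *begin,
                                               const struct machine_desc *end,
                                               unsigned int nr)
{
    const struct machine_desc *p;

    assert(begin <= end);
    for (p = begin; p < end; p++)
        if (p->nr == nr)
            return p;
    return NULL;
}

/* Hypothetical table standing in for the .arch.info.init section. */
const struct machine_desc demo_table[] = {
    { 101, "Hypothetical board A" },
    { 202, "Hypothetical board B" },
};
```

As a worked example with made-up addresses: if the kernel is linked at 0xc0008000 but actually runs at 0x30008000, the offset is 0x30008000 - 0xc0008000 = 0x70000000 (mod 2^32), so a table linked at 0xc0020000 is found at 0x30020000.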