MIT6.828 lab1 exercise7~8

Bunnyissopretty


Article link: https://blog.csdn.net/Sunissopretty/article/details/117935043

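First, a note on a small pitfall that showed up out of nowhere: after make qemu-gdb, make gdb suddenly started timing out on connect. After looking up similar reports, I found the cause was a port mismatch: one side was set to port 25000 while the other was now on 26000. The fix is to open the GNUmakefile and change 25000 to 26000; after that the problem went away.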

Exercise 7

Why map?
Many machines don't have any physical memory at address 0xf0100000, so the kernel cannot count on being stored there. Instead, the processor's memory management hardware maps virtual address 0xf0100000 (the link address) to physical address 0x00100000 (the load address).

In fact, in the next lab, we will map the entire bottom 256MB of the PC's physical address space, from physical addresses 0x00000000 through 0x0fffffff, to virtual addresses 0xf0000000 through 0xffffffff respectively. You should now see why JOS can only use the first 256MB (0x00000000 through 0x0fffffff) of physical memory.

Operating system kernels often like to be linked and run at a very high virtual address, such as 0xf0100000, in order to leave the lower part of the processor's virtual address space for user programs to use (this leaves a big hole in the middle).

For lab1: for now, we'll just map the first 4MB of physical memory, which is enough to get us up and running. The mapping lives in kern/entrypgdir.c.
That means virtual addresses 0xf0000000 through 0xf0400000 are mapped to physical addresses 0x00000000 through 0x00400000, and virtual addresses 0x00000000 through 0x00400000 are mapped to that same physical range.
Any virtual address that is not in one of these two ranges will cause a hardware exception which, since we haven't set up interrupt handling yet, will cause QEMU to dump the machine state and exit (or endlessly reboot if you aren't using the 6.828-patched version of QEMU).
Up until kern/entry.S sets the CR0_PG flag, memory references are treated as physical addresses. Once CR0_PG is set, memory references are virtual addresses that get translated by the virtual memory hardware to physical addresses.
To sum up: while CR0_PG = 0, memory references are physical addresses; once kern/entry.S sets CR0_PG = 1, they are virtual addresses.
CR0_PG: bit 31 of CR0, the paging bit. To turn on the paging mechanism, both PG and PE must be set to 1.

Following the hint in the exercise, use stepi to trace into the JOS kernel and you will find that:

0x100025:	mov    %eax,%cr0
(gdb) x/i 0x100000
   0x100000:	add    0x1bad(%eax),%dh
(gdb) x/i 0xf0100000
   0xf0100000 <_start+4026531828>:	add    %al,(%eax)
(gdb) x/4x 0x100000
0x100000:	0x02	0xb0	0xad	0x1b
(gdb) x/4x 0xf0100000
0xf0100000 <_start+4026531828>:	0x00	0x00	0x00	0x00	


(gdb) si
=> 0x100028:	mov    $0xf010002f,%eax
0x00100028 in ?? ()
(gdb) x/4x 0x100000
0x100000:	0x02	0xb0	0xad	0x1b
(gdb) x/4x 0xf0100000
0xf0100000 <_start+4026531828>:	0x02	0xb0	0xad	0x1b	
(gdb) x/i 0xf0100000
   0xf0100000 <_start+4026531828>:	add    0x1bad(%eax),%dh

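We can see that once the instruction
movl %eax,%cr0 (with %eax = 0x80010001)

has executed, a reference to 0xf0100000 is really a reference to physical address 0x00100000: before it, the two addresses show different contents; after it, both show the same bytes.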

Exercise 8 Formatted Printing to the Console

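This exercise is mostly about reading through three files: kern/printf.c, lib/printfmt.c, and kern/console.c. You can pick which internal details to study based on what the questions ask, but you should at least understand roughly what each piece does.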

Basic knowledge:

Serial communication:
The process of sending data one bit at a time over a single channel, as opposed to parallel communication, which sends several bits at once.

Console device:
Usually a combination of a display monitor and an input device (typically a keyboard and mouse pair) that allows a user to input commands and receive visual output from a computer or computer system.

The first question to answer:
1. Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c? In other words, how do these three files interact.

Start with the comment at the top of printf.c:
// Simple implementation of cprintf console output for the kernel,
// based on printfmt() and the kernel console's cputchar().
So the printf implementation is built on printfmt() and the console's cputchar() function. Let's look at cputchar() first:

// `High'-level console I/O.  Used by readline and cprintf.

void
cputchar(int c)
{
	cons_putc(c);
}

Next, look at the cons_putc() function that cputchar() calls:

// output a character to the console
static void
cons_putc(int c)
{
	serial_putc(c);
	lpt_putc(c);
	cga_putc(c);
}

It calls three functions: serial_putc(), lpt_putc(), and cga_putc(). I'll explain each in turn, starting with:

static void
serial_putc(int c)
{
	int i;
	// From the macros: COM1 = 0x3F8, COM_LSR = 5, COM_LSR_TXRDY = 0x20
	for (i = 0;
	     !(inb(COM1 + COM_LSR) & COM_LSR_TXRDY) && i < 12800;
	     i++)
		delay();
    //COM_TX = 0
	outb(COM1 + COM_TX, c);
}

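This function first reads from port (0x3F8 + 5) and ANDs the result with 0x20, i.e. it checks whether bit 5 of the value read is 1, and then negates that. If bit 5 = 1, the left operand of && is 0; otherwise it is 1.
As before, look up what port 0x3FD means on this site (https://bochs.sourceforge.io/techspec/PORTS.LST):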
03FD r line status register
bit 5 = 1 transmitter holding register empty. Controller is
ready to accept a new character to send
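In other words, if bit 5 = 1 the transmitter holding register is empty and the controller can accept a new character. (In computing and especially in computer hardware, a controller is a chip (such as a microcontroller), an expansion card, or a stand-alone device that interfaces with a more peripheral device.)
So the first thing serial_putc() does is wait: it leaves the loop as soon as bit 5 = 1, and otherwise keeps polling as long as i < 12800. Then comes outb(0x3F8, c), so let's look at that port: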
03F8
w serial port, transmitter holding register, which contains the
character to be sent. Bit 0 is sent first.
bit 7-0 data bits when DLAB=0 (Divisor Latch Access Bit)
r receiver buffer register, which contains the received character
Bit 0 is received first
bit 7-0 data bits when DLAB=0 (Divisor Latch Access Bit)
r/w divisor latch low byte when DLAB=1
In other words, it writes the data c to be sent into port 0x3F8 (the transmitter holding register).

/***** Parallel port output code *****/
// For information on PC parallel port programming, see the class References
// page.

static void
lpt_putc(int c)
{
	int i;

	for (i = 0; !(inb(0x378+1) & 0x80) && i < 12800; i++)
		delay();
	outb(0x378+0, c);
	outb(0x378+2, 0x08|0x04|0x01);
	outb(0x378+2, 0x08);
}

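What lpt_putc() does is straightforward once you see the port descriptions below: it outputs the character through the parallel port. It waits (with the same bounded polling) for bit 7 of the status port, writes the byte to the data port, and then writes to the control port twice.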
0378 w data port
0379 r/w status port
037A r/w control port

Next is the analysis of cga_putc(). The details I consider important are written as comments in the code.

static void
cga_putc(int c)
{
	// if no attribute given, then use black on white
	if (!(c & ~0xFF))
		c |= 0x0700;

	switch (c & 0xff) {
	// Backspace: decrement crt_pos, the index that tracks the cursor position
	case '\b':
		if (crt_pos > 0) {
			crt_pos--;
			crt_buf[crt_pos] = (c & ~0xff) | ' ';
		}
		break;
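	// Small detail: '\n' moves to the next line AND back to its start (the simplest behaviour),
	// so besides moving down one row at the same column we must also return to column 0, which is what '\r' does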
	case '\n':
		crt_pos += CRT_COLS;
		/* fallthru */
	case '\r':
		crt_pos -= (crt_pos % CRT_COLS);
		break;
	case '\t':
		cons_putc(' ');
		cons_putc(' ');
		cons_putc(' ');
		cons_putc(' ');
		cons_putc(' ');
		break;
	default:
		crt_buf[crt_pos++] = c;		/* write the character */
		break;
	}

	// What is the purpose of this?
	// If the cursor position is >= CRT_SIZE, i.e. past the end of the screen
	if (crt_pos >= CRT_SIZE) {
		int i;

		memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
		for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
			crt_buf[i] = 0x0700 | ' ';
		crt_pos -= CRT_COLS;
	}

	/* move that little blinky thing */
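	// addr_6845 is the index port of the 6845 CRT controller and
	// addr_6845 + 1 is its data port. Index registers 14 and 15 hold the
	// high and low bytes of the cursor location, so crt_pos is written
	// high byte first, then low byte. This only moves the hardware cursor;
	// the characters themselves were already stored in crt_buf above.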
	outb(addr_6845, 14);
	outb(addr_6845 + 1, crt_pos >> 8);
	outb(addr_6845, 15);
	outb(addr_6845 + 1, crt_pos);
}

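Some supplementary notes on console I/O. Each cell of crt_buf is 16 bits wide: the low byte is the character code and the high byte is the display attribute, which is why cga_putc() masks with 0xff and ~0xff. As a side note on 16-bit character codes such as Unicode:
The first 128 Unicode characters are the ASCII codes, followed by extended codes. In Unicode, every character block follows the same standard; there are blocks for Greek, Cyrillic, Armenian, Hebrew, and so on, while Chinese, Korean, and Japanese ideographs occupy codes from 0x3000 to 0x9FFF. Its greatest strength is that there is only one character set, which avoids the ambiguity of double-byte character sets. The drawback is that a 16-bit encoding takes twice as much memory as ASCII.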

Now for question 2:
Explain the following from console.c:

	if (crt_pos >= CRT_SIZE) {
		int i;

		memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
		for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
			crt_buf[i] = 0x0700 | ' ';
		crt_pos -= CRT_COLS;
	}

void * memmove ( void * destination, const void * source, size_t num );
/* Copies the values of num bytes from the location pointed by source to the memory block pointed by destination. Copying takes place as if an intermediate buffer were used, allowing the destination and source to overlap. */

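In other words, when the output runs past the end of the screen, the contents of the output buffer are moved up by one row (crt_pos -= CRT_COLS), and the last row, indices (CRT_SIZE - CRT_COLS) through (CRT_SIZE - 1), is filled with blank spaces (crt_buf[i] = 0x0700 | ' ').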

Question 3:
Looking at kernel.asm, we can see that execution jumps into init.c first, so we can add the corresponding code to init.c and then step through it:

int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);

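The question asks us to answer two sub-questions:
1. In the call to cprintf(), to what does fmt point? To what does ap point?
2. List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

Answer 1:
The output below shows what the addresses stored in fmt and ap point to: fmt points to the format string, and ap points to the first variadic argument on the stack (1, followed by 3 and 4).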

=> 0xf010091e <vcprintf>:	push   %ebp

Breakpoint 5, vcprintf (fmt=0xf0101952, ap=0xf010ffe4) at kern/printf.c:18
18	{
(gdb) x/s 0xf0101952
0xf0101952:	"x %d, y %x, z %d\n"
(gdb) x/16x 0xf010ffe4
0xf010ffe4:	0x01	0x00	0x00	0x00	0x03	0x00	0x00	0x00
0xf010ffec:	0x04	0x00	0x00	0x00	0x00	0x00	0x00	0x00
(gdb)

Answer 2:
1. cons_putc(), first and second calls

Breakpoint 2, cons_putc (c=-267380398) at kern/console.c:434
(gdb) p/x -267380398 
$1 = 0xf0101952
0xf0101952:	120 'x'

The second call has c = 32  // space

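2. va_arg(), first call

//Quoting an expert's analysis here:
//It is precisely because C pushes call arguments onto the stack from right to
//left that functions with a variable number of arguments are possible (without
//explicitly stating the argument count). There still has to be some way to tell
//how many arguments were actually passed, and that is given by the format string.
//The va_start macro computes the address just past fmt and assigns it to ap.
//The va_end macro resets ap so that it points to NULL (although in the
//disassembly of cprintf, va_end seems to do nothing; not sure why).
//My note: JOS's inc/stdarg.h maps these macros to GCC builtins, and on x86
//GCC emits no code for __builtin_va_end, which would explain the empty va_end.

//Personal addition:
//Variadic arguments are implemented with macros, but the macro definitions
//differ across hardware platforms and compilers. Below are the VC6.0
//definitions for x86
//(in this lab the parameter order differs a little; go by the names):
 
typedef char * va_list;     /* defined as void* in TC */
#define _INTSIZEOF(n)    ((sizeof(n)+sizeof(int)-1)&~(sizeof(int) - 1) ) /* round up, for systems that require aligned memory */
#define va_start(ap,v)    ( ap = (va_list)&v + _INTSIZEOF(v) )     /* point ap at the first variadic argument, i.e. assign its address to ap */
#define va_arg(ap,t)       ( *(t *)((ap += _INTSIZEOF(t)) - _INTSIZEOF(t)) )   /* fetch the argument's value; t is its type. With several arguments, ap is advanced to reach each argument's address and hence its value */
#define va_end(ap) ( ap = (va_list)0 )   /* clear the va_list, i.e. finish fetching variadic arguments */

//After the va_arg call completes, ap should change from 0xf010ffe4 to 0xf010ffe8;
//these two addresses hold exactly 1 and 3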
=> 0xf0100da8 <getint+44>:	mov    (%eax),%edx
78			return va_arg(*ap, int);
(gdb) p ap
$2 = (va_list *) 0xf010ff9c
(gdb) print *(long *) 0xf010ff9c
$5 = -267321372
(gdb) p/x -267321372
$6 = 0xf010ffe4

=> 0xf010107b <vprintfmt+671>:	mov    %eax,-0x20(%ebp)
0xf010107b in vprintfmt (putch=0xf010090b <putch>, putdat=0xf010ffac, fmt=0xf0101952, 
    ap=0xf010ffe8) at lib/printfmt.c:195
195				num = getint(&ap, lflag);

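3. cons_putc(), third call: this time c is the character '1' (ASCII 49), the digit printed for the value 1.
After that, the same pattern simply repeats.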

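Question 4:
Run the following code.
unsigned int i = 0x00646c72;
cprintf("H%x Wo%s", 57616, &i);
What is the output?
If the machine were big-endian (x86 here is little-endian), what should i be set to, and does 57616 need to change?

Answer: He110 World
57616 is 0xe110 in hex, and on a little-endian machine the bytes of i in memory are 0x72 'r', 0x6c 'l', 0x64 'd', 0x00, which %s prints as "rld".
On a big-endian machine, i just needs its four bytes reversed, i.e. 0x726c6400; 57616 does not need to change.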

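Question 5:
Given the following code:
cprintf("x=%d y=%d", 3);
what is printed after "y="?

Answer: since each conversion prints whatever ap currently points to, my guess is that ap simply moves forward by some distance (4 bytes, as in question 3) and whatever is stored there gets printed as a decimal integer.

Now let's verify that by debugging: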

[root@localhost lab]# make qemu-gdb
***
*** Now run 'make gdb'.
***
qemu-system-i386 -drive file=obj/kern/kernel.img,index=0,media=disk,format=raw -serial mon:stdio -gdb tcp::26000 -D qemu.log  -S
VNC server running on `::1:5900'
6828 decimal is 15254 octal!
x 1, y 3, z 4
He110 Worldx=3 y=-267321364enter

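//ap moves from 0xf010ffd4 to 0xf010ffd8.
//Comparing with the output above, the values stored at these two addresses
//are indeed printed as integers: one is 3 and the other is -267321364
//(ignore the trailing "enter"; that is printed by later code)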
(gdb) print *(long *) 0xf010ffd4 
$10 = 3
(gdb) print *(long *) 0xf010ffd8
$11 = -267321364

Question 6:
I'll just link to an expert's analysis here:
https://github.com/clpsz/mit-jos-2014/tree/master/Lab1/Exercise08#:~:text=%23%236,%E8%AE%A8%E8%AE%BA%E3%80%82
