2020 6.s081 Lab4: traps

John_Snowww

Last modified 2024-05-31 23:08:47

Tags: 6.s081 6.828 wsl2 github

First published 2024-05-30 17:23:20

Article link: https://blog.csdn.net/john_snowww/article/details/139328506

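I will wait for winter's snow to melt

For the summer held in a palm-leaf fan

Keeping watch over the stars deep in the night

Blinking, saying nothing

(from 我会等, "I Will Wait")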

Full code: SnowLegend-star/6.s081 at traps (github.com)

RISC-V assembly (easy)

Q1：Which registers contain arguments to functions? For example, which register holds 13 in main's call to printf?

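A：Function arguments are passed in registers a0 through a7. In main's call to printf, a0 holds the address of the format string, a1 holds 12, and a2 holds 13.
Don't confuse this with system calls, where user code puts the arguments in a0, a1 and so on and the system call number in a7, and the return value comes back in p->trapframe->a0.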

Q2：Where is the call to function f in the assembly code for main? Where is the call to g? (Hint: the compiler may inline functions.)

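A：There is no call to f (or to g) in the assembly for main. The compiler inlined g into f and f into main and evaluated the result at compile time, so main simply loads the constant 12 as the argument to printf.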
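The arithmetic the compiler folded away can be reproduced in plain C. The definitions below follow user/call.c as I recall it (g adds 3, f forwards to g, and main passes f(8)+1 to printf); treat the exact source text as an assumption rather than a quote.

```c
// Sketch of the two helpers from user/call.c (recalled from the lab source,
// so the exact definitions are an assumption).
int g(int x) { return x + 3; }
int f(int x) { return g(x); }

// The first integer argument that main passes to printf.
// first_printf_arg is a hypothetical name used only for this illustration.
int first_printf_arg(void) { return f(8) + 1; }
```

Because both helpers are tiny and visible to the compiler, nothing forces a real call to survive optimization, which is why only the constant shows up in main.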

Q3：At what address is the function printf located?

A：printf is located at address 0x630.

Q4：What value is in the register ra just after the jalr to printf in main?

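A：ra is the return address register in RISC-V; it holds the address to return to after a function call. jalr writes the address of the instruction that follows it into ra, so here ra is 0x38.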
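Both answers can be checked with a little address arithmetic. The sketch below assumes the call is compiled as an "auipc ra,0" at 0x30 followed by a "jalr 1536(ra)" at 0x34; those two addresses and the offset are assumptions about a typical call.asm and will differ with other toolchains. The helper names are made up for this example.

```c
#include <stdint.h>

// Target of "auipc ra,0 ; jalr offset(ra)": auipc with an immediate of 0
// puts its own pc into ra, then jalr jumps to ra + offset.
uint64_t jalr_target(uint64_t auipc_pc, int64_t offset) {
  return auipc_pc + (uint64_t)offset;
}

// Value jalr leaves in ra: the address of the instruction after the jalr
// (an uncompressed jalr is 4 bytes long).
uint64_t jalr_link(uint64_t jalr_pc) {
  return jalr_pc + 4;
}
```

With the assumed addresses, jalr_target(0x30, 1536) gives 0x630 and jalr_link(0x34) gives 0x38, matching the two answers above.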

Q5：Run the following code.

unsigned int i = 0x00646c72;
printf("H%x Wo%s", 57616, &i);

Q5.1：What is the output? Here's an ASCII table that maps bytes to characters.

A：The output is "He110 World". 57616 is 0xe110 in hexadecimal, so %x prints e110, which only looks like "ello".

Q5.2：The output depends on that fact that the RISC-V is little-endian. If the RISC-V were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

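A：RISC-V is little-endian, so the least significant byte of i sits at the lowest address: from low address to high, the bytes in memory are 0x72, 0x6c, 0x64, 0x00. %s prints characters from low address to high until it reaches the 0 byte, which gives exactly "rld". On a big-endian machine (which matches the way we normally write numbers) the most significant byte comes first, so i would have to be 0x726c6400 to put the same bytes in memory.
57616 does not need to change. %x prints the numeric value of the integer, not its bytes in memory, and that value is 0xe110 whatever the byte order.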
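To make the byte-order argument concrete, here is a small host-side sketch. It does not depend on the endianness of the machine it runs on, because it lays the bytes out explicitly; store_le32, store_be32 and prints_hello are names invented for this example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Lay out v the way a little-endian machine stores it:
// least significant byte at the lowest address.
void store_le32(uint32_t v, unsigned char out[4]) {
  for (int k = 0; k < 4; k++)
    out[k] = (unsigned char)(v >> (8 * k));
}

// Lay out v the way a big-endian machine stores it:
// most significant byte at the lowest address.
void store_be32(uint32_t v, unsigned char out[4]) {
  for (int k = 0; k < 4; k++)
    out[k] = (unsigned char)(v >> (8 * (3 - k)));
}

// Format the message like the lab's printf call, passing the in-memory
// bytes of i as the %s argument. Returns 1 if the result is "He110 World".
int prints_hello(const unsigned char bytes[4]) {
  char msg[32];
  snprintf(msg, sizeof msg, "H%x Wo%s", 57616, (const char *)bytes);
  return strcmp(msg, "He110 World") == 0;
}
```

Storing 0x00646c72 with store_le32 and 0x726c6400 with store_be32 produces the same four bytes, so prints_hello succeeds for both.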

Here's a description of little- and big-endian and a more whimsical description.

Q6：In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?

printf("x=%d y=%d", 3);

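A：On my run it printed x=3 y=-80204024, but the value of y is not fixed.
printf is told to expect two integers while only one is passed, so for y it reads whatever happens to be left in the register where the second value would have been (a2). In C this is undefined behavior.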

Backtrace (moderate)

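As expected, moderate difficulty is comfortable: a bit of a challenge, but not much. The whole lab boils down to one sentence: "use the frame pointer to print the return address of every function call on the stack". Let's go through the hints: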

1、Add the prototype for backtrace to kernel/defs.h so that you can invoke backtrace in sys_sleep.

This asks us to add the declaration of backtrace to defs.h, which is routine by now.

2、The GCC compiler stores the frame pointer of the currently executing function in the register s0. Add the following function to kernel/riscv.h:

static inline uint64
r_fp()
{
  uint64 x;
  asm volatile("mv %0, s0" : "=r" (x) );
  return x;
}

and call this function in backtrace to read the current frame pointer. This function uses in-line assembly to read s0.

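The sentence "The GCC compiler stores the frame pointer of the currently executing function in the register s0" is the key point. No wonder that while debugging the alarm part I kept feeling s0 was not holding an ordinary variable but an address related to function execution. It had already been spelled out here.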

3、These lecture notes have a picture of the layout of stack frames. Note that the return address lives at a fixed offset (-8) from the frame pointer of a stackframe, and that the saved frame pointer lives at fixed offset (-16) from the frame pointer.

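The figure in Lecture055 is so important that I recommend memorizing it. It shows that ra lives at offset -8 from fp, and that the saved fp (the caller's fp) lives at offset -16 from the current fp. The figure also suggests that the stack frames of the caller and the callee are laid out in a regular pattern. I came across this sentence on CSDN: "Each function has its own independent stack frame, and the stack frames along a call chain are contiguous." I asked GPT and it confirmed this:

Contiguous stack frames: when a function calls another function, a new stack frame for the callee is pushed onto the stack. The frames are contiguous because they are placed on the stack in call order. The stack grows downward, so each new frame sits directly below the previous one (at lower addresses), and together they form the call chain.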

4、Xv6 allocates one page for each stack in the xv6 kernel at PAGE-aligned address. You can compute the top and bottom address of the stack page by using PGROUNDDOWN(fp) and PGROUNDUP(fp) (see kernel/riscv.h). These numbers are helpful for backtrace to terminate its loop.

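The goal is to implement a kernel function backtrace that walks back up the chain of callers and prints each caller's pc, that is, the ra saved by each callee. Starting from the current stack frame, follow the saved previous frame pointer to the caller's stack frame, and repeat until the call chain ends.

So we implement backtrace in kernel/printf.c: call r_fp to get the frame pointer fp, print *(fp - 8), then move on to *(fp - 16), and loop. The question is where the loop ends, i.e. where the last stack frame is.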

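The kernel stack has a fixed extent that contains every function's stack frame: xv6 gives each kernel stack exactly one page, and that page is 4 KB aligned. Every frame we want to print therefore has its fp strictly inside that page, between PGROUNDDOWN(fp) and PGROUNDUP(fp). Once the fp we load is itself 4 KB aligned, or falls outside the page, we have walked off the end of the stack.

In other words, since xv6 uses a single page for the stack, fp reaching the upper boundary of the stack page means we have reached the base of the stack (the outermost frame).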
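The termination rule is easy to try out in user space. The sketch below builds three fake frames near the top of a page-aligned buffer and walks them with the same fp-8 / fp-16 offsets. It is a simulation under stated assumptions, not kernel code: put_frame, walk_frames and demo_backtrace are hypothetical names, the return addresses are arbitrary numbers, and the PGROUNDUP/PGROUNDDOWN macros are re-declared locally rather than taken from kernel/riscv.h.

```c
#include <stdint.h>
#include <stdlib.h>

#define PGSIZE 4096
#define PGROUNDUP(a)   (((uintptr_t)(a) + PGSIZE - 1) & ~((uintptr_t)PGSIZE - 1))
#define PGROUNDDOWN(a) ((uintptr_t)(a) & ~((uintptr_t)PGSIZE - 1))

// Write one fake frame whose frame pointer is fp:
// return address at fp-8, caller's frame pointer at fp-16.
static void put_frame(uintptr_t fp, uint64_t ra, uintptr_t caller_fp) {
  *(uint64_t *)(fp - 8) = ra;
  *(uint64_t *)(fp - 16) = (uint64_t)caller_fp;
}

// Walk the chain starting at fp, storing up to max return addresses in out.
// The page bounds are computed once from the starting fp.
int walk_frames(uintptr_t fp, uint64_t *out, int max) {
  uintptr_t bottom = PGROUNDDOWN(fp);
  uintptr_t top = PGROUNDUP(fp);
  int n = 0;
  while (fp > bottom && fp < top && n < max) {
    out[n++] = *(uint64_t *)(fp - 8);
    fp = (uintptr_t)*(uint64_t *)(fp - 16);
  }
  return n;
}

// Build a three-frame fake stack and walk it from the innermost frame.
// Returns the number of frames visited, or -1 if allocation fails.
int demo_backtrace(uint64_t *out, int max) {
  unsigned char *raw = malloc(2 * PGSIZE);   // room for one aligned page
  if (raw == NULL) return -1;
  uintptr_t page = PGROUNDUP(raw);
  uintptr_t top = page + PGSIZE;
  put_frame(top - 32, 0x1000, top);          // outermost: saved fp is the page top
  put_frame(top - 64, 0x2000, top - 32);
  put_frame(top - 96, 0x3000, top - 64);     // innermost
  int n = walk_frames(top - 96, out, max);
  free(raw);
  return n;
}
```

The outermost fake frame stores the page top as its saved fp, so the walk stops right after printing it, which is the same condition the kernel loop relies on.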

r_fp() is implemented as follows:

// read the current frame pointer (used by backtrace)
static inline uint64
r_fp()
{
  uint64 x;
  asm volatile("mv %0, s0" : "=r" (x) );
  return x;
}

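backtrace() is as follows:

// Trace the hierarchy of function calls through the frame pointer
// and print the return address of each call
void backtrace(){
  uint64 fp=r_fp();
  uint64 bottom=PGROUNDDOWN(fp);    // lowest address of the kernel stack page
  uint64 top=PGROUNDUP(fp);         // upper boundary of the kernel stack page
  uint64 return_address;
  // the bounds are computed once, so a bad saved fp cannot move them
  while(fp>bottom && fp<top){
    return_address=*(uint64*)(fp-8);
    printf("%p\n",return_address);
    fp=*(uint64*)(fp-16);     // move up to the caller's stack frame
  }
}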

Alarm (hard)

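There isn't much code to add in this part; the main work is getting function calls and trap handling straight. The direct requirement of this lab is to implement sigalarm. What does that mean? After a process calls sigalarm, the kernel makes it run a given function periodically, measured in timer ticks. For example, if test0 calls sigalarm(2, func), the process calls func once every 2 ticks. sigalarm here resembles the signal mechanism from CSAPP: test0 calls sigalarm(2, func) only once, but func does not run just once; it runs every time the interval expires. The work is split into test0, test1, test2 and test3, implementing the feature step by step from easy to hard.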

test0: invoke handler

As usual, start from the hints:

1、You'll need to modify the Makefile to cause alarmtest.c to be compiled as an xv6 user program.

2、The right declarations to put in user/user.h are:

int sigalarm(int ticks, void (*handler)());

int sigreturn(void);

3、Update user/usys.pl (which generates user/usys.S), kernel/syscall.h, and kernel/syscall.c to allow alarmtest to invoke the sigalarm and sigreturn system calls.

We have done steps like these many times already, so I won't repeat them.

4、For now, your sys_sigreturn should just return zero.

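The main purpose of test0 is to change the address that the trap returns to. Normally, when a trap finishes, execution resumes at the point where the user code was interrupted. sigalarm(n, fn) makes the timer interrupt return into the function fn instead. At this stage we don't need to touch sigreturn yet.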

5、Your sys_sigalarm() should store the alarm interval and the pointer to the handler function in new fields in the proc structure (in kernel/proc.h).

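This says that inside sys_sigalarm() we need to pick up the two arguments n and fn passed from user space. Of course, we first have to add two fields, time_interval and handler_function, to each process.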

6、You'll need to keep track of how many ticks have passed since the last call (or are left until the next call) to a process's alarm handler; you'll need a new field in struct proc for this too. You can initialize proc fields in allocproc() in proc.c.

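This hint says to add a new field ticks_passed to the process, recording how many ticks have passed since handler_function was last called. ticks_passed can be initialized in allocproc(), though I think it could just as well be initialized inside sys_sigalarm. One thing puzzled me: where should ticks_passed++ happen?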

7、Every tick, the hardware clock forces an interrupt, which is handled in usertrap() in kernel/trap.c.

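This hint answers that question nicely. Every tick raises a timer interrupt, so working backwards, whenever such an interrupt arrives one tick has passed. That alone is not rigorous, though, because three kinds of events can bring us into usertrap():

a system call;
an exception, such as a page fault or an illegal instruction;
a device interrupt, which the kernel's device driver has to respond to.

Hint 8 clears this up. So all we need is to find the right place inside usertrap() to do ticks_passed++.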

8、You only want to manipulate a process's alarm ticks if there's a timer interrupt; you want something like

if(which_dev == 2) ...

This says ticks should only be handled on a timer interrupt, so ticks_passed++ belongs inside the if(which_dev==2) check.

9、Only invoke the alarm function if the process has a timer outstanding. Note that the address of the user's alarm function might be 0 (e.g., in user/alarmtest.asm, periodic is at address 0).

I didn't really end up using this hint, and its wording is rather vague.
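One possible reading of this hint: since the handler may legitimately live at address 0, "is an alarm armed?" has to be decided from the interval, never from handler != 0. The sketch below models that bookkeeping outside the kernel; struct alarm_state and both functions are hypothetical names, not the fields used in the code later in this post.

```c
// Hypothetical stand-in for the alarm fields added to struct proc.
struct alarm_state {
  int interval;            // 0 means no alarm is armed
  unsigned long handler;   // user address of the handler; 0 is a valid address
  int ticks_passed;        // ticks since the handler was last invoked
};

// Called once per timer tick. Returns 1 when the handler should be invoked.
int alarm_tick(struct alarm_state *a) {
  if (a->interval <= 0)    // decide on the interval, never on handler != 0
    return 0;
  a->ticks_passed++;
  if (a->ticks_passed >= a->interval) {
    a->ticks_passed = 0;
    return 1;
  }
  return 0;
}

// Count how many times the handler would fire over n ticks.
int count_fires(struct alarm_state *a, int n) {
  int fires = 0;
  for (int t = 0; t < n; t++)
    fires += alarm_tick(a);
  return fires;
}
```

With interval 2 and handler address 0, six ticks fire the handler three times; with interval 0 nothing fires, whatever the handler address is.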

10、You'll need to modify usertrap() so that when a process's alarm interval expires, the user process executes the handler function. When a trap on the RISC-V returns to user space, what determines the instruction address at which user-space code resumes execution?

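When a trap returns to user space, execution resumes at the user program counter saved in the epc field of the trapframe (usertrapret() writes it back into the sepc register). So I just set epc to the address of handler_function.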
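Stripped of everything else, the redirection is a single assignment plus remembering the old value. A minimal sketch, assuming a cut-down trapframe with only an epc field (struct mini_trapframe and redirect_to_handler are hypothetical names for illustration):

```c
#include <stdint.h>

// Hypothetical, cut-down trapframe: only the field this example needs.
struct mini_trapframe {
  uint64_t epc;   // user pc the trap will return to
};

// Point the trap return at handler and hand back the pc that was
// interrupted, so the caller can keep it for later.
uint64_t redirect_to_handler(struct mini_trapframe *tf, uint64_t handler) {
  uint64_t interrupted_pc = tf->epc;
  tf->epc = handler;
  return interrupted_pc;
}
```

test0 only needs the assignment; the returned value becomes important in test1, where the interrupted code has to be resumed.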

11、It will be easier to look at traps with gdb if you tell qemu to use only one CPU, which you can do by running

make CPUS=1 qemu-gdb

I didn't use this hint much either. It may come in handy later when debugging multithreaded code.

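test0 itself is not hard, but I was still stuck on it for a long time, hitting the same error over and over.

In the end the cause turned out to be surprisingly simple: I had been calling sys_sigalarm() directly from usertrap() (the line is left commented out in the usertrap() code later in this post). There really is no room for fuzziness in understanding the function call flow.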

test1/test2(): resume interrupted code

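When can the interrupt occur?

while the for loop is executing
while foo is being called
while periodic is being called (practically impossible; this code will certainly finish between two timer interrupts)

Analysis of the hints: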

1、Your solution will require you to save and restore registers---what registers do you need to save and restore to resume the interrupted code correctly? (Hint: it will be many).

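test0 made the trap return into handler_function(). This part completes the round trip, so that execution can get back to the point where the user code was interrupted. We need to maintain several registers.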

2、Have usertrap save enough state in struct proc when the timer goes off that sigreturn can correctly return to the interrupted user code.

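The rough flow: suppose the interrupt occurs at instruction x. The kernel points epc at periodic, periodic in turn calls the sigreturn system call, and sigreturn takes us back to x to continue execution. It's like going around in a circle and ending up where we started.

Looking at the register table from the calling convention, the registers that stand out as needing to be preserved are ra, sp and s0~s11. Exactly which registers have to be saved takes a careful look at which ones actually get changed. The epc register has to be maintained as well.

If we run the test0 code as is, j stays at 0 the whole time. So the code needs changes.

First, save the original epc, because epc gets overwritten to point at handler_function. In the end epc must point back to the address where execution should originally have resumed, which for a timer interrupt is the interrupted instruction itself (the saved epc, not epc+4).

Next, check which of ra, sp and s0~s11 the functions involved in test1 actually use:

foo(): uses s0, s1, sp, ra

periodic(): uses sp, ra, s0

So as a first attempt, let's maintain these four registers. But what is sigreturn for?

With that question in mind: once usertrap has saved these registers, where do we write the saved values back into the registers that were changed? In sigreturn. Now the logic hangs together.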
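Picking registers by reading the assembly works for this particular test, but a timer interrupt can land between any two instructions, so in principle any register may be live at that moment. A common alternative, which is not what the code in this post does, is to snapshot the whole trapframe when the alarm fires and copy it back in sigreturn. A minimal user-space sketch of that idea, with a cut-down register set and hypothetical names (mini_tf, mini_proc, alarm_enter, alarm_return):

```c
#include <stdint.h>

// Hypothetical, cut-down trapframe; the real one in kernel/proc.h has
// an entry for every user register.
struct mini_tf {
  uint64_t epc, ra, sp, s0, s1, a0, a1, a5, t0;
};

struct mini_proc {
  struct mini_tf tf;        // live user registers
  struct mini_tf alarm_tf;  // snapshot taken when the alarm fires
};

// What usertrap would do when the interval expires.
void alarm_enter(struct mini_proc *p, uint64_t handler) {
  p->alarm_tf = p->tf;      // struct assignment copies every register
  p->tf.epc = handler;      // resume in the handler instead
}

// What sys_sigreturn would do.
void alarm_return(struct mini_proc *p) {
  p->tf = p->alarm_tf;      // restores epc and all registers at once
}
```

The trade-off is a larger struct proc in exchange for not having to guess which registers the handler or the interrupted code touches.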

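Running the code, the values of i and j are now very close.

A funny aside: I ran alarmtest several times in a row and i and j kept getting closer, so I wondered whether I could get away with it. The next run crushed that hope. Damn.

Back to business. Time to consider whether the registers involved in storing and computing i and j need to be maintained too. I checked the assembly in test1 that touches i and j again, paying special attention to foo().

test1(): a1 holds j, a0 holds i

foo(): a5 and s1 take part in computing j

So I maintained these registers as well. With that, test1 passes, but the re-entrancy problem is still unsolved.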

3、Prevent re-entrant calls to the handler----if a handler hasn't returned yet, the kernel shouldn't call it again. test2 tests this.

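I didn't quite get this at first. It turns out that while a handler_function has not finished executing, it must not be called again. It's like adding a lock. We can add a flag to the process, set it in usertrap and clear it in sigreturn.

But the flag must not be reset in sigalarm, because sigalarm may be called again while a handler is still running. The purpose of the flag is to protect the value of count: sigalarm is called up front to arm the alarm, and only later does a timer interrupt enter periodic and do count++. The lock set by the flag should be released only after count++ has fully completed, so the flag is reset in sigreturn.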
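The locking behavior can be modeled in a few lines outside the kernel. This is a sketch with hypothetical names (alarm_model, model_tick, model_sigreturn); in_handler plays the role of the flag field, and calls counts how often the handler is started.

```c
// Hypothetical model of the alarm bookkeeping, not kernel code.
struct alarm_model {
  int interval;       // alarm interval in ticks, 0 means disarmed
  int ticks_passed;   // ticks since the handler was last started
  int in_handler;     // plays the role of the flag field
  int calls;          // how many times the handler has been started
};

// One timer tick, as usertrap would see it.
void model_tick(struct alarm_model *m) {
  if (m->interval <= 0)
    return;
  m->ticks_passed++;
  if (m->ticks_passed >= m->interval && !m->in_handler) {
    m->ticks_passed = 0;
    m->in_handler = 1;   // lock: the handler is now running
    m->calls++;
  }
}

// The handler calls sigreturn when it is done.
void model_sigreturn(struct alarm_model *m) {
  m->in_handler = 0;     // unlock: the next expiry may start the handler again
}
```

With an interval of 2, ten ticks without a sigreturn start the handler exactly once; after model_sigreturn the next tick starts it a second time.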

proc.h is modified as follows:

struct proc {
  struct spinlock lock;

  // p->lock must be held when using these:
  enum procstate state;        // Process state
  struct proc *parent;         // Parent process
  void *chan;                  // If non-zero, sleeping on chan
  int killed;                  // If non-zero, have been killed
  int xstate;                  // Exit status to be returned to parent's wait
  int pid;                     // Process ID

  // these are private to the process, so p->lock need not be held.
  uint64 kstack;               // Virtual address of kernel stack
  uint64 sz;                   // Size of process memory (bytes)
  pagetable_t pagetable;       // User page table
  struct trapframe *trapframe; // data page for trampoline.S
  struct context context;      // swtch() here to run process
  struct file *ofile[NOFILE];  // Open files
  struct inode *cwd;           // Current directory
  char name[16];               // Process name (debugging)

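  int time_interval;                    // alarm interval
  void (*handler_function)();           // pointer to the handler function
  int ticks_passed;                     // ticks elapsed since the alarm handler was last called

  struct trapframe *new_trapframe;      // a second trapframe intended for saving register values
  uint64 re_epc;                        // user pc to resume at after the handler (the saved epc)
  uint64 s0;
  uint64 s1;
  uint64 sp;
  uint64 ra;
  uint64 a1,a0,a5;                   // registers that compute i and j in the test1 and foo assembly
  int flag;                          // flag==0 means handler_function may be called; guards against re-entrancy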
};

usertrap() is modified as follows:

  if(p->killed)
    exit(-1);

  // give up the CPU if this is a timer interrupt.
  // if(which_dev == 2)
  //   yield();

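  // count ticks_passed
  if(which_dev==2){
    // only handle the process's alarm on a timer interrupt
    // printf("The ticks_passed is: %d\n",p->ticks_passed);
    p->ticks_passed++;
    // if(p->time_interval==p->ticks_passed)
    //   printf("The handler_function address from user space is: %p\n",p->handler_function);
    // fire only if an alarm is armed (interval>0); use >= so an expiry that
    // arrives while the handler is still running is not lost forever
    if(p->time_interval>0 && p->ticks_passed>=p->time_interval && p->flag==0){
      // sys_sigalarm();      // Damn! So this is where the problem was

      p->re_epc=p->trapframe->epc;      // save the user pc to resume at (the interrupted instruction)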
      p->s0=p->trapframe->s0;
      p->s1=p->trapframe->s1;
      p->ra=p->trapframe->ra;
      p->sp=p->trapframe->sp;
      p->ticks_passed=0;
      p->a1=p->trapframe->a1;
      p->a0=p->trapframe->a0;
      p->a5=p->trapframe->a5;


      p->trapframe->epc=(uint64)(p->handler_function);

      p->flag=1;
    }
    yield();
  }

  usertrapret();

sys_sigalarm() is as follows:

uint64
sys_sigalarm(void){
  struct proc *p=myproc();
  uint64 handler_function;
  if(argint(0,&(p->time_interval))<0)
    return -1;
  if(argaddr(1,&handler_function)<0)
    return -1;
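  // sigalarm(0, 0) asks the kernel to stop generating alarms; it is not an error.
  // A time_interval of 0 leaves the alarm disarmed, so nothing more is needed here.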
  p->handler_function=(void*)handler_function; 
  p->ticks_passed=0;
  // p->ticks_passed=sys_uptime();
  printf("The time interval from user space is: %d\n",p->time_interval);
  printf("The handler_function address from user space is: %p\n",p->handler_function);
  return 0;
}

sys_sigreturn() is as follows:

uint64
sys_sigreturn(void){
  struct proc *p=myproc();
  p->trapframe->epc=p->re_epc;      // restore the user pc that was saved when the alarm fired
  p->trapframe->s0=p->s0;
  p->trapframe->s1=p->s1;
  p->trapframe->ra=p->ra;
  p->trapframe->sp=p->sp;  
  p->trapframe->a1=p->a1;
  p->trapframe->a0=p->a0;
  p->trapframe->a5=p->a5;

  p->flag=0;
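  return p->trapframe->a0;          // syscall() stores the return value in a0, so return the restored a0 instead of clobbering it with 0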
}