Xv6 and Unix utilities

qd89zzx

Last modified 2022-10-25 07:48:40

Tags: unix linux ubuntu

First published 2022-10-20 11:27:19

Article link: https://blog.csdn.net/qd89zzx/article/details/127423648

Background

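MIT's 6.S081 operating systems course is well regarded online, especially its labs. I decided to spend some time working through them and to record my solutions in a series of blog posts. This post summarizes the first lab, Xv6 and Unix utilities. The original lab handout is here: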
https://pdos.csail.mit.edu/6.828/2020/labs/util.html

Boot xv6 (easy)

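The first exercise requires no coding; its purpose is to get xv6, a Unix-like operating system, up and running. On Windows, I used VMware to install an Ubuntu 20.04 virtual machine. My previous post set up a kernel debugging environment, so the Ubuntu environment was already in place.
Once the environment is ready, running make qemu builds xv6 and boots it in a QEMU virtual machine. Code for the later exercises, once it compiles, is run and tested inside this xv6 virtual machine.
xv6 does not implement a ps command, so press Ctrl+P to list the running processes.
To exit the xv6 virtual machine, use Ctrl+A X: press Ctrl and A together, release them, then press X.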

sleep (easy)

Lab description and hints

Implement the UNIX program sleep for xv6; your sleep should pause for a user-specified number of ticks. A tick is a notion of time defined by the xv6 kernel, namely the time between two interrupts from the timer chip. Your solution should be in the file user/sleep.c.

Some hints:
 - Before you start coding, read Chapter 1 of the xv6 book.
 - Look at some of the other programs in user/ (e.g., user/echo.c,
   user/grep.c, and user/rm.c) to see how you can obtain the
   command-line arguments passed to a program.
 - If the user forgets to pass an argument, sleep should print an error
   message.
 - The command-line argument is passed as a string; you can convert it
   to an integer using atoi (see user/ulib.c).
 - Use the system call sleep.
 - See kernel/sysproc.c for the xv6 kernel code that implements the
   sleep system call (look for sys_sleep), user/user.h for the C
   definition of sleep callable from a user program, and user/usys.S for
   the assembler code that jumps from user code into the kernel for
   sleep.
 - Make sure main calls exit() in order to exit your program.
 - Add your sleep program to UPROGS in Makefile; once you've done that,
   make qemu will compile your program and you'll be able to run it from
   the xv6 shell.
 - Look at Kernighan and Ritchie's book The C programming language
   (second edition) (K&R) to learn about C.

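This is the second exercise of the lab and the first that requires coding. The implementation is simple; its main purpose is probably to get you familiar with building and testing in the lab environment. For every exercise, read each hint carefully. Some hints are helpful pointers for the implementation, while others describe what the grading script checks. In this exercise, for example, the second hint is a pointer, while the third describes behavior that the grading script checks.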

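Solution

The code does the following:

1. Receive the user's input from the shell.
2. Handle invalid input.
3. Call the sleep system call with the argument the user passed.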

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int
main(int argc, char *argv[])
{
  int ticks;
  if(argc <= 1){
    fprintf(2, "usage: sleep time\n");
    exit(1);
  }
  ticks = atoi(argv[1]);
  if(ticks == 0){
    fprintf(2, "param must be a number and bigger than 0\n");
    exit(1);
  }
  sleep(ticks);
  exit(0);
}
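
One gap in the check above: xv6's atoi (user/ulib.c) stops at the first non-digit, so an argument such as 12x is accepted as 12. A stricter parser could look like the sketch below. It is written as portable C so it can be tried on the host; parse_ticks and the 1000000000 cap are hypothetical choices of mine, not part of xv6 or the lab.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper (not part of xv6): parse a positive decimal tick count.
// Returns the value, or -1 if s is empty, contains a non-digit, is zero,
// or exceeds the cap of 1000000000 used here to avoid int overflow.
int
parse_ticks(const char *s)
{
  assert(s != NULL);
  if (*s == '\0')
    return -1;
  int value = 0;
  for (; *s != '\0'; s++) {
    if (*s < '0' || *s > '9')
      return -1;                        // reject anything that is not a digit
    int digit = *s - '0';
    if (value > (1000000000 - digit) / 10)
      return -1;                        // value * 10 + digit would exceed the cap
    value = value * 10 + digit;
  }
  return value > 0 ? value : -1;
}
```

In sleep.c this would replace the atoi call, with a result of -1 leading to the error message and exit(1).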

pingpong (easy)

Lab description and hints

Write a program that uses UNIX system calls to ''ping-pong'' a byte between two processes over a pair of pipes, one for each direction. The parent should send a byte to the child; the child should print "<pid>: received ping", where <pid> is its process ID, write the byte on the pipe to the parent, and exit; the parent should read the byte from the child, print "<pid>: received pong", and exit. Your solution should be in the file user/pingpong.c.

Some hints:

 - Use pipe to create a pipe.
 - Use fork to create a child.
 - Use read to read from the pipe, and write to write to the pipe.
 - Use getpid to find the process ID of the calling process.
 - Add the program to UPROGS in Makefile.
 - User programs on xv6 have a limited set of library functions
   available to them. You can see the list in user/user.h; the source
   (other than for system calls) is in user/ulib.c, user/printf.c, and
   user/umalloc.c.

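This exercise is good practice for creating child processes with fork and for using pipes. A pipe behaves somewhat like a buffered channel in Go: read blocks while the pipe is empty (as long as a write end is still open), and write blocks only when the pipe's buffer is full (512 bytes in xv6, PIPESIZE in kernel/pipe.c). The blocking read is what lets the two processes synchronize.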
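
The buffering can be checked with a small experiment. The sketch below is POSIX C meant for the Linux host rather than xv6, and pipe_roundtrip is a hypothetical name. It writes one byte into a pipe and reads it back in the same process; if a small write had to wait for a reader, the function would hang instead of returning.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

// Hypothetical demo: write one byte into a fresh pipe, then read it back
// from the same process. Returns 0 on success, -1 on any error.
int
pipe_roundtrip(const char *in, char *out)
{
  assert(in != NULL && out != NULL);
  int p[2];
  if (pipe(p) < 0)
    return -1;
  // Nobody is reading yet; the kernel buffers the byte and write returns.
  if (write(p[1], in, 1) != 1) {
    close(p[0]);
    close(p[1]);
    return -1;
  }
  close(p[1]);
  // The byte is already in the buffer, so this read returns immediately.
  int ok = read(p[0], out, 1) == 1;
  close(p[0]);
  return ok ? 0 : -1;
}
```

In pingpong the synchronization therefore comes from the two read calls: each side cannot continue until the other has written.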

Solution

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int
main(int argc, char *argv[])
{
  int p1[2],p2[2];
  char buf[10];
  pipe(p1);
  pipe(p2);
  int pid;
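  if(fork()==0){
    // Child process.
    pid = getpid();
    // Read from p1[0], the read end of p1. This blocks until the parent writes a byte.
    read(p1[0],buf,1);
    fprintf(1, "%d: received ping\n", pid);
    // Write the byte back through p2[1], the write end of p2.
    write(p2[1],"a",1);
    exit(0);
  }else{
    // Parent process.
    pid = getpid();
    // Write to p1[1], the write end of p1. A one-byte write fits in the pipe buffer, so it does not block.
    write(p1[1],"a",1);
    // Read from p2[0], the read end of p2. This blocks until the child replies.
    read(p2[0],buf,1);
    fprintf(1, "%d: received pong\n", pid);
    wait((int *)0);
  }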

  exit(0);
}

primes (moderate)/(hard)

Lab description and hints

Write a concurrent version of prime sieve using pipes. This idea is due to Doug McIlroy, inventor of Unix pipes. The picture halfway down this page and the surrounding text explain how to do it. Your solution should be in the file user/primes.c.

Your goal is to use pipe and fork to set up the pipeline. The first process feeds the numbers 2 through 35 into the pipeline. For each prime number, you will arrange to create one process that reads from its left neighbor over a pipe and writes to its right neighbor over another pipe. Since xv6 has limited number of file descriptors and processes, the first process can stop at 35.

Some hints:

 - Be careful to close file descriptors that a process doesn't need,
   because otherwise your program will run xv6 out of resources before
   the first process reaches 35.
 - Once the first process reaches 35, it should wait until the entire
   pipeline terminates, including all children, grandchildren, &c. Thus
   the main primes process should only exit after all the output has
   been printed, and after all the other primes processes have exited.
 - Hint: read returns zero when the write-side of a pipe is closed.
 - It's simplest to directly write 32-bit (4-byte) ints to the pipes,
   rather than using formatted ASCII I/O.
 - You should create the processes in the pipeline only as they are
   needed.
 - Add the program to UPROGS in Makefile.

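This exercise asks you to print the primes up to 35 by spawning child processes. It is again built on fork and pipe, and the design borrows from the CSP paper, which also seems to be the basis for Go's goroutines. The exercise is rated (moderate)/(hard), meaning it should take two hours or more. To me it felt easier than that, because the difficulty is concentrated in the algorithm, and the paper already provides pseudocode for it. All that remains is to implement the algorithm, which takes pipe, fork, and a recursive function. The algorithm from the paper is shown below.
[Figure: the prime sieve algorithm from the paper]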
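
To make the filtering concrete, here is a sequential sketch of the same idea in portable C, with no processes or pipes. Each pass of the outer loop plays the role of one pipeline stage: the first number it receives is prime, and it passes on only the numbers that this prime does not divide. The name sieve_stages and the array-based interface are my own assumptions for illustration, not taken from the paper or the lab.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical sequential model of the pipeline.
// nums holds the stream entering the first stage (count values, modified in place).
// The primes found are stored in out, which needs room for count ints.
// Returns how many primes were found.
int
sieve_stages(int *nums, int count, int *out)
{
  assert(nums != NULL && out != NULL);
  int found = 0;
  while (count > 0) {
    int p = nums[0];                 // the first number a stage receives is prime
    out[found++] = p;
    int kept = 0;
    for (int i = 1; i < count; i++) {
      if (nums[i] % p != 0)          // p does not divide it: pass it to the right
        nums[kept++] = nums[i];
    }
    count = kept;                    // the survivors form the next stage's input
  }
  return found;
}
```

In the real solution every pass becomes its own process, and the array handed from one pass to the next becomes a pipe.
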
Solution

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

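void
primes(int fd){
  int prime, n;
  int p[2];
  // Read the first number from the parent, that is, the left neighbor.
  // If the pipe is already at end of file, this stage has nothing to do.
  if(read(fd, &prime, sizeof(int)) != sizeof(int)){
    close(fd);
    return;
  }
  // The first number read is always prime, so print it directly.
  fprintf(1, "prime %d\n", prime);
  // Create the pipe to the right neighbor.
  pipe(p);
  if(fork()==0){
    // Child (right neighbor): it only reads from the new pipe, so close
    // the write end and the inherited read end of the left pipe.
    close(p[1]);
    close(fd);
    // Recurse: the child becomes the next stage.
    primes(p[0]);
    exit(0);
  }
  // Parent: it only writes to the new pipe, so close the read end.
  close(p[0]);
  // Keep reading numbers from the left neighbor until end of file.
  while(read(fd, &n, sizeof(int)) == sizeof(int)){
    // Drop n if the prime printed by this process divides it;
    // otherwise pass it on to the child, the right neighbor.
    if(n % prime != 0){
      write(p[1], &n, sizeof(int));
    }
  }
  close(fd);
  // Closing the write end lets the child see end of file.
  close(p[1]);
  wait((int*)0);
}

int
main(int argc, char *argv[])
{
  int i;
  int p[2];
  // Create the pipe used to feed numbers to the first child.
  pipe(p);
  if(fork()==0){
    // Child: close the write end so that the parent holds the only one.
    close(p[1]);
    // The child runs the first stage of the sieve.
    primes(p[0]);
    exit(0);
  }else{
    // Parent: it only writes, so close the read end.
    close(p[0]);
    // Starting from 2, the first prime, write the numbers to the child, the right neighbor.
    for(i = 2; i <= 35; i++){
      write(p[1],&i, sizeof(int));
    }
    close(p[1]);
    wait((int *)0);
  }

  exit(0);
}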

find (moderate)

Lab description and hints

Write a simple version of the UNIX find program: find all the files in a directory tree with a specific name. Your solution should be in the file user/find.c.

Some hints:

 - Look at user/ls.c to see how to read directories.
 - Use recursion to allow find to descend into sub-directories.
 - Don't recurse into "." and "..".
 - Changes to the file system persist across runs of qemu; to get a
   clean file system run make clean and then make qemu.
 - You'll need to use C strings. Have a look at K&R (the C book), for
   example Section 5.5.
 - Note that == does not compare strings like in Python. Use strcmp()
   instead.
 - Add the program to UPROGS in Makefile.

Your solution is correct if produces the following output (when the file system contains the files b and a/b):

    $ make qemu
    ...
    init: starting sh
    $ echo > b
    $ mkdir a
    $ echo > a/b
    $ find . b
    ./b
    ./a/b
    $

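This exercise and the next are both rated moderate, so in theory they should be easier than the primes exercise. In practice I spent noticeably more time on them, mainly because both involve string handling and dynamic memory allocation. For someone who does not write C regularly, these operations all have to be looked up, which makes the implementation laborious.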

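Solution

The approach for this exercise is fairly simple:

1. Take the directory to search, x, and the target file name, y, as input.
2. Check whether x is a file. If it is a file, go to step 3; if it is a directory, go to step 4.
3. Compare the name of x with y, and print the path if they are equal.
4. x is a directory: open x and, for each file or directory inside it, recursively run this procedure, that is, go back to step 1.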

#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
#include "kernel/fs.h"
// Helper: return the file name without its path, e.g. input /a/b gives b.
char*
fmtname(char *path)
{
  static char buf[DIRSIZ+1];
  char *p;

  // Find first character after last slash.
  for(p=path+strlen(path); p >= path && *p != '/'; p--)
    ;
  p++;

  // Return the name, zero-padded to DIRSIZ bytes in a static buffer.
  if(strlen(p) >= DIRSIZ)
    return p;
  memmove(buf, p, strlen(p));
  memset(buf+strlen(p), 0, DIRSIZ-strlen(p));
  return buf;
}



void
find(char *path, char *file)
{
  char buf[128], *p;
  int fd;
  struct dirent de;
  struct stat st;

  // stat reports the type of path.
  if(stat(path, &st) <0 ){
    fprintf(2, "find: cannot stat %s\n", path);
    return;
  }
  switch(st.type){
    // A file: compare its name with the target file name.
    case T_FILE:
      if(!strcmp(fmtname(path), file)){
        fprintf(1, "%s\n", path);
      }
      break;
    // A directory: open it and call find recursively on every entry.
    case T_DIR:
      if(strlen(path) + 1 + DIRSIZ + 1 > sizeof buf){
        printf("find: path too long\n");
        break;
      }
      if((fd = open(path, 0)) < 0){
        fprintf(2, "find: cannot open %s\n", path);
        return;
      }
      strcpy(buf, path);
      p = buf+strlen(buf);
      *p++ = '/';
      while(read(fd, &de, sizeof(de)) == sizeof(de)){
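        // Skip empty entries and any name starting with '.', which filters out
        // . (the current directory) and .. (the parent directory).
        // This is a blunt filter: it also skips hidden files. The lab environment
        // has none, so the grading result is unaffected.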
        if(de.inum == 0 || de.name[0] == '.')
          continue;
        memmove(p, de.name, DIRSIZ);
        p[DIRSIZ] = 0;
        find(buf, file);
      }
      close(fd);
      break;
  }

}

int
main(int argc, char *argv[])
{
  if(argc < 3){
    fprintf(2, "usage: find path name\n");
    exit(1);
  }
  find(argv[1],argv[2]);
  exit(0);
}
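
If hidden files should still be found, the filter can be narrowed to exactly the two entries named in the hint. The sketch below is portable C for the host, and is_dot_entry is a hypothetical name; inside xv6 the same strcmp calls would come from user/user.h instead of <string.h>. For a DIRSIZ-byte de.name without a terminating NUL, both comparisons stop at the first mismatch, which occurs within the first three bytes.

```c
#include <assert.h>
#include <string.h>

// Hypothetical helper: returns 1 only for "." and "..", and 0 for every
// other name, including hidden files such as ".profile".
int
is_dot_entry(const char *name)
{
  assert(name != NULL);
  return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}
```

In find, the condition would then read de.inum == 0 || is_dot_entry(de.name).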

xargs (moderate)

Lab description and hints

Write a simple version of the UNIX xargs program: read lines from the standard input and run a command for each line, supplying the line as arguments to the command. Your solution should be in the file user/xargs.c.

The following example illustrates xarg's behavior:
    $ echo hello too | xargs echo bye
    bye hello too
    $

Some hints:
 - Use fork and exec to invoke the command on each line of input. Use
   wait in the parent to wait for the child to complete the command.
 - To read individual lines of input, read a character at a time until a
   newline ('\n') appears. kernel/param.h declares MAXARG, which may be
   useful if you need to declare an argv array.
 - Add the program to UPROGS in Makefile.
 - Changes to the file system persist across runs of qemu; to get a
   clean file system run make clean and then make qemu.

xargs, find, and grep combine well:
  $ find . b | xargs grep hello
  
will run "grep hello" on each file named b in the directories below ".".
To test your solution for xargs, run the shell script xargstest.sh. Your solution is correct if it produces the following output:
  $ make qemu
  ...
  init: starting sh
  $ sh < xargstest.sh
  $ $ $ $ $ $ hello
  hello
  hello
  $ $

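Like the previous exercise, this one involves memory allocation and string handling. My implementation splits strings, but the xv6 user library has no ready-made function for that, so I had to write it myself, which was tedious.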

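Solution

My solution is fairly crude; I simply followed the steps as they came to mind and did not consider whether there is a better way to implement it.

1. Take the command that follows xargs from the argv passed to main.
2. Read standard input (fd 0), which carries the output of the command before the pipe symbol |, so that it can be appended to the xargs command.
3. That output is split on the newline character '\n'. The command in argv is run once per line, with that line's words appended.
4. fork a child process and exec the assembled command.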

#include "kernel/types.h"
#include "kernel/stat.h"
#include "kernel/param.h"
#include "user/user.h"

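int
main(int argc, char *argv[])
{
  int i, n, total;
  char *p1, *p2;
  char buf[512], buf_child[512];
  // argv array for exec: the arguments after xargs, followed by the words of one input line.
  char* argvs[MAXARG];
  if(argc < 2){
    fprintf(2, "usage: xargs command [args...]\n");
    exit(1);
  }
  if(argc - 1 >= MAXARG){
    fprintf(2, "xargs: too many arguments\n");
    exit(1);
  }
  // Copy the arguments from argv into argvs, dropping "xargs" itself.
  // Allocate strlen+1 bytes so that the terminating NUL is copied too.
  for(i = 0; i < argc-1; i++){
    argvs[i] = malloc(strlen(argv[i+1]) + 1);
    strcpy(argvs[i], argv[i+1]);
  }
  // Read all of standard input (the output of the command before |),
  // keeping one byte free for the terminating NUL. Longer input is cut off.
  total = 0;
  while(total < sizeof(buf)-1 && (n = read(0, buf+total, sizeof(buf)-1-total)) > 0){
    total += n;
  }
  buf[total] = 0;
  // p1 points at the start of the line currently being processed.
  p1 = buf;
  // Split buf on newlines. Text after the last newline is not processed.
  while((p2 = strchr(p1,'\n'))!= (char*)0){
    // p2 points at the newline; replace it with a string terminator.
    *p2 = 0;
    if(fork()==0){
      // The child has its own copy of the address space, so changes to p1, p2, i
      // and argvs here do not affect the parent or the other children.
      // Copy the current line into buf_child and continue from there.
      strcpy(buf_child, p1);
      p1 = buf_child;
      // Split the line on spaces and append each word to argvs.
      while(i < MAXARG-1 && (p2 = strchr(p1,' ')) != (char*)0){
        *p2 = 0;
        argvs[i] = malloc(strlen(p1) + 1);
        strcpy(argvs[i], p1);
        p1 = p2 + 1;
        i++;
      }
      // Append the word after the last space.
      if(i < MAXARG-1){
        argvs[i] = malloc(strlen(p1) + 1);
        strcpy(argvs[i], p1);
        i++;
      }
      // exec expects the argument list to end with a null pointer.
      argvs[i] = 0;
      exec(argvs[0],argvs);
      // exec only returns if it failed.
      fprintf(2,"xargs: exec %s failed\n", argvs[0]);
      exit(1);
    }else{
      // Parent: move on to the next line and wait for the child to finish.
      p1 = p2 + 1;
      wait((int*)0);
    }
  }
  exit(0);
}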
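
The word-splitting step can also be factored out into a helper that tokenizes a line in place, which avoids the per-word malloc. The sketch below is portable C for the host; split_words and its interface are hypothetical, not something xv6 provides. Unlike the loop above, it skips runs of spaces instead of producing empty words.

```c
#include <assert.h>
#include <string.h>

// Hypothetical helper: split line in place on spaces.
// Stores at most max-1 word pointers in out, followed by a NULL terminator,
// and returns the number of words stored. The pointers refer into line.
int
split_words(char *line, char **out, int max)
{
  assert(line != NULL && out != NULL && max > 0);
  int n = 0;
  char *p = line;
  while (*p != '\0' && n < max - 1) {
    while (*p == ' ')                // skip any run of spaces
      p++;
    if (*p == '\0')
      break;
    out[n++] = p;                    // start of a word
    char *sp = strchr(p, ' ');
    if (sp == NULL)
      break;                         // last word on the line
    *sp = '\0';                      // terminate the word in place
    p = sp + 1;
  }
  out[n] = NULL;                     // exec-style null terminator
  return n;
}
```

In xargs, calling it as split_words(buf_child, argvs + i, MAXARG - i) would append the words of one line after the fixed arguments and terminate the list in a single step.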