CSAPP | Lab2-Bomb Lab详细解析

OrangeEarth

于 2024-04-24 13:53:42 发布

阅读量1.9k

点赞数 58

分类专栏： CSAPP 文章标签： CSAPP c语言深入理解计算机系统汇编语言 Bomb Lab gdb

本文链接：https://blog.csdn.net/wangs29/article/details/138156587

版权

CSAPP 专栏收录该内容

2 篇文章 0 订阅

订阅专栏

预备阶段

1.Lab要求

邪恶的邪恶博士在我们班的机器上安放了大量的“二元炸弹”。二进制炸弹是一个由一系列阶段组成的程序。每个阶段都要求你在 stdin 上键入一个特定的字符串。如果你键入了正确的字符串，那么这个阶段就会被拆除，炸弹就会进入下一个阶段。否则，炸弹会通过打印“ BOOM 然后终止。当每个阶段都被拆除时，炸弹就会被拆除。
我们要处理的炸弹太多了，所以我们给每个学生一个炸弹去拆除。你们的任务，你们别无选择，只能接受，就是在due date前拆除炸弹。祝你好运，欢迎加入拆弹小组！

Step1：了解bomb.tar文件

使用以下命令解压获取的压缩包: tar-xvf bomk.tar，这将创建一个名为./bomb的文件夹，包含以下文件:

• README

• bomb: The executable binary bomb.

• bomb.c: Source file with the bomb’s main routine and a friendly greeting from Dr.Evil.

Step 2: 拆除炸弹

你在这个实验室的工作就是拆除炸弹。你必须在其中一台机器上完成作业。事实上，有传言说邪恶博士真的很邪恶，如果把炸弹运到别处，它总是会爆炸。据我们所知，炸弹里还有其他一些防篡改装置。

你可以使用许多工具来帮助你拆除炸弹。请看提示部分获取一些提示和idea。最好的方法是使用调试器逐步通过反汇编的二进制文件拆除炸弹。

每次你的炸弹爆炸，它通知炸弹服务器，你失去了 1/2 点(最多 20 点)的最终分数。所以引爆炸弹是有后果的。你一定要小心！

前四个阶段各值10 分。第五阶段和第六阶段比较困难，所以每个阶段都值15 分。所以你能得到的最高分是70 分。

虽然每个阶段炸弹变得越来越难拆除，但是你从一个phase到另一个phase所获得的经验应该可以缓解这个困难。然而，最后一个phase甚至会难倒最优秀的学生，所以请不要等到最后一分钟才开始。

炸弹会忽略空白输入行。例如，如果您使用命令行参数运行炸弹,

linux> ./bomb psol.txt 
linux > ./bomb psol.txt

然后它将从psol.txt读取输入行，直到达到 EOF (文件结束) ，然后切换到 stdin，这样你就不必一直重复输入你已经拆除的阶段的解决方案。

为了避免意外引爆炸弹，你需要学习如何单步通过汇编代码和如何设置断点。你还需要学习如何检查寄存器和内存状态。做实验的一个好的副作用就是你会非常擅长使用调试器。这是一项至关
重要的技能，将在你的职业生涯中带来丰厚的回报。

2.gdb调试器基本操作

下面的内容来自csapp官网所放出来的文件中。

Summary of GDB commands for x86-64 Systems

Starting:

gdb
gdb <file>

Running and stopping

Command	Effect
quit	Exit gdb
run	Run program
run 1 2 3	Run program with command-line arguments 1 2 3
kill	Stop the program
Ctrl-d	Exit gdb

Note: Ctrl-C does not exit from gdb, but halts the current gdb command

Breakpoints

Command	Effect
break sum	Set breakpoint at the entry to function sum
break *0x80483c3	Set breakpoint at address 0x80483c3
delete 1	Delete breakpoint 1
disable 1	Disable the breakpoint 1(gdb numbers each breakpoint you create)
enable 1	Enable breakpoint 1
delete	Delete all breakpoints
clear sum	Clear any breakpoints at the entry to function sum

Execution

Command	Effect
stepi	Execute one instruction
stepi 4	Execute four instructions
nexti	Like stepi, but proceed through function calls without stopping
step	Execute one C statement
continue	Resume execution until the next breakpoint
until 3	Continue executing until program hits breakpoint 3
finish	Resume execution until current function returns
call sum(1, 2)	Call sum(1,2) and print return value Examining code
disas	Disassemble current function
disas sum	Disassemble function sum
disas 0x80483b7	Disassemble function around 0x80483b7
disas 0x80483b7 0x80483c7	Disassemble code within specified address range
print /x $rip	Print program counter in hex
print /d $rip	Print program counter in decimal
print /t $rip	Print program counter in binary

Examining data

Command	Effect
print /d $rax	Print contents of %rax in decimal
print /x $rax	Print contents of %rax in hex
print /t $rax	Print contents of %rax in binary
print /d (int)$rax	Print contents of %rax in decimal after sign-extending lower 32-bits.
print 0x100	Print decimal representation of 0x100
print /x 555	Print hex representation of 555
print /x ($rsp+8)	Print (contents of %rsp) + 8 in hex
print (int ) 0xbffff890	Print integer at address 0xbffff890
print (int ) ($rsp+8)	Print integer at address %rsp + 8
print (char *) 0xbfff890	Examine a string stored at 0xbffff890
x/w 0xbffff890	Examine (4-byte) word starting at address 0xbffff890
x/w $rsp	Examine (4-byte) word starting at address in $rsp
x/wd $rsp	Examine (4-byte) word starting at address in $rsp. Print in decimal
x/2w $rsp	Examine two (4-byte) words starting at address in $rsp
x/2wd $rsp	Examine two (4-byte) words starting at address in $rsp. Print in decimal
x/g $rsp	Examine (8-byte) word starting at address in $rsp.
x/gd $rsp	Examine (8-byte) word starting at address in $rsp. Print in decimal
x/a $rsp	Examine address in $rsp. Print as offset from previous global symbol.
x/s 0xbffff890	Examine a string stored at 0xbffff890
x/20b sum	Examine first 20 opcode bytes of function sum
x/10i sum	Examine first 10 instructions of function sum

(Note: the format string for the ‘x’ command has the general form 
 x/[NUM][SIZE][FORMAT] where

NUM  = number of objects to display
SIZE = size of each object (b=byte, h=half-word, w=word,
                            g=giant (quad-word))
FORMAT = how to display each object (d=decimal, x=hex, o=octal, etc.)
If you don’t specify SIZE or FORMAT, either a default value, or the last
value you specified in a previous ‘print’ or ‘x’ command is used.
)

You need print /d (int)$rax to print 32-bit, negative numbers stored in the lower 32 bits of %rax. For example, if the lower 32-bits of %rax store 0xffffffff, you will see 
(gdb) print $rax
$1 = 4294967295
(gdb) print (int)$rax
$2 = -1
(gdb)

Useful information

Command	Effect
backtrace	Print the current address and stack backtrace
where	Print the current address and stack backtrace
info program	Print current status of the program
info functions	Print functions in program
info stack	Print backtrace of the stack
info frame	Print information about the current stack frame
info registers	Print registers and their contents
info breakpoints	Print status of user-settable breakpoints
display /FMT EXPR	Print expression EXPR using format FMT every time GDB stops
undisplay	Turn off display mode
help	Get information about gdb

3.bomb.c文件解析

#include <stdio.h>
#include <stdlib.h>
#include "support.h"
#include "phases.h"

/* 
 * Note to self: Remember to erase this file so my victims will have no
 * idea what is going on, and so they will all blow up in a
 * spectaculary fiendish explosion. -- Dr. Evil 
 */

FILE *infile;

int main(int argc, char *argv[])
{
    char *input;

    /* Note to self: remember to port this bomb to Windows and put a 
     * fantastic GUI on it. */

    /* When run with no arguments, the bomb reads its input lines 
     * from standard input. */
    if (argc == 1) {  
	    infile = stdin;
    } 

    /* When run with one argument <file>, the bomb reads from <file> 
     * until EOF, and then switches to standard input. Thus, as you 
     * defuse each phase, you can add its defusing string to <file> and
     * avoid having to retype it. */
    else if (argc == 2) {
	    if (!(infile = fopen(argv[1], "r"))) {
	      printf("%s: Error: Couldn't open %s\n", argv[0], argv[1]);
	      exit(8);
	    }
    }

    /* You can't call the bomb with more than 1 command line argument. */
    else {
	    printf("Usage: %s [<input_file>]\n", argv[0]);
	      exit(8);
    }

    /* Do all sorts of secret stuff that makes the bomb harder to defuse. */
    initialize_bomb();

    printf("Welcome to my fiendish little bomb. You have 6 phases with\n");
    printf("which to blow yourself up. Have a nice day!\n");

    /* Hmm...  Six phases must be more secure than one phase! */
    input = read_line();             /* Get input                   */
    phase_1(input);                  /* Run the phase               */
    phase_defused();                 /* Drat!  They figured it out!
				      * Let me know how they did it. */
    printf("Phase 1 defused. How about the next one?\n");

    /* The second phase is harder.  No one will ever figure out
     * how to defuse this... */
    input = read_line();
    phase_2(input);
    phase_defused();
    printf("That's number 2.  Keep going!\n");

    /* I guess this is too easy so far.  Some more complex code will
     * confuse people. */
    input = read_line();
    phase_3(input);
    phase_defused();
    printf("Halfway there!\n");

    /* Oh yeah?  Well, how good is your math?  Try on this saucy problem! */
    input = read_line();
    phase_4(input);
    phase_defused();
    printf("So you got that one.  Try this one.\n");
    
    /* Round and 'round in memory we go, where we stop, the bomb blows! */
    input = read_line();
    phase_5(input);
    phase_defused();
    printf("Good work!  On to the next...\n");

    /* This phase will never be used, since no one will get past the
     * earlier ones.  But just in case, make this one extra hard. */
    input = read_line();
    phase_6(input);
    phase_defused();

    /* Wow, they got it!  But isn't something... missing?  Perhaps
     * something they overlooked?  Mua ha ha ha ha! */
    
    return 0;
}

拆除炸弹阶段

首先我们使用gdb命令进入调试模式：
在这里插入图片描述

Phase_1

首先我们通过disas phase_1命令反汇编phase_1函数得到如下结果：

在这里插入图片描述
下面我们对重要部分进行解析：

首先，因为有call 0x401338 <strings_not_equal>要调用string_not_equal函数，这一块很容易猜到就是看字符串是否相同，相同的话寄存器%rax返回0，否则返回1,而通过后面的逻辑我们知道只有这两个字符串相同时才能跳过炸弹。
那么传入strings_not_equal函数的参数是什么呢，一般来说是指向两个字符串的指针，参数1寄存器%rdi就是我们的输入字符串，参数二寄存器就是%rsi，mov $0x402400,%esi说明我们将地址$0x402400存入%esi中，那么地址$0x402400处的字符串就是我们索要输入的字符串。

使用x/s 0x402400命令查看此处的字符串：

在这里插入图片描述

因此phase_1最后的答案就是：Border relations with Canada have never been better.

Phase_2

同样我们通过disas phase_2命令反汇编phase_2函数得到如下结果：

在这里插入图片描述

可以看到在+9位置调用了read_six_numbers函数，反汇编这个函数可以得到：

在这里插入图片描述

前两行保存一下被调用者保存寄存器，然后将栈指针%rsp减小$0x28，相当于在栈里面预留出来了28个字节的空间，现在先画出栈帧，用十六进制数简单表示相对于%rsp的偏移字节。

在这里插入图片描述

然后将栈指针%rsp移入%esi，调用函数read_six_numbers。

read_six_numbers

首先明确函数参数，第一个参数通过寄存器%rdi传入，也就是我们的输入字符串，第二个参数存在%rsi中，通过前面描述可知就是上面所画phase_2函数的栈指针

简单分析该函数逻辑把mov $0x4025c3,%esi之前的转化为c风格混合汇编代码：

  int *rdx = rsi;     //%rdx存放图中%rsp处地址
  int *rcx = rsi + 4; //其实也就是%rcx存放图中0x4处地址
  int *rax = rsi + 0x14; //其实也就是%rax存放图中0x14处地址
  (%rsp + 8) = rax; //当前栈指针%rsp所指地址 + 8处存放图中0x14处地址
  rax = rsi + 0x10;//%rax存放图中0x10处地址
  (%rsp) = rax;//当前栈指针%rsp所指地址处存放图中0x10处地址
  int *r9 = rsi + 0xc;
  int *r8 = rsi + 0x8;

上面做的事情其实很简单，就是把上面phase_2栈帧%rsp，%rsp+4，%rsp+8，…%rsp+0x14分别存在%rdx，%rcx, %r8, %r9（后四个参数传递寄存器）和’read_six_number’的栈帧顶上两个位置。
接下来使用<__isoc99_sscanf@plt>函数读取数据，第一个参数%rdi存放我们输入字符串，第二个参数%rsi保存地址$0x4025c3，用命令看一下里面的内容如下：

在这里插入图片描述

再根据后面的比较返回值cmp $0x5,%eax操作大致可以理解我们需要输入6个整型数据，分别将他们存在phase_2栈帧%rsp，%rsp+4，%rsp+8，…%rsp+0x14的位置。

分析完read_six_numbers函数后现在我们返回主函数。

phase_2主函数

   0x0000000000400f0a <+14>:    cmpl   $0x1,(%rsp)
   0x0000000000400f0e <+18>:    je     0x400f30 <phase_2+52>
   0x0000000000400f10 <+20>:    call   0x40143a <explode_bomb>

这一段说明第一个参数一定是1，否则会<explode_bomb>，因此arg1 = 1

接下来跳转到+52处：


   0x0000000000400f30 <+52>:    lea    0x4(%rsp),%rbx
   0x0000000000400f35 <+57>:    lea    0x18(%rsp),%rbp
   0x0000000000400f3a <+62>:    jmp    0x400f17 <phase_2+27>

%rbx存放%rsp + 0x4的地址， %rbp存放%rsp + 0x18，猜测%rbp可能起一个终止指针的作用,然后跳到+27，应该是要开始循环了

   0x0000000000400f17 <+27>:    mov    -0x4(%rbx),%eax
   0x0000000000400f1a <+30>:    add    %eax,%eax
   0x0000000000400f1c <+32>:    cmp    %eax,(%rbx)
   0x0000000000400f1e <+34>:    je     0x400f25 <phase_2+41>
   0x0000000000400f20 <+36>:    call   0x40143a <explode_bomb>
   0x0000000000400f25 <+41>:    add    $0x4,%rbx
   0x0000000000400f29 <+45>:    cmp    %rbp,%rbx
   0x0000000000400f2c <+48>:    jne    0x400f17 <phase_2+27>
   0x0000000000400f2e <+50>:    jmp    0x400f3c <phase_2+64>

最开始%rbx存放%rsp + 0x4的地址，首先将%rbx - 0x4(也就是%rsp)指向的内容放入%eax，通过上面的分析知道也就是arg1 = 1，%eax存放1
接下来add %eax,%eax，%eax存放2
比较%rsp + 0x4指向的内容和%eax指向的内容（也就是arg2与2*arg1），只有二者相等才能跳过炸弹，于是arg2 = 2
接下来add $0x4,%rbx并进行终止条件的比较cmp %rbp,%rbx开启循环，循环到最后：

arg1 = 1
arg2 = 2 * arg1 = 2
arg3 = 2 * arg2 = 4
arg4 = 2 * arg3 = 8
arg5 = 2 * arg4 = 16
arg6 = 2 * arg5 = 32

因此phase_2的最终答案为：1 2 4 8 16 32

Phase_3

同样我们通过disas phase_3命令反汇编phase_3函数得到如下结果：

在这里插入图片描述

还是先大概画出栈帧：

在这里插入图片描述

下面来一步步分析，第一部分和phase_2里read_six_numbers函数思路极为相像：

   0x0000000000400f43 <+0>:     sub    $0x18,%rsp
   0x0000000000400f47 <+4>:     lea    0xc(%rsp),%rcx
   0x0000000000400f4c <+9>:     lea    0x8(%rsp),%rdx
   0x0000000000400f51 <+14>:    mov    $0x4025cf,%esi
   0x0000000000400f56 <+19>:    mov    $0x0,%eax
   0x0000000000400f5b <+24>:    call   0x400bf0 <__isoc99_sscanf@plt>
   0x0000000000400f60 <+29>:    cmp    $0x1,%eax
   0x0000000000400f63 <+32>:    jg     0x400f6a <phase_3+39>
   0x0000000000400f65 <+34>:    call   0x40143a <explode_bomb>

%esi所存地址的内容如下：

在这里插入图片描述

因此我们知道上面一块的作用实际上是指出我们的输入是两个整数，并且分别存在栈帧的标号0x8和0xc的位置。

接下来继续分析：

   0x0000000000400f6a <+39>:    cmpl   $0x7,0x8(%rsp)
   0x0000000000400f6f <+44>:    ja     0x400fad <phase_3+106>
   (+106处汇编指令call 0x40143a<explode_bomb>)
   0x0000000000400f71 <+46>:    mov    0x8(%rsp),%eax
   0x0000000000400f75 <+50>:    jmp    *0x402470(,%rax,8)

上面代码中+39位置的比较告诉我们0 <= arg1 <= 7（这是因为ja是无符号数的跳转指令），否则炸弹就要爆炸然后把arg1存入%eax中，并且跳转到地址0x402470 + 8 * %rax所存的的内容所表示的地址中，因为arg1的取值范围是0-7，因此我们的跳转地址有可能保存在0x402470-0x0x4024a8地址空间中，通过x/8gx指令查看可以得到如下结果：

在这里插入图片描述

接着看下面的代码：

   0x0000000000400f7c <+57>:    mov    $0xcf,%eax
   0x0000000000400f81 <+62>:    jmp    0x400fbe <phase_3+123>
   0x0000000000400f83 <+64>:    mov    $0x2c3,%eax
   0x0000000000400f88 <+69>:    jmp    0x400fbe <phase_3+123>
   0x0000000000400f8a <+71>:    mov    $0x100,%eax
   0x0000000000400f8f <+76>:    jmp    0x400fbe <phase_3+123>
   0x0000000000400f91 <+78>:    mov    $0x185,%eax
   0x0000000000400f96 <+83>:    jmp    0x400fbe <phase_3+123>
   0x0000000000400f98 <+85>:    mov    $0xce,%eax
   0x0000000000400f9d <+90>:    jmp    0x400fbe <phase_3+123>
   0x0000000000400f9f <+92>:    mov    $0x2aa,%eax
   0x0000000000400fa4 <+97>:    jmp    0x400fbe <phase_3+123>
   0x0000000000400fa6 <+99>:    mov    $0x147,%eax
   0x0000000000400fab <+104>:   jmp    0x400fbe <phase_3+123>
   0x0000000000400fad <+106>:   call   0x40143a <explode_bomb>
   0x0000000000400fb2 <+111>:   mov    $0x0,%eax
   0x0000000000400fb7 <+116>:   jmp    0x400fbe <phase_3+123>
   0x0000000000400fb9 <+118>:   mov    $0x137,%eax
   0x0000000000400fbe <+123>:   cmp    0xc(%rsp),%eax
   0x0000000000400fc2 <+127>:   je     0x400fc9 <phase_3+134>
   0x0000000000400fc4 <+129>:   call   0x40143a <explode_bomb>

当arg1 = 0时，跳转到0x0000000000400f7c，%eax = 0xcf = 207
当arg1 = 1时，跳转到0x0000000000400fb9，%eax = 0x137 = 311
当arg1 = 2时，跳转到0x0000000000400f83，%eax = 0x2e3 = 707
当arg1 = 3时，跳转到0x0000000000400f8a，%eax = 0x100 = 256
当arg1 = 4时，跳转到0x0000000000400f91，%eax = 0x185 = 389
当arg1 = 5时，跳转到0x0000000000400f98，%eax = 0x137 = 311
当arg1 = 6时，跳转到0x0000000000400f9f，%eax = 0xce = 206
当arg1 = 7时，跳转到0x0000000000400fa6，%eax = 0x147 = 327

最后进行条件判断，arg2 = %eax时才能正常返回，所以本题有八种可能的情形，也就是有八个正确答案，任填一个即可。

Phase_4

同样我们通过disas phase_4命令反汇编phase_4函数得到如下结果：

在这里插入图片描述

+32之前的内容和phase_3完全一致，其实也就是我们的输入是两个整数，将两个整数分别存放在phase_4栈帧%rsp + 0x8和%rsp + 0xc指向的位置上。

   0x000000000040102e <+34>:    cmpl   $0xe,0x8(%rsp)
   0x0000000000401033 <+39>:    jbe    0x40103a <phase_4+46>
   0x0000000000401035 <+41>:    call   0x40143a <explode_bomb>

这一段逻辑也很简单，其实就是告诉我们arg1 <= 0xe = 14，否则炸弹就会爆炸，接下来跳到+46继续看：

   0x000000000040103a <+46>:    mov    $0xe,%edx
   0x000000000040103f <+51>:    mov    $0x0,%esi
   0x0000000000401044 <+56>:    mov    0x8(%rsp),%edi
   0x0000000000401048 <+60>:    call   0x400fce <func4>
   0x000000000040104d <+65>:    test   %eax,%eax
   0x000000000040104f <+67>:    jne    0x401058 <phase_4+76>
   0x0000000000401051 <+69>:    cmpl   $0x0,0xc(%rsp)
   0x0000000000401056 <+74>:    je     0x40105d <phase_4+81>
   0x0000000000401058 <+76>:    call   0x40143a <explode_bomb>
   0x000000000040105d <+81>:    add    $0x18,%rsp
   0x0000000000401061 <+85>:    ret

接下来主要是调用func4函数，第一个参数通过%rdi传递，其实也就是我们的输入的第一个参数arg1，第二个参数通过%rsi传递为0，第三个参数通过%rdx传递，为0xe = 14，再往后看一段就能发现我们要求返回值为0并且arg2也为0，接下来就是求出使func4函数返回值为0的arg1，反汇编func4函数如下：

在这里插入图片描述

将其转化为类似c风格的代码如下：

//esi = 0,    edx = 14   edi:arg1（小于等于14）
//要求返回值eax = 0;
int fun4(int edi, int esi, int edx) {
    int eax = edx - esi;    //eax = 14 - 0 = 14
    int ecx = eax >> 0x1f;  // ecx = rsult右移31位 = 0
    eax += ecx;             //eax = 14 + 0 = 14
    eax >>= 1;    //sal没有用cl寄存器或者立即数指定移位数那么移位数默认为1，因此eax = eax/2 = 7
    ecx = eax + esi; //ecx = 7 + 0 = 7
    if (edi >= ecx) //if(edi >= 7)
        goto L1; 
    edx = ecx - 1; //edx = 7 - 1 = 6
    eax = fun4(edi, esi, edx); //eax = fun4(edi, 0, 6)
    eax = eax * 2;
    goto done;  
L1:
    eax = 0;
    if (edi <= ecx) //if(edi <= 7)
        goto done;
    else {
        esi = ecx + 1;//esi = 7 + 1 = 8
        eax = fun4(edi, esi, edx);//eax = fun4(edi, 8, 14);
        eax = 2 * eax + 1;
    }
done:
    return eax;
}

通过这版代码我们只能看出来arg1 = 7的时候是满足题意的，这里我们再往深入去分析一下其他的可能取值，再增加抽象化改写一下上面的c风格代码：

int func4(int x, int a, int b) {
   int diff = b - a;
   diff = (diff + (diff >> 31)) / 2;
   int mid = a + diff;
   if (x < mid) 
      return 2 * func4(x, a, mid - 1);
   else if (x > mid)
      return 2 * func4(x, a, mid - 1) + 1;
   else return 0;
}

虽然其中有一些细节不解其意（对！说的就是右移31位的操作），但是好在已经能够比较清晰地看出来这是一个二分法的函数，当一直在区间左半边时返回0。

当 x = arg1 = 7是，第一次调用就返回0，这个也是我们一眼就看出来的答案
否则x < 7返回值才有可能为0，此时调用func4(arg1, 0, 6)，x = 3时直接返回0
否则x < 3返回值才有可能为0，此时调用func4(arg1, 0, 2)，x = 1时直接返回0
否则x < 1返回值才有可能为0，此时调用func4(arg1, 0, 0)，x = 0时返回0

综合上面说的，我们的最后答案应该有4种可能：

Phase_5

同样我们通过disas phase_5命令反汇编phase_5函数得到如下结果：
在这里插入图片描述

为了方便理解我们还是根据汇编代码先画一下栈帧：

在这里插入图片描述
然后继续分析接下来这一段：

   0x0000000000401078 <+22>:    xor    %eax,%eax //清零返回值寄存器%rax
   0x000000000040107a <+24>:    call   0x40131b <string_length>
   0x000000000040107f <+29>:    cmp    $0x6,%eax
   0x0000000000401082 <+32>:    je     0x4010d2 <phase_5+112>
   0x0000000000401084 <+34>:    call   0x40143a <explode_bomb>

将输入字符串作为参数调用<string_length>函数，如果返回值不等于6就会爆炸，因此可以判断我们的输入为一个长度为6的字符串，接下来跳转到+112处开始执行。

   0x0000000000401067 <+5>:     mov    %rdi,%rbx //这一段也贴上方便看%rbx的值

   0x00000000004010d2 <+112>:   mov    $0x0,%eax
   0x00000000004010d7 <+117>:   jmp    0x40108b <phase_5+41>

   0x000000000040108b <+41>:    movzbl (%rbx,%rax,1),%ecx
   0x000000000040108f <+45>:    mov    %cl,(%rsp)
   0x0000000000401092 <+48>:    mov    (%rsp),%rdx
   0x0000000000401096 <+52>:    and    $0xf,%edx
   0x0000000000401099 <+55>:    movzbl 0x4024b0(%rdx),%edx
   0x00000000004010a0 <+62>:    mov    %dl,0x10(%rsp,%rax,1)
   0x00000000004010a4 <+66>:    add    $0x1,%rax
   0x00000000004010a8 <+70>:    cmp    $0x6,%rax
   0x00000000004010ac <+74>:    jne    0x40108b <phase_5+41>
   0x00000000004010ae <+76>:    movb   $0x0,0x16(%rsp)

上面出现了关于操作地址0x4024b0附近的内容的代码，我们先贴出来以防后面要用到：

在这里插入图片描述

假设我们的输入字符串为str，那么上面的改写成c风格代码为：

for (int i = 0; i < 6; ++i) {
   int edx = (int)str[i] & 0xf; //相当于%rdx的低四字保存着str[i]的低四字，str[i]高四字其实是没有起什么作用的，那么答案肯定很多可能
   (%rsp + 0x10 + i) = (char)(0x4024b0 + edx);
}

写的有点丑陋了，说人话就是先将str[i]的低四字取出来，然后在栈%rsp + 0x10 + i指向的位置存入地址（0x4024b0 + str[i]的低四字处）处的字符，直到进行六次。然后将0移入%rsp + 0x16处（太挤了0就不写了），栈帧如下：
在这里插入图片描述

接下来继续看：

   0x00000000004010b3 <+81>:    mov    $0x40245e,%esi
   0x00000000004010b8 <+86>:    lea    0x10(%rsp),%rdi
   0x00000000004010bd <+91>:    call   0x401338 <strings_not_equal>
   0x00000000004010c2 <+96>:    test   %eax,%eax
   0x00000000004010c4 <+98>:    je     0x4010d9 <phase_5+119>
   0x00000000004010c6 <+100>:   call   0x40143a <explode_bomb>

这一段用到了phase_1出现的函数，通过这一段逻辑我们就能知道栈上的6个char和地址0x40245e的字符相同。

在这里插入图片描述

那么再回头看

在这里插入图片描述
我们就能知道六个字符取了偏移量为9，15，14，5，6，7处的字符，于是输入的str的六个字符的低四字分别为：1001，1111，1110，0101，0110，0111
为了在前面补上0110或者0100根据ASCII值表就能得到两个比较正常的答案（当然答案有很多很多种，我只是取了比较好写的两种）

ionefg
IONEFG

接下来就是跳转到结尾判断一下金丝雀值是否被破坏整个程序就结束啦！(^▽)

Phase_6

同样我们通过disas phase_6命令反汇编phase_6函数，得到的结果超级超级长，我省略掉保存与弹出被调用者保存寄存器的步骤贴出来如下：

   0x00000000004010fc <+8>:     sub    $0x50,%rsp
   0x0000000000401100 <+12>:    mov    %rsp,%r13
   0x0000000000401103 <+15>:    mov    %rsp,%rsi
   0x0000000000401106 <+18>:    call   0x40145c <read_six_numbers>
   0x000000000040110b <+23>:    mov    %rsp,%r14
   0x000000000040110e <+26>:    mov    $0x0,%r12d
   0x0000000000401114 <+32>:    mov    %r13,%rbp
   0x0000000000401117 <+35>:    mov    0x0(%r13),%eax
   0x000000000040111b <+39>:    sub    $0x1,%eax
   0x000000000040111e <+42>:    cmp    $0x5,%eax
   0x0000000000401121 <+45>:    jbe    0x401128 <phase_6+52>
   0x0000000000401123 <+47>:    call   0x40143a <explode_bomb>
   0x0000000000401128 <+52>:    add    $0x1,%r12d
   0x000000000040112c <+56>:    cmp    $0x6,%r12d
   0x0000000000401130 <+60>:    je     0x401153 <phase_6+95>
   0x0000000000401132 <+62>:    mov    %r12d,%ebx
   0x0000000000401135 <+65>:    movslq %ebx,%rax
   0x0000000000401138 <+68>:    mov    (%rsp,%rax,4),%eax
   0x000000000040113b <+71>:    cmp    %eax,0x0(%rbp)
   0x000000000040113e <+74>:    jne    0x401145 <phase_6+81>
   0x0000000000401140 <+76>:    call   0x40143a <explode_bomb>
   0x0000000000401145 <+81>:    add    $0x1,%ebx
   0x0000000000401148 <+84>:    cmp    $0x5,%ebx
   0x000000000040114b <+87>:    jle    0x401135 <phase_6+65>
   0x000000000040114d <+89>:    add    $0x4,%r13
   0x0000000000401151 <+93>:    jmp    0x401114 <phase_6+32>
   0x0000000000401153 <+95>:    lea    0x18(%rsp),%rsi
   0x0000000000401158 <+100>:   mov    %r14,%rax
   0x000000000040115b <+103>:   mov    $0x7,%ecx
   0x0000000000401160 <+108>:   mov    %ecx,%edx
   0x0000000000401162 <+110>:   sub    (%rax),%edx
   0x0000000000401164 <+112>:   mov    %edx,(%rax)
   0x0000000000401166 <+114>:   add    $0x4,%rax
   0x000000000040116a <+118>:   cmp    %rsi,%rax
   0x000000000040116d <+121>:   jne    0x401160 <phase_6+108>
   0x000000000040116f <+123>:   mov    $0x0,%esi
   0x0000000000401174 <+128>:   jmp    0x401197 <phase_6+163>
   0x0000000000401176 <+130>:   mov    0x8(%rdx),%rdx
   0x000000000040117a <+134>:   add    $0x1,%eax
   0x000000000040117d <+137>:   cmp    %ecx,%eax
   0x000000000040117f <+139>:   jne    0x401176 <phase_6+130>
   0x0000000000401181 <+141>:   jmp    0x401188 <phase_6+148>
   0x0000000000401183 <+143>:   mov    $0x6032d0,%edx
   0x0000000000401188 <+148>:   mov    %rdx,0x20(%rsp,%rsi,2)
   0x000000000040118d <+153>:   add    $0x4,%rsi
   0x0000000000401191 <+157>:   cmp    $0x18,%rsi
   0x0000000000401195 <+161>:   je     0x4011ab <phase_6+183>
   0x0000000000401197 <+163>:   mov    (%rsp,%rsi,1),%ecx
   0x000000000040119a <+166>:   cmp    $0x1,%ecx
   0x000000000040119d <+169>:   jle    0x401183 <phase_6+143>
   0x000000000040119f <+171>:   mov    $0x1,%eax
   0x00000000004011a4 <+176>:   mov    $0x6032d0,%edx
   0x00000000004011a9 <+181>:   jmp    0x401176 <phase_6+130>
   0x00000000004011ab <+183>:   mov    0x20(%rsp),%rbx
   0x00000000004011b0 <+188>:   lea    0x28(%rsp),%rax
   0x00000000004011b5 <+193>:   lea    0x50(%rsp),%rsi
   0x00000000004011ba <+198>:   mov    %rbx,%rcx
   0x00000000004011bd <+201>:   mov    (%rax),%rdx
   0x00000000004011c0 <+204>:   mov    %rdx,0x8(%rcx)
   0x00000000004011c4 <+208>:   add    $0x8,%rax
   0x00000000004011c8 <+212>:   cmp    %rsi,%rax
   0x00000000004011cb <+215>:   je     0x4011d2 <phase_6+222>
   0x00000000004011cd <+217>:   mov    %rdx,%rcx
   0x00000000004011d0 <+220>:   jmp    0x4011bd <phase_6+201>
   0x00000000004011d2 <+222>:   movq   $0x0,0x8(%rdx)
   0x00000000004011da <+230>:   mov    $0x5,%ebp
   0x00000000004011df <+235>:   mov    0x8(%rbx),%rax
   0x00000000004011e3 <+239>:   mov    (%rax),%eax
   0x00000000004011e5 <+241>:   cmp    %eax,(%rbx)
   0x00000000004011e7 <+243>:   jge    0x4011ee <phase_6+250>
   0x00000000004011e9 <+245>:   call   0x40143a <explode_bomb>
   0x00000000004011ee <+250>:   mov    0x8(%rbx),%rbx
   0x00000000004011f2 <+254>:   sub    $0x1,%ebp
   0x00000000004011f5 <+257>:   jne    0x4011df <phase_6+235>
   0x00000000004011f7 <+259>:   add    $0x50,%rsp

确实很长，但是不要被吓怕，所有的东西都是可以被一步一步拆解滴(╹▽╹)

最开始非常温和地使用了一个我们见过的函数read_six_numbers，回顾一下它的作用：把上面phase_4栈帧%rsp，%rsp+4，%rsp+8，…%rsp+0x14分别存在%rdx，%rcx, %r8, %r9（后四个参数传递寄存器）和’read_six_number’的栈帧顶上两个位置。
接下来使用<__isoc99_sscanf@plt>函数读取数据，第一个参数%rdi存放我们输入字符串，第二个参数%rsi保存地址$0x4025c3，用命令看一下里面的内容如下：
在这里插入图片描述
再根据后面的比较返回值cmp $0x5,%eax操作大致可以理解我们需要输入6个整型数据，分别将他们存在phase_4栈帧%rsp，%rsp+4，%rsp+8，…%rsp+0x14的位置。

   0x0000000000401100 <+12>:    mov    %rsp,%r13

   0x000000000040110e <+26>:    mov    $0x0,%r12d
   0x0000000000401114 <+32>:    mov    %r13,%rbp
   0x0000000000401117 <+35>:    mov    0x0(%r13),%eax
   0x000000000040111b <+39>:    sub    $0x1,%eax
   0x000000000040111e <+42>:    cmp    $0x5,%eax
   0x0000000000401121 <+45>:    jbe    0x401128 <phase_6+52>
   0x0000000000401123 <+47>:    call   0x40143a <explode_bomb>
   0x0000000000401128 <+52>:    add    $0x1,%r12d
   0x000000000040112c <+56>:    cmp    $0x6,%r12d
   0x0000000000401130 <+60>:    je     0x401153 <phase_6+95>
   0x0000000000401132 <+62>:    mov    %r12d,%ebx
   0x0000000000401135 <+65>:    movslq %ebx,%rax
   0x0000000000401138 <+68>:    mov    (%rsp,%rax,4),%eax
   0x000000000040113b <+71>:    cmp    %eax,0x0(%rbp)
   0x000000000040113e <+74>:    jne    0x401145 <phase_6+81>
   0x0000000000401140 <+76>:    call   0x40143a <explode_bomb>
   0x0000000000401145 <+81>:    add    $0x1,%ebx
   0x0000000000401148 <+84>:    cmp    $0x5,%ebx
   0x000000000040114b <+87>:    jle    0x401135 <phase_6+65>
   0x000000000040114d <+89>:    add    $0x4,%r13
   0x0000000000401151 <+93>:    jmp    0x401114 <phase_6+32>

先来看第一大段，仔细分析一下，将这段代码转化为c风格代码如下，假设我们输入的6个整数组成了一个数组为input[6]。

int index = 0
while (index != 6) {
   if (input[index] - 1 <= 5) goto L1；
   explode_bomb();
L1:
   index++;
   for (int i = index; i < 6; ++i) {
      if (input[index - 1] == input[i]) explode_bomb();
   }
}

第一段的if迫使(arg1 - 1)-(arg6 - 1)都小于等于5，又因为使用的是jbe所以针对无符号整数，因此arg1-arg6都是1-6之间的整数。
第二段的if是我们输入的这六个整数互不相同，因此我们输入的arg1-arg6是整数1-6的一个排列

   0x000000000040110b <+23>:    mov    %rsp,%r14

   0x0000000000401153 <+95>:    lea    0x18(%rsp),%rsi
   0x0000000000401158 <+100>:   mov    %r14,%rax
   0x000000000040115b <+103>:   mov    $0x7,%ecx
   0x0000000000401160 <+108>:   mov    %ecx,%edx
   0x0000000000401162 <+110>:   sub    (%rax),%edx
   0x0000000000401164 <+112>:   mov    %edx,(%rax)
   0x0000000000401166 <+114>:   add    $0x4,%rax
   0x000000000040116a <+118>:   cmp    %rsi,%rax
   0x000000000040116d <+121>:   jne    0x401160 <phase_6+108>

逻辑非常简单，其实也就是将7-arg作用于arg上再存入栈中，设现在栈上的元素设为newarg[1]-newarg[6]。接下来就到了重头戏，也是我个人认为这段代码最难理解的部分，下面调整一下汇编代码顺序让它符合跳转逻辑：

   0x000000000040116f <+123>:   mov    $0x0,%esi
   0x0000000000401174 <+128>:   jmp    0x401197 <phase_6+163>

   0x0000000000401197 <+163>:   mov    (%rsp,%rsi,1),%ecx
   0x000000000040119a <+166>:   cmp    $0x1,%ecx
   0x000000000040119d <+169>:   jle    0x401183 <phase_6+143>

for newarg[i] == 1：%rsi存放4i %ecx存放newarg[i]
---
   0x0000000000401183 <+143>:   mov    $0x6032d0,%edx
   0x0000000000401188 <+148>:   mov    %rdx,0x20(%rsp,%rsi,2) //将地址$0x6032d0存入栈上%rsp + 0x20 + (i - 1) * 8的位置
   0x000000000040118d <+153>:   add    $0x4,%rsi
   0x0000000000401191 <+157>:   cmp    $0x18,%rsi
   0x0000000000401195 <+161>:   je     0x4011ab <phase_6+183>
---

for newarg[i] > 1: %rsi存放4i %ecx存放newarg[i]
---
   0x000000000040119f <+171>:   mov    $0x1,%eax
   0x00000000004011a4 <+176>:   mov    $0x6032d0,%edx
   0x00000000004011a9 <+181>:   jmp    0x401176 <phase_6+130>

   0x0000000000401176 <+130>:   mov    0x8(%rdx),%rdx
   0x000000000040117a <+134>:   add    $0x1,%eax
   0x000000000040117d <+137>:   cmp    %ecx,%eax
   0x000000000040117f <+139>:   jne    0x401176 <phase_6+130>
   0x0000000000401181 <+141>:   jmp    0x401188 <phase_6+148>

   0x0000000000401188 <+148>:   mov    %rdx,0x20(%rsp,%rsi,2) //将地址$0x6032d0 + 8 * (newarg[i] - 1)存入栈上%rsp + 0x20 + (i - 1) * 8的位置
   0x000000000040118d <+153>:   add    $0x4,%rsi
   0x0000000000401191 <+157>:   cmp    $0x18,%rsi
   0x0000000000401195 <+161>:   je     0x4011ab <phase_6+183>
---

先来看一下地址0x6032d0里面到底存了什么东西：

在这里插入图片描述

仔细观察可以看出来这是一个单链表。那么这段代码实现了一个什么功能呢？
简单来说就是注释里写的这一段所实现的功能将地址$0x6032d0 + 8 * (newarg[i] - 1)存入栈上%rsp + 0x20 + i * 8的位置。
在这里插入图片描述

   0x00000000004011ab <+183>:   mov    0x20(%rsp),%rbx//%rbx是栈上存的内容但是也是地址
   0x00000000004011b0 <+188>:   lea    0x28(%rsp),%rax//%rax是栈上的地址
   0x00000000004011b5 <+193>:   lea    0x50(%rsp),%rsi//终止指针
   
   0x00000000004011ba <+198>:   mov    %rbx,%rcx

   0x00000000004011bd <+201>:   mov    (%rax),%rdx 
   0x00000000004011c0 <+204>:   mov    %rdx,0x8(%rcx) 
   0x00000000004011c4 <+208>:   add    $0x8,%rax
   0x00000000004011c8 <+212>:   cmp    %rsi,%rax
   0x00000000004011cb <+215>:   je     0x4011d2 <phase_6+222>
   0x00000000004011cd <+217>:   mov    %rdx,%rcx
   0x00000000004011d0 <+220>:   jmp    0x4011bd <phase_6+201>

   0x00000000004011d2 <+222>:   movq   $0x0,0x8(%rdx)

这一块整理成C风格代码如下：

for (int i = 1; i < 6; ++i) {
   node[newarg[i]].next = &node[newarg[i + 1]];
}
node[6].next = nullptr;

终于要到最后一块了！

   0x00000000004011ab <+183>:   mov    0x20(%rsp),%rbx

   0x00000000004011da <+230>:   mov    $0x5,%ebp
   0x00000000004011df <+235>:   mov    0x8(%rbx),%rax
   0x00000000004011e3 <+239>:   mov    (%rax),%eax
   0x00000000004011e5 <+241>:   cmp    %eax,(%rbx)
   0x00000000004011e7 <+243>:   jge    0x4011ee <phase_6+250>
   0x00000000004011e9 <+245>:   call   0x40143a <explode_bomb>
   0x00000000004011ee <+250>:   mov    0x8(%rbx),%rbx
   0x00000000004011f2 <+254>:   sub    $0x1,%ebp
   0x00000000004011f5 <+257>:   jne    0x4011df <phase_6+235>

由这一块逻辑可以得到在上面画的栈上由下到上所表示的node的值降序排列，通过node.val如下进行排序（注意仅对低四字节进行排序）：
在这里插入图片描述

node3.val > node4.val > node5.val > node6.val > node1.val > node2.val

那么在栈上的对应关系如图所示：

在这里插入图片描述

那么newarg1-newarg6分别为 3 4 5 6 1 2

arg1-arg6分别为4 3 2 1 6 5，这就是phase_6最终的答案。

嘿嘿完结撒花！✿✿ヽ(°▽°)ノ✿

但是真的只是这样吗？回顾一下bomb.c文件，在破解phase_6之后：

    input = read_line();
    phase_6(input);
    phase_defused();

    /* Wow, they got it!  But isn't something... missing?  Perhaps
     * something they overlooked?  Mua ha ha ha ha! */

所以遗漏了什么呢？这个phase_defused函数好像从来没有反汇编去了解过，那么到此我们才算进入最后一个阶段。

Secret_phase

反汇编得到如下代码：

在这里插入图片描述

当我们查看0x402619的内容时发现：
在这里插入图片描述

这说明我们的输入是两个整数+一个字符串，并且分析可知分别存在了%rsp +0x8，%rsp + 0xc，%rsp + 0x10的位置，字符串的信息通过分析显然很清晰长度为6，通过查看0x402622的内容：
在这里插入图片描述

现在字符串已经很清晰了，就剩下两个整数，可当我们去查看0x603870的内容会发现他为空，猜测这个是因为这个地址在未运行时是没有放入东西的，放入的东西是在前几个phase运行产生的，那么我们就在call 0x400bf0 <__isoc99_sscanf@plt>前面设置一个断点让其运行到此然后再查看该内存地址处的内容。

在这里插入图片描述

可以神奇地发现这个就是我们phase_4求出来的答案，因此可以知道我们要进入secret_phase的话就要在phase_4的答案后面加上一个字符串DrEvil，然后我们就可以在最后处理secret_phase了！phased_defused剩下的没什么重要的，我们直接进入secret_phase的反汇编代码：

在这里插入图片描述

这个函数的主干其实是fun7函数，我们简要介绍一下总体的逻辑：

从stdin或文件读入一行字符然后将其转化为long类型并且限制器范围大于0小于等于0x3e9 = 1001
然后调用fun7函数，第一个参数为地址0x6030f0，第二个参数为第一步得到的long类型整数
当fun7函数返回值为2时我们正常结束函数，否则炸弹爆炸

fun7函数如下：

在这里插入图片描述

重点其实就是我们传进去的那个地址，查看一下里面的内容：

在这里插入图片描述

观察一下可以发现这其实是一个二叉树，画图如下：

在这里插入图片描述

那么将fun7的逻辑转成c风格代码：

struct node {
   long val;
   node *left;
   node *right;
};

int fun7(node *head, long val) {
   if (head->val == val) {
      return 0;
   } else if (head->val < val) {
      return 2 * fun7(head->right, val) + 1;
   } else {
      return 2 * fun7(head->left, val);
   }
}

结合二叉树和代码可知只有22可以使得返回值为2，所以secret_phase最后的答案就是22。

最后总体来运行一下，ans.txt的答案如下（多个答案的选了一部分写）：

在这里插入图片描述

最后结果：

在这里插入图片描述

Summary

最后来一点碎碎念，终于终于写完了，这个Lab从上周五就开始写，本来把第三章又过了一遍自信满满地以为一天就可以搞定，结果因为中间写的很暴躁已经各种汇报啊朋友来参观啊之类的事情，拖到今天才写完+整理完，总体来说前几个phase和secret_phase基本上是水到渠成，但是phase_6卡了好久好久，中间node和参数的对应真的搞得人头大，但好在最后也是搞完了，看到汇编代码不再头疼了而且也学会了一些gdb调试的技巧，我可真棒呀，继续加油ヾ(◍°∇°◍)ﾉﾞ

OrangeEarth

关注

58
点赞
踩
48

收藏

觉得还不错? 一键收藏
1
评论
CSAPP | Lab2-Bomb Lab详细解析

邪恶的邪恶博士在我们班的机器上安放了大量的“二元炸弹”。二进制炸弹是一个由一系列阶段组成的程序。每个阶段都要求你在 stdin 上键入一个特定的字符串。如果你键入了正确的字符串，那么这个阶段就会被拆除，炸弹就会进入下一个阶段。否则，炸弹会通过打印“ BOOM 然后终止。当每个阶段都被拆除时，炸弹就会被拆除。我们要处理的炸弹太多了，所以我们给每个学生一个炸弹去拆除。你们的任务，你们别无选择，只能接受，就是在due date前拆除炸弹。祝你好运，欢迎加入拆弹小组！
复制链接

扫一扫