Reverse Engineering: A Demonstration of Recovering Source Code

Original post, 2006-05-24 09:03:00

Basic binary reconstruction from assembler
          by c0ntex | c0ntexb@gmail.com      
------------------------------------------

This paper gives a quick overview of how to reverse engineer a simple .exe binary. By reading the disassembly
of a binary, it is often straightforward to gain a basic understanding of the executable, which in turn allows
source code to be reconstructed that is close to what the developer wrote.

This example uses a small program, so it is easy to do. On a larger exe it would take much longer and a more
rigorous review of the assembler would be required.


// IDA assembler dump
; seh.exe
.text:0040102B                 push    ebp
.text:0040102C                 mov     ebp, esp  
.text:0040102E                 sub     esp, 0Ch
.text:00401031                 cmp     [ebp+argc], 2
.text:00401035                 jz      short loc_40104B
.text:00401037                 push    offset aUsageSeh_exeBu ; "Usage: seh.exe <buffer>\n"
.text:0040103C                 call    _printf
.text:00401041                 add     esp, 4
.text:00401044                 push    1               ; int
.text:00401046                 call    _exit

.text:0040104B loc_40104B:                             ; CODE XREF: _main+Aj
.text:0040104B                 lea     eax, [ebp+var_C] 
.text:0040104E                 push    eax             ; char * 
.text:0040104F                 mov     ecx, [ebp+argv] 
.text:00401052                 push    ecx             ; int 
.text:00401053                 call    sub_401000 
.text:00401058                 add     esp, 8 
.text:0040105B                 xor     eax, eax 
.text:0040105D                 mov     esp, ebp
.text:0040105F                 pop     ebp 
.text:00401060                 retn 
.text:00401060 _main           endp

.text:00401000 sub_401000      proc near               ; CODE XREF: _main+28p
.text:00401000
.text:00401000 arg_0           = dword ptr  8
.text:00401000 arg_4           = dword ptr  0Ch
.text:00401000
.text:00401000                 push    ebp
.text:00401001                 mov     ebp, esp
.text:00401003                 mov     eax, [ebp+arg_0]
.text:00401006                 mov     ecx, [eax+4]
.text:00401009                 push    ecx             ; char *
.text:0040100A                 mov     edx, [ebp+arg_4]
.text:0040100D                 push    edx             ; char *
.text:0040100E                 call    _strcpy
.text:00401013                 add     esp, 8
.text:00401016                 mov     eax, [ebp+arg_4]
.text:00401019                 push    eax
.text:0040101A                 push    offset aMySehf00IsBett ; "\nMy sehf00 is better than your sehf00 -"...
.text:0040101F                 call    _printf
.text:00401024                 add     esp, 8
.text:00401027                 xor     eax, eax
.text:00401029                 pop     ebp
.text:0040102A                 retn
.text:0040102A sub_401000      endp


From the above assembler we can start to replay the instructions into the equivalent C and arrive at a fairly
good, though not exact, representation of the source (C, in this case) used to build the executable.

If you are unsure what a piece of code does, you can write a test C file and run it through IDA to compare
the resulting instructions against what you expected to see.
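
A throwaway test file for this purpose might look like the sketch below. The function name probe and the
buffer size are made up for illustration and are not part of seh.exe. The idea is to isolate one construct
(here a 12-byte local buffer and a library call taking pointer arguments), compile it without optimisation,
and compare its disassembly with the listing you are unsure about.

```c
#include <string.h>

/* Hypothetical probe function: one 12-byte local and a few library calls,
 * so the prologue and the order of the pushes are easy to spot in IDA. */
static int probe(const char *src)
{
    char buf[12];                       /* expect a stack adjustment of 0Ch or more */

    strncpy(buf, src, sizeof buf - 1);  /* bounded copy, so the probe itself is safe */
    buf[sizeof buf - 1] = '\0';         /* strncpy does not always terminate */
    return (int)strlen(buf);
}
```

Calling probe from a trivial main and building with optimisation disabled keeps the generated code close to
the source, which makes the comparison easier.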

Annotating only the first part of main to show how it is done:


push    ebp                    ; save the caller's frame pointer
mov     ebp, esp               ; set up our own frame (procedure prologue)
sub     esp, 0Ch               ; allocate 12 bytes for local variables
cmp     [ebp+argc], 2          ; is argc equal to 2 (program name plus one argument)?
jz      short loc_40104B       ; if so, jump to loc_40104B
push    offset aUsageSeh_exeBu ; "Usage: seh.exe <buffer>\n"; otherwise push the usage message
call    _printf                ; print the usage message
add     esp, 4                 ; remove printf's 4-byte argument from the stack (cdecl cleanup)
push    1                      ; int; push the exit status
call    _exit                  ; exit


probably giving us:


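int main(int argc, char **argv)
{
        if (argc != 2) {
                printf("Usage: seh.exe <buffer>\n");
                exit(1);        /* the leading underscore in _exit is cdecl name decoration, as with _printf */
        }
        /* ... remainder of main, not yet recovered ... */
}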


Do this with each function until the entire image has been turned back into source. Working through all of
the assembler in this way gives something like the following:


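#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int locfunc(char **argv, char *varc);

int main(int argc, char **argv)
{
        char varc[12];          /* sub esp, 0Ch */

        if (argc != 2) {
                printf("Usage: seh.exe <buffer>\n");
                exit(1);
        }
        locfunc(argv, varc);
        return(0);
}

int locfunc(char **argv, char *varc)
{
        strcpy(varc, argv[1]);  /* unbounded copy into 12 bytes, exactly as in the binary */
        /* IDA abbreviates this string in the listing; the full text is stored at aMySehf00IsBett.
           varc is the extra argument pushed just before it. */
        printf("\nMy sehf00 is better than your sehf00 -> [%s]\n", varc);
        return(0);
}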

As you can see, we have a fairly complete piece of code. The program below is the original source used to
compile seh.exe, and it is obvious that we got very close to it. This shows that even when you do not have
the application's source to hand, it is still fairly easy to recreate by using a disassembler to see exactly
how the program fits together.
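
One detail from the sub_401000 listing is worth spelling out: the pair "mov eax, [ebp+arg_0]" and
"mov ecx, [eax+4]" is how argv[1] shows up in the assembler. The sketch below mirrors those two instructions
in C. The helper name second_slot is invented for illustration. The +4 in the listing is the size of one
pointer on 32-bit x86, and the sketch uses sizeof(char *) instead so that it also behaves on other targets.

```c
#include <stddef.h>

/* Mirrors "mov eax, [ebp+arg_0]" followed by "mov ecx, [eax+4]":
 * load the pointer stored one slot past the start of the argv array. */
static char *second_slot(char **argv)
{
    char *base = (char *)argv;                 /* eax = argv              */
    return *(char **)(base + sizeof(char *));  /* ecx = [eax + 4] on x86  */
}
```

For any argv with at least two entries, second_slot(argv) yields the same pointer as argv[1], which is why
the strcpy source in the reconstruction is written as argv[1].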

This can be an important skill if you want to examine or modify an executable for which you do not have
source code. Say this application was a piece of malware or a worm that you found on your system. It would
be useful to understand how it worked and perhaps how it got there. Reverse engineering it can make its
functionality apparent quickly and allow you to determine how the program works and what it does.

// Original seh.exe source
#include<stdio.h>
#include<string.h>
#include<windows.h>

int blah(char *argv[], char *sehheh)
{
        strcpy(sehheh, argv[1]);
 
        printf("\nMy sehf00 is better than your sehf00 -> [%s]\n", sehheh);

        return(0);
}

int main(int argc, char *argv[]){

        char sehheh[12];

        if (argc != 2){
                printf("Usage: myseh.exe <buffer>\n");
                exit(1);
        }

        blah(argv, sehheh);

        return(0);
}

I hope this short paper has been useful in showing how easy it is to perform some basic reverse engineering
of a compiled executable.
