Source code download: GitHub
1. Introduction to RetDec
RetDec is a retargetable machine-code decompiler based on LLVM.
The decompiler is not limited to any particular target architecture, operating system, or executable file format:
- Supported file formats: ELF, PE, Mach-O, COFF, AR (archive), Intel HEX, and raw machine code
- Supported architectures:
  - 32-bit: Intel x86, ARM, MIPS, PIC32, and PowerPC
  - 64-bit: x86-64, ARM64 (AArch64)
Features:
- Static analysis of executable files with detailed information.
- Compiler and packer detection.
- Loading and instruction decoding.
- Signature-based removal of statically linked library code.
- Extraction and utilization of debugging information (DWARF, PDB).
- Reconstruction of instruction idioms.
- Detection and reconstruction of C++ class hierarchies (RTTI, vtables).
- Demangling of symbols from C++ binaries (GCC, MSVC, Borland).
- Reconstruction of functions, types, and high-level constructs.
- Integrated disassembler.
- Output in two high-level languages: C and a Python-like language.
- Generation of call graphs, control-flow graphs, and various statistics.
2. Installing RetDec (Windows)
Requirements:
Windows
- Microsoft Visual C++ (version >= Visual Studio 2017 version 15.7)
- CMake (version >= 3.6) (installation tutorial)
- Git
- OpenSSL (version >= 1.1.1)
- Python (version >= 3.4)
- Optional: Doxygen and Graphviz for generating API documentation
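Before building, it can help to check that the command-line tools from the list above are reachable on PATH. The following is only a minimal sketch: the helper names and the executable names passed to them (cmake, git) are assumptions about a typical installation, and the script does not verify the minimum versions of those tools.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_is_new_enough(minimum=(3, 4)):
    """Compare the running interpreter with the minimum version listed above."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    # Executable names are assumptions; adjust them to match your installation.
    print("Missing tools:", missing_tools(["cmake", "git"]))
    print("Python >= 3.4:", python_is_new_enough())
```

An empty "Missing tools" list only means the executables were found; Visual C++ and OpenSSL still need to be checked by hand.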
Note: Although RetDec now supports a system-wide installation, unless you use your distribution’s package manager to install it, we recommend installing RetDec locally into a designated directory. The reason for this is that uninstallation will be easier as you will only need to remove a single directory.
To perform a local installation, run cmake with the -DCMAKE_INSTALL_PREFIX=<path> parameter, where <path> is the directory into which RetDec will be installed (e.g. $HOME/projects/retdec-install on Linux and macOS, and C:\projects\retdec-install on Windows).
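As a rough illustration of the local-installation step, the sketch below assembles the cmake configure command as an argument list. The function name and the source directory ".." are assumptions for illustration; only the -DCMAKE_INSTALL_PREFIX parameter comes from the text above.

```python
from pathlib import Path


def cmake_configure_command(source_dir, install_prefix):
    """Build the cmake argument list for a local (non-system-wide) install."""
    return [
        "cmake",
        str(source_dir),
        "-DCMAKE_INSTALL_PREFIX=" + str(install_prefix),
    ]


if __name__ == "__main__":
    # Both paths are placeholders; the prefix follows the example in the text.
    prefix = Path.home() / "projects" / "retdec-install"
    print(" ".join(cmake_configure_command("..", prefix)))
```

The resulting list could be passed to subprocess.run(command, check=True) from inside a build directory.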
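I tried compiling RetDec myself, but the build failed, so I used an earlier prebuilt RetDec release that does not need to be compiled.
I chose the latest Release version: Releases v4.0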
Using RetDec (Windows)
README.md describes how to decompile a binary from the command line.
To decompile a binary file named test.exe, run the following command (ensure that python runs Python 3; as an alternative, you can try py -3):
python $RETDEC_INSTALL_DIR/bin/retdec-decompiler.py test.exe
For more information, run retdec-decompiler.py with --help.
RETDEC_INSTALL_DIR is the path where RetDec v4.0 was extracted.
The folder contains four subfolders: bin, include, lib, and share.
retdec-decompiler.py is in the bin folder.
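To make the directory layout concrete, here is a small sketch that builds the same command line from Python. The function name decompiler_command is hypothetical, and reading RETDEC_INSTALL_DIR from the environment is an assumption about how the path is made available.

```python
import os
import sys
from pathlib import Path


def decompiler_command(input_file, install_dir=None):
    """Build the v4.0-style command line for retdec-decompiler.py.

    If install_dir is omitted, the RETDEC_INSTALL_DIR environment variable is used.
    """
    install_dir = install_dir or os.environ.get("RETDEC_INSTALL_DIR")
    if not install_dir:
        raise ValueError("RETDEC_INSTALL_DIR is not set and no install_dir was given")
    script = Path(install_dir) / "bin" / "retdec-decompiler.py"
    # sys.executable is the interpreter running this script, so the decompiler
    # runs under Python 3 whenever this script itself does.
    return [sys.executable, str(script), str(input_file)]
```

Passing the returned list to subprocess.run would start the decompilation; the list form avoids shell quoting problems with paths that contain spaces.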
As an example, I used the [ binary built from the Coreutils project, and it worked.
python $RETDEC_INSTALL_DIR/bin/retdec-decompiler.py test.exe
However, I wanted to decompile the binary from within PyCharm.
Open .../bin/retdec-decompiler.py:
if __name__ == '__main__':
    decompiler = Decompiler(sys.argv[1:])
    sys.exit(decompiler.decompile())
This shows that the command-line arguments are passed to Decompiler().
sys.argv is a Python list; all command-line arguments are stored in it in the order they were given.
sys.argv reference
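A minimal sketch of how sys.argv is laid out may help here: element 0 is the script name, and the slice [1:] holds the arguments that follow it. The helper name split_argv is made up for this illustration.

```python
import sys


def split_argv(argv):
    """Separate the script name (argv[0]) from the arguments that follow it."""
    return argv[0], argv[1:]


if __name__ == "__main__":
    script, args = split_argv(sys.argv)
    print("script:", script)
    print("arguments:", args)
```

This is why the decompiler script passes sys.argv[1:]: the script's own name is dropped and only the user-supplied arguments remain.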
As you can see, sys.argv is a list, so I changed this code to:
if __name__ == '__main__':
    # Hard-coded arguments replace sys.argv[1:]: the input binary, then the output path.
    Binary = [r".../[", "--output=...//Output//["]
    decompiler = Decompiler(Binary)
    sys.exit(decompiler.decompile())
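Rather than hard-coding the list, the arguments could be assembled by a small helper. This is only a sketch under stated assumptions: build_decompiler_args is a hypothetical name, and naming the output after the input binary simply mirrors the modified snippet above.

```python
from pathlib import Path


def build_decompiler_args(binary, output_dir):
    """Return the argument list that would otherwise come from sys.argv[1:]."""
    binary = Path(binary)
    # Name the output after the input binary, as in the snippet above.
    output = Path(output_dir) / binary.name
    return [str(binary), "--output=" + str(output)]
```

Inside retdec-decompiler.py, the result would be passed as Decompiler(build_decompiler_args(...)), in place of the Binary list.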
After the run, the information produced during decompilation is stored in five files, which can be found in the configured Output folder.
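To see what a run actually produced, the output folder can be listed from Python. The sketch below makes no assumption about the names of the generated files; list_output_files is a hypothetical helper.

```python
from pathlib import Path


def list_output_files(output_dir):
    """Return the names of the regular files in output_dir, sorted alphabetically."""
    return sorted(p.name for p in Path(output_dir).iterdir() if p.is_file())
```

Calling it on the Output folder after a decompilation should return the generated file names.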
Generating control-flow graphs and call graphs
Requirement: Graphviz
Reference
python retdec-decompiler.py --backend-emit-cfg --backend-emit-cg file.exe
Control-flow graph: --backend-emit-cfg
Call graph: --backend-emit-cg
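The graphs are emitted as .dot files, which Graphviz can render to images. The sketch below builds and runs the standard dot command for each .dot file in a directory; the helper names are hypothetical, and which directory the .dot files end up in is left to the caller.

```python
import shutil
import subprocess
from pathlib import Path


def dot_render_command(dot_file, fmt="png"):
    """Build the Graphviz `dot` command that renders dot_file to an image."""
    dot_file = Path(dot_file)
    target = dot_file.with_suffix("." + fmt)
    return ["dot", "-T" + fmt, str(dot_file), "-o", str(target)]


def render_all(directory, fmt="png"):
    """Render every .dot file in directory; return the commands that were run."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz 'dot' was not found on PATH")
    commands = [dot_render_command(p, fmt)
                for p in sorted(Path(directory).glob("*.dot"))]
    for command in commands:
        subprocess.run(command, check=True)
    return commands
```

Each image is written next to its .dot file, with only the final extension replaced.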
RetDec v5.0
To decompile a binary file named test.exe, run
$RETDEC_INSTALL_DIR\bin\retdec-decompiler.exe test.exe
For more information, run retdec-decompiler.exe with --help.
--help
Mandatory arguments:
INPUT_FILE File to decompile.
General arguments:
[-o|--output FILE] Output file (default: INPUT_FILE.c if OUTPUT_FORMAT is plain, INPUT_FILE.c.json if OUTPUT_FORMAT is json|json-human).
[-s|--silent] Turns off informative output of the decompilation.
[-f|--output-format OUTPUT_FORMAT] Output format [plain|json|json-human] (default: plain).
[-m|--mode MODE] Force the type of decompilation mode [bin|raw] (default: bin).
[-p|--pdb FILE] File with PDB debug information.
[-k|--keep-unreachable-funcs] Keep functions that are unreachable from the main function.
[--cleanup] Removes temporary files created during the decompilation.
[--config] Specify JSON decompilation configuration file.
[--disable-static-code-detection] Prevents detection of statically linked code.
Selective decompilation arguments:
[--select-ranges RANGES] Specify a comma separated list of ranges to decompile (example: 0x100-0x200,0x300-0x400,0x500-0x600).
[--select-functions FUNCS] Specify a comma separated list of functions to decompile (example: fnc1,fnc2,fnc3).
[--select-decode-only] Decode only selected parts (functions/ranges). Faster decompilation, but worse results.
Raw or Intel HEX decompilation arguments:
[-a|--arch ARCH] Specify target architecture [mips|pic32|arm|thumb|arm64|powerpc|x86|x86-64].
    Required if it cannot be autodetected from the input (e.g. raw mode, Intel HEX).
[-e|--endian ENDIAN] Specify target endianness [little|big].
    Required if it cannot be autodetected from the input (e.g. raw mode, Intel HEX).
[-b|--bit-size SIZE] Specify target bit size [16|32|64] (default: 32).
    Required if it cannot be autodetected from the input (e.g. raw mode).
[--raw-section-vma ADDRESS] Virtual address where section created from the raw binary will be placed.
[--raw-entry-point ADDRESS] Entry point address used for raw binary (default: architecture dependent).
Archive decompilation arguments:
[--ar-index INDEX] Pick file from archive for decompilation by its zero-based index.
[--ar-name NAME] Pick file from archive for decompilation by its name.
[--static-code-sigfile FILE] Adds additional signature file for static code detection.
Backend arguments:
[--backend-disabled-opts LIST] Prevents the optimizations from the given comma-separated list of optimizations to be run.
[--backend-enabled-opts LIST] Runs only the optimizations from the given comma-separated list of optimizations.
[--backend-call-info-obtainer NAME] Name of the obtainer of information about function calls [optim|pessim] (Default: optim).
[--backend-var-renamer STYLE] Used renamer of variables [address|hungarian|readable|simple|unified] (Default: readable).
[--backend-no-opts] Disables backend optimizations.
[--backend-emit-cfg] Emits a CFG for each function in the backend IR (in the .dot format).
[--backend-emit-cg] Emits a CG for the decompiled module in the backend IR (in the .dot format).
[--backend-keep-all-brackets] Keeps all brackets in the generated code.
[--backend-keep-library-funcs] Keep functions from standard libraries.
[--backend-no-time-varying-info] Do not emit time-varying information, like dates.
[--backend-no-var-renaming] Disables renaming of variables in the backend.
[--backend-no-compound-operators] Do not emit compound operators (like +=) instead of assignments.
[--backend-no-symbolic-names] Disables the conversion of constant arguments to their symbolic names.
Decompilation process arguments:
[--timeout SECONDS]
[--max-memory MAX_MEMORY] Limits the maximal memory used by the given number of bytes.
[--no-memory-limit] Disables the default memory limit (half of system RAM).
LLVM IR debug arguments:
[--print-after-all] Dump LLVM IR to stderr after every LLVM pass.
[--print-before-all] Dump LLVM IR to stderr before every LLVM pass.
Other arguments:
[-h|--help] Show this help.
[--version] Show RetDec version.
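As an illustration of how a few of these options combine, the sketch below assembles a retdec-decompiler.exe command line from Python. The function names and parameters are hypothetical; the flags themselves (--output, --output-format, --select-functions, --select-ranges, --cleanup) are taken from the help output above, written with double hyphens.

```python
from pathlib import Path


def format_ranges(ranges):
    """Format (start, end) address pairs in the form 0x100-0x200,0x300-0x400."""
    return ",".join("{:#x}-{:#x}".format(start, end) for start, end in ranges)


def decompiler_v5_command(install_dir, input_file, output=None,
                          output_format=None, functions=None, ranges=None,
                          cleanup=False):
    """Build a retdec-decompiler.exe command from options in the help text."""
    exe = Path(install_dir) / "bin" / "retdec-decompiler.exe"
    command = [str(exe), str(input_file)]
    if output is not None:
        command += ["--output", str(output)]
    if output_format is not None:
        command += ["--output-format", output_format]
    if functions:
        command += ["--select-functions", ",".join(functions)]
    if ranges:
        command += ["--select-ranges", format_ranges(ranges)]
    if cleanup:
        command.append("--cleanup")
    return command
```

The returned list can be handed to subprocess.run(command, check=True); options left at their defaults are simply omitted, so RetDec's own defaults apply.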