Reverse engineering a PyInstaller-packed exe to recover its source code

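Python is easy to use and has a rich set of third-party libraries, so many people write small tools, and sometimes malware, in Python. Occasionally I need to reverse an exe that was compiled and packaged from a Python program, so I am recording the process here.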

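PyInstaller

PyInstaller is a tool that packages a Python program into a standalone executable, which then runs on machines that have no Python interpreter installed. You can turn a Python script into an executable for Windows, macOS, or Linux (built on the respective platform), which makes distributing and sharing your application easier.

PyInstaller can be installed with pip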

pip install pyinstaller

Usage is simple; a single command packages a Python program

pyinstaller your_script.py

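Package into a single file, specifying an icon and a data file:

pyinstaller -Fw --icon=h.ico your_script.py --add-data="h.ico;."

Package into a folder, specifying an icon and a data file:

pyinstaller -w --icon=h.ico your_script.py --add-data="h.ico;."

After packaging, the generated executable is in the dist directory and can be run directly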
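
The same builds can also be started from Python. The sketch below only assembles the argument list for the commands above; the helper names are made up for illustration, and it assumes that os.pathsep (";" on Windows, ":" elsewhere) is an accepted separator for --add-data in the installed PyInstaller version.

```python
import os


def build_pyinstaller_args(script, icon=None, onefile=True, windowed=True, data=()):
    """Return the argument list for the commands shown above.

    `data` is an iterable of (source, destination_dir) pairs.
    """
    args = [script]
    if onefile:
        args.append("--onefile")      # long form of -F
    if windowed:
        args.append("--windowed")     # long form of -w
    if icon:
        args.append(f"--icon={icon}")
    for src, dest in data:
        # Assumption: os.pathsep is accepted as the SOURCE/DEST separator
        args.append(f"--add-data={src}{os.pathsep}{dest}")
    return args


def run_pyinstaller(args):
    # Imported lazily so the helper above works without PyInstaller installed
    import PyInstaller.__main__
    PyInstaller.__main__.run(args)


if __name__ == "__main__":
    print(build_pyinstaller_args("your_script.py", icon="h.ico", data=[("h.ico", ".")]))
```

Calling run_pyinstaller(build_pyinstaller_args(...)) should behave like the single-file command above, provided PyInstaller is installed in the same environment.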

Exeinfo PE

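Exeinfo PE is a tool for inspecting PE files. It shows the compiler information of an EXE/DLL file, whether it is packed, the entry point address, the export and import tables, and other PE details, which helps with analysing and reversing a program. Exeinfo PE can also extract resources embedded in a PE file, such as images, EXEs, archives, MSI, and SWF files.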

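A packer-detection tool like this can tell you whether a program was packaged with PyInstaller. Once that is confirmed, the method below can be used to unpack and decompile it.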
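
If no such tool is at hand, a rough check can be scripted. This is a minimal sketch, not a replacement for a real detector: it looks for the archive cookie that pyinstxtractor.py itself searches for, and it assumes the build has not altered that cookie. The function names are illustrative.

```python
from pathlib import Path

# Magic cookie that PyInstaller writes into the archive appended to the executable
# (the same constant pyinstxtractor.py searches for).
PYINSTALLER_COOKIE = b"MEI\014\013\012\013\016"


def looks_like_pyinstaller(data: bytes) -> bool:
    """Return True if the PyInstaller archive cookie occurs in the raw file bytes."""
    return PYINSTALLER_COOKIE in data


def is_pyinstaller_exe(path) -> bool:
    """Read the whole file and apply the heuristic above."""
    return looks_like_pyinstaller(Path(path).read_bytes())
```

A hit is only a hint: any file that happens to contain these bytes matches, and a modified bootloader may not.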


Unpacking the exe

pyinstxtractor.py

Repository: https://github.com/countercept/python-exe-unpacker

Clone the repository locally

git clone https://github.com/countercept/python-exe-unpacker

Run the pyinstxtractor.py script with Python to unpack the target exe. The output also shows the Python version the program was packaged with.

python pyinstxtractor.py <exe file>


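When extraction finishes, a directory named xxx.exe_extracted appears next to the exe. Inside it is a file with the same name as the exe program. In my run this file came out without a .pyc suffix (the Python version I ran the extractor with differed from the one the program was built with), so the suffix has to be added by hand, for example by renaming the file to kuang.pyc.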
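
The rename can also be scripted. A minimal sketch using pathlib; the function name is made up for illustration:

```python
from pathlib import Path


def ensure_pyc_suffix(path) -> Path:
    """Rename `path` to `<path>.pyc` unless it already ends in .pyc; return the new path."""
    p = Path(path)
    if p.suffix == ".pyc":
        return p
    target = p.with_name(p.name + ".pyc")
    p.rename(target)
    return target
```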

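PyInstaller strips the magic number from the file header when packaging, so the pyc files extracted at this point lack it. This causes problems later when the pyc file is decompiled to a py file, so the missing part has to be added back by hand in a later step.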
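
To see whether an extracted file still has its magic number, the first four bytes can be inspected. The sketch below is a heuristic under one assumption: a pyc header begins with a 2-byte number followed by the bytes \r\n. The function name is illustrative.

```python
import importlib.util


def has_pyc_magic(data: bytes) -> bool:
    """Heuristic: a pyc header starts with a 2-byte number followed by b'\\r\\n'."""
    return len(data) >= 4 and data[2:4] == b"\r\n"


# Magic number of the interpreter running this script, e.g. 550d0d0a for Python 3.8
print(importlib.util.MAGIC_NUMBER.hex())
```

Passing the first bytes of the extracted file, for example has_pyc_magic(open("kuang.pyc", "rb").read(4)), returns False when the header is missing.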

Decompiling the pyc

Uncompyle6

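With the bytecode pyc file from the previous step in hand, the next step is to decompile it into a py source file. uncompyle6 is used for this.

It has to be installed first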

pip install uncompyle6

Once installed, there are two ways to decompile a pyc file. Make sure the binary has the .pyc extension; otherwise uncompyle6 reports an error.

uncompyle6 -o result.py target.pyc
uncompyle6 target.pyc > result.py

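After decompiling, open the .py file. If it contains a warning such as Unknown magic number 227 in target.pyc, the magic number was stripped from the header when the pyc file was produced and has to be added back. (227 is 0xE3, which is typically the first byte of the marshalled code object, so the file starts with the code object itself rather than a header.)

The magic numbers defined so far are listed below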

Python 1.5: 20121
Python 1.5.1: 20121
Python 1.5.2: 20121
Python 1.6: 50428
Python 2.0: 50823
Python 2.0.1: 50823
Python 2.1: 60202
Python 2.1.1: 60202
Python 2.1.2: 60202
Python 2.2: 60717
Python 2.3a0: 62011
Python 2.3a0: 62021
Python 2.3a0: 62011 (!)
Python 2.4a0: 62041
Python 2.4a3: 62051
Python 2.4b1: 62061
Python 2.5a0: 62071
Python 2.5a0: 62081 (ast-branch)
Python 2.5a0: 62091 (with)
Python 2.5a0: 62092 (changed WITH_CLEANUP opcode)
Python 2.5b3: 62101 (fix wrong code: for x, in …)
Python 2.5b3: 62111 (fix wrong code: x += yield)
Python 2.5c1: 62121 (fix wrong lnotab with for loops and storing constants that should have been removed)
Python 2.5c2: 62131 (fix wrong code: for x, in … in listcomp/genexp)
Python 2.6a0: 62151 (peephole optimizations and STORE_MAP opcode)
Python 2.6a1: 62161 (WITH_CLEANUP optimization)
Python 2.7a0: 62171 (optimize list comprehensions/change LIST_APPEND)
Python 2.7a0: 62181 (optimize conditional branches: introduce POP_JUMP_IF_FALSE and POP_JUMP_IF_TRUE)
Python 2.7a0: 62191 (introduce SETUP_WITH)
Python 2.7a0: 62201 (introduce BUILD_SET)
Python 2.7a0: 62211 (introduce MAP_ADD and SET_ADD)
Python 3000: 3000
    3010 (removed UNARY_CONVERT)
    3020 (added BUILD_SET)
    3030 (added keyword-only parameters)
    3040 (added signature annotations)
    3050 (print becomes a function)
    3060 (PEP 3115 metaclass syntax)
    3061 (string literals become unicode)
    3071 (PEP 3109 raise changes)
    3081 (PEP 3137 make file and name unicode)
    3091 (kill str8 interning)
    3101 (merge from 2.6a0, see 62151)
    3103 (file points to source file)
Python 3.0a4: 3111 (WITH_CLEANUP optimization).
Python 3.0b1: 3131 (lexical exception stacking, including POP_EXCEPT #3021)
Python 3.1a1: 3141 (optimize list, set and dict comprehensions: change LIST_APPEND and SET_ADD, add MAP_ADD #2183)
Python 3.1a1: 3151 (optimize conditional branches: introduce POP_JUMP_IF_FALSE and POP_JUMP_IF_TRUE #4715)
Python 3.2a1: 3160 (add SETUP_WITH #6101)
tag: cpython-32
Python 3.2a2: 3170 (add DUP_TOP_TWO, remove DUP_TOPX and ROT_FOUR #9225)
tag: cpython-32
Python 3.2a3: 3180 (add DELETE_DEREF #4617)
Python 3.3a1: 3190 (class super closure changed)
Python 3.3a1: 3200 (PEP 3155 qualname added #13448)
Python 3.3a1: 3210 (added size modulo 2**32 to the pyc header #13645)
Python 3.3a2: 3220 (changed PEP 380 implementation #14230)
Python 3.3a4: 3230 (revert changes to implicit class closure #14857)
Python 3.4a1: 3250 (evaluate positional default arguments before keyword-only defaults #16967)
Python 3.4a1: 3260 (add LOAD_CLASSDEREF; allow locals of class to override free vars #17853)
Python 3.4a1: 3270 (various tweaks to the class closure #12370)
Python 3.4a1: 3280 (remove implicit class argument)
Python 3.4a4: 3290 (changes to qualname computation #19301)
Python 3.4a4: 3300 (more changes to qualname computation #19301)
Python 3.4rc2: 3310 (alter qualname computation #20625)
Python 3.5a1: 3320 (PEP 465: Matrix multiplication operator #21176)
Python 3.5b1: 3330 (PEP 448: Additional Unpacking Generalizations #2292)
Python 3.5b2: 3340 (fix dictionary display evaluation order #11205)
Python 3.5b3: 3350 (add GET_YIELD_FROM_ITER opcode #24400)
Python 3.5.2: 3351 (fix BUILD_MAP_UNPACK_WITH_CALL opcode #27286)
Python 3.6a0: 3360 (add FORMAT_VALUE opcode #25483)
Python 3.6a1: 3361 (lineno delta of code.co_lnotab becomes signed #26107)
Python 3.6a2: 3370 (16 bit wordcode #26647)
Python 3.6a2: 3371 (add BUILD_CONST_KEY_MAP opcode #27140)
Python 3.6a2: 3372 (MAKE_FUNCTION simplification, remove MAKE_CLOSURE #27095)
Python 3.6b1: 3373 (add BUILD_STRING opcode #27078)
Python 3.6b1: 3375 (add SETUP_ANNOTATIONS and STORE_ANNOTATION opcodes #27985)
Python 3.6b1: 3376 (simplify CALL_FUNCTIONs & BUILD_MAP_UNPACK_WITH_CALL #27213)
Python 3.6b1: 3377 (set __class__ cell from type.__new__ #23722)
Python 3.6b2: 3378 (add BUILD_TUPLE_UNPACK_WITH_CALL #28257)
Python 3.6rc1: 3379 (more thorough class validation #23722)
Python 3.7a1: 3390 (add LOAD_METHOD and CALL_METHOD opcodes #26110)
Python 3.7a2: 3391 (update GET_AITER #31709)
Python 3.7a4: 3392 (PEP 552: Deterministic pycs #31650)
Python 3.7b1: 3393 (remove STORE_ANNOTATION opcode #32550)
Python 3.7b5: 3394 (restored docstring as the first stmt in the body; this might affected the first line number #32911)
Python 3.8a1: 3400 (move frame block handling to compiler #17611)
Python 3.8a1: 3401 (add END_ASYNC_FOR #33041)
Python 3.8a1: 3410 (PEP570 Python Positional-Only Parameters #36540)
Python 3.8b2: 3411 (Reverse evaluation order of key: value in dict comprehensions #35224)
Python 3.8b2: 3412 (Swap the position of positional args and positional only args in ast.arguments #37593)
Python 3.8b4: 3413 (Fix “break” and “continue” in “finally” #37830)
Python 3.9a0: 3420 (add LOAD_ASSERTION_ERROR #34880)
Python 3.9a0: 3421 (simplified bytecode for with blocks #32949)
Python 3.9a0: 3422 (remove BEGIN_FINALLY, END_FINALLY, CALL_FINALLY, POP_FINALLY bytecodes #33387)
Python 3.9a2: 3423 (add IS_OP, CONTAINS_OP and JUMP_IF_NOT_EXC_MATCH bytecodes #39156)
Python 3.9a2: 3424 (simplify bytecodes for *value unpacking)
Python 3.9a2: 3425 (simplify bytecodes for **value unpacking)
Python 3.10a1: 3430 (Make ‘annotations’ future by default)
Python 3.10a1: 3431 (New line number table format – PEP 626)
Python 3.10a2: 3432 (Function annotation for MAKE_FUNCTION is changed from dict to tuple bpo-42202)
Python 3.10a2: 3433 (RERAISE restores f_lasti if oparg != 0)
Python 3.10a6: 3434 (PEP 634: Structural Pattern Matching)
Python 3.10a7: 3435 Use instruction offsets (as opposed to byte offsets).
Python 3.10b1: 3436 (Add GEN_START bytecode #43683)
Python 3.10b1: 3437 (Undo making ‘annotations’ future by default - We like to dance among core devs!)
Python 3.10b1: 3438 Safer line number table handling.
Python 3.10b1: 3439 (Add ROT_N)
Python 3.11a1: 3450 Use exception table for unwinding (“zero cost” exception handling)
Python 3.11a1: 3451 (Add CALL_METHOD_KW)
Python 3.11a1: 3452 (drop nlocals from marshaled code objects)
Python 3.11a1: 3453 (add co_fastlocalnames and co_fastlocalkinds)
Python 3.11a1: 3454 (compute cell offsets relative to locals bpo-43693)
Python 3.11a1: 3455 (add MAKE_CELL bpo-43693)
Python 3.11a1: 3456 (interleave cell args bpo-43693)
Python 3.11a1: 3457 (Change localsplus to a bytes object bpo-43693)
Python 3.11a1: 3458 (imported objects now don’t use LOAD_METHOD/CALL_METHOD)
Python 3.11a1: 3459 (PEP 657: add end line numbers and column offsets for instructions)
Python 3.11a1: 3460 (Add co_qualname field to PyCodeObject bpo-44530)
Python 3.11a1: 3461 (JUMP_ABSOLUTE must jump backwards)
Python 3.11a2: 3462 (bpo-44511: remove COPY_DICT_WITHOUT_KEYS, change MATCH_CLASS and MATCH_KEYS, and add COPY)
Python 3.11a3: 3463 (bpo-45711: JUMP_IF_NOT_EXC_MATCH no longer pops the active exception)
Python 3.11a3: 3464 (bpo-45636: Merge numeric BINARY_/INPLACE_* into BINARY_OP)
Python 3.11a3: 3465 (Add COPY_FREE_VARS opcode)
Python 3.11a4: 3466 (bpo-45292: PEP-654 except*)
Python 3.11a4: 3467 (Change CALL_xxx opcodes)
Python 3.11a4: 3468 (Add SEND opcode)
Python 3.11a4: 3469 (bpo-45711: remove type, traceback from exc_info)
Python 3.11a4: 3470 (bpo-46221: PREP_RERAISE_STAR no longer pushes lasti)
Python 3.11a4: 3471 (bpo-46202: remove pop POP_EXCEPT_AND_RERAISE)
Python 3.11a4: 3472 (bpo-46009: replace GEN_START with POP_TOP)
Python 3.11a4: 3473 (Add POP_JUMP_IF_NOT_NONE/POP_JUMP_IF_NONE opcodes)
Python 3.11a4: 3474 (Add RESUME opcode)
Python 3.11a5: 3475 (Add RETURN_GENERATOR opcode)
Python 3.11a5: 3476 (Add ASYNC_GEN_WRAP opcode)
Python 3.11a5: 3477 (Replace DUP_TOP/DUP_TOP_TWO with COPY and ROT_TWO/ROT_THREE/ROT_FOUR/ROT_N with SWAP)
Python 3.11a5: 3478 (New CALL opcodes)
Python 3.11a5: 3479 (Add PUSH_NULL opcode)
Python 3.11a5: 3480 (New CALL opcodes, second iteration)
Python 3.11a5: 3481 (Use inline cache for BINARY_OP)
Python 3.11a5: 3482 (Use inline caching for UNPACK_SEQUENCE and LOAD_GLOBAL)
Python 3.11a5: 3483 (Use inline caching for COMPARE_OP and BINARY_SUBSCR)
Python 3.11a5: 3484 (Use inline caching for LOAD_ATTR, LOAD_METHOD, and STORE_ATTR)
Python 3.11a5: 3485 (Add an oparg to GET_AWAITABLE)
Python 3.11a6: 3486 (Use inline caching for PRECALL and CALL)
Python 3.11a6: 3487 (Remove the adaptive “oparg counter” mechanism)
Python 3.11a6: 3488 (LOAD_GLOBAL can push additional NULL)
Python 3.11a6: 3489 (Add JUMP_BACKWARD, remove JUMP_ABSOLUTE)
Python 3.11a6: 3490 (remove JUMP_IF_NOT_EXC_MATCH, add CHECK_EXC_MATCH)
Python 3.11a6: 3491 (remove JUMP_IF_NOT_EG_MATCH, add CHECK_EG_MATCH, add JUMP_BACKWARD_NO_INTERRUPT, make JUMP_NO_INTERRUPT virtual)
Python 3.11a7: 3492 (make POP_JUMP_IF_NONE/NOT_NONE/TRUE/FALSE relative)
Python 3.11a7: 3493 (Make JUMP_IF_TRUE_OR_POP/JUMP_IF_FALSE_OR_POP relative)
Python 3.11a7: 3494 (New location info table)
Python 3.11b4: 3495 (Set line number of module’s RESUME instr to 0 per PEP 626)
Python 3.12a1: 3500 (Remove PRECALL opcode)
Python 3.12a1: 3501 (YIELD_VALUE oparg == stack_depth)
Python 3.12a1: 3502 (LOAD_FAST_CHECK, no NULL-check in LOAD_FAST)
Python 3.12a1: 3503 (Shrink LOAD_METHOD cache)
Python 3.12a1: 3504 (Merge LOAD_METHOD back into LOAD_ATTR)
Python 3.12a1: 3505 (Specialization/Cache for FOR_ITER)
Python 3.12a1: 3506 (Add BINARY_SLICE and STORE_SLICE instructions)
Python 3.12a1: 3507 (Set lineno of module’s RESUME to 0)
Python 3.12a1: 3508 (Add CLEANUP_THROW)
Python 3.12a1: 3509 (Conditional jumps only jump forward)
Python 3.12a2: 3510 (FOR_ITER leaves iterator on the stack)
Python 3.12a2: 3511 (Add STOPITERATION_ERROR instruction)
Python 3.12a2: 3512 (Remove all unused consts from code objects)
Python 3.12a4: 3513 (Add CALL_INTRINSIC_1 instruction, removed STOPITERATION_ERROR, PRINT_EXPR, IMPORT_STAR)
Python 3.12a4: 3514 (Remove ASYNC_GEN_WRAP, LIST_TO_TUPLE, and UNARY_POSITIVE)
Python 3.12a5: 3515 (Embed jump mask in COMPARE_OP oparg)
Python 3.12a5: 3516 (Add COMPARE_AND_BRANCH instruction)
Python 3.12a5: 3517 (Change YIELD_VALUE oparg to exception block depth)
Python 3.12a6: 3518 (Add RETURN_CONST instruction)
Python 3.12a6: 3519 (Modify SEND instruction)
Python 3.12a6: 3520 (Remove PREP_RERAISE_STAR, add CALL_INTRINSIC_2)
Python 3.12a7: 3521 (Shrink the LOAD_GLOBAL caches)
Python 3.12a7: 3522 (Removed JUMP_IF_FALSE_OR_POP/JUMP_IF_TRUE_OR_POP)
Python 3.12a7: 3523 (Convert COMPARE_AND_BRANCH back to COMPARE_OP)
Python 3.12a7: 3524 (Shrink the BINARY_SUBSCR caches)
Python 3.12b1: 3525 (Shrink the CALL caches)
Python 3.12b1: 3526 (Add instrumentation support)
Python 3.12b1: 3527 (Add LOAD_SUPER_ATTR)
Python 3.12b1: 3528 (Add LOAD_SUPER_ATTR_METHOD specialization)
Python 3.12b1: 3529 (Inline list/dict/set comprehensions)
Python 3.12b1: 3530 (Shrink the LOAD_SUPER_ATTR caches)
Python 3.12b1: 3531 (Add PEP 695 changes)
Python 3.13: will start with 3550
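
For a pyc file that still has its header, the table can be applied in code. This is a sketch under one assumption: the last value listed for each minor version is the one its final releases use (3.5 appears twice because 3.5.2 changed the number). The dictionary covers only 3.3 to 3.12, and the names are illustrative.

```python
# Final-release magic numbers copied from the table above
RELEASE_MAGIC = {
    3230: "3.3", 3310: "3.4", 3350: "3.5", 3351: "3.5", 3379: "3.6",
    3394: "3.7", 3413: "3.8", 3425: "3.9", 3439: "3.10",
    3495: "3.11", 3531: "3.12",
}


def magic_to_int(header: bytes) -> int:
    """Decode the first two bytes of a pyc header as a little-endian integer."""
    return int.from_bytes(header[:2], "little")


def python_version_for(header: bytes) -> str:
    """Map a pyc header to a Python minor version, or 'unknown'."""
    return RELEASE_MAGIC.get(magic_to_int(header), "unknown")
```

For a file whose header was stripped this lookup returns "unknown", and the version reported by pyinstxtractor.py has to be used instead.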

Note that uncompyle6 officially supports bytecode only up to Python 3.8. For newer versions such as 3.10, decompilation fails with Unsupported Python version, 3.10.0, for decompilation


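Look up the Python version of the file being decompiled in the magic number table above, replace the PYTHON_NUMBER value in the code below with the matching number, and run it to get the magic number in hexadecimal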

PYTHON_NUMBER = 3413  # value from the table above (3413 is Python 3.8)
# On disk the magic is the 16-bit number in little-endian order followed by b'\r\n'
MAGIC_NUMBER = PYTHON_NUMBER.to_bytes(2, 'little') + b'\r\n'
# Reading the 4 bytes back as one little-endian integer reverses the byte order
_RAW_MAGIC_NUMBER = int.from_bytes(MAGIC_NUMBER, 'little')
magic_number_hex = format(_RAW_MAGIC_NUMBER, '08X')  # zero-padded uppercase hex
print(magic_number_hex)            # 0A0D0D55 (integer form, bytes reversed)
print(MAGIC_NUMBER.hex().upper())  # 550D0D0A (byte order as stored in the file)


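Then insert the magic number at the start of the pyc file according to these rules:

  • Python 3.7 and later: besides the 4-byte magic number, the header has a 4-byte flags field (normally zero) and 8 bytes of timestamp plus source size. These do not affect decompilation and can all be zero, so add 12 zero bytes after the magic number
  • Python 3.3 through 3.6: only the magic number and 8 bytes of timestamp plus source size
  • Python below 3.3: only the magic number and a 4-byte timestamp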
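
The rules above can be sketched as a small helper that builds a zero-filled header. The function name and the (major, minor) tuple argument are illustrative choices, not part of any tool used in this article.

```python
def pyc_header(magic_number: int, version: tuple) -> bytes:
    """Build a zero-filled pyc header for `version`, e.g. (3, 8)."""
    magic = magic_number.to_bytes(2, "little") + b"\r\n"
    if version >= (3, 7):
        padding = 12   # 4-byte flags + 4-byte timestamp + 4-byte source size
    elif version >= (3, 3):
        padding = 8    # 4-byte timestamp + 4-byte source size
    else:
        padding = 4    # 4-byte timestamp only
    return magic + bytes(padding)
```

For example, pyc_header(3413, (3, 8)) gives the 16 bytes needed for a Python 3.8 file.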

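Open the pyc file to be modified in 010 Editor, choose Insert Bytes from the Edit menu, and insert the number of bytes given by the rules above. Python 3.8, for example, needs 16 bytes (decimal).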


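One thing to watch when inserting: the integer form printed by the code above is 0A0D0D55, and in my repeated tests inserting that value directly does not work. It is the hex form of a little-endian integer, so its bytes are in the reverse of the order stored in the file. Reverse the byte order to get 550D0D0A, then append 12 zero bytes, giving 550D0D0A000000000000000000000000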
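
The same patch can be applied without a hex editor. A minimal sketch, assuming a Python 3.8 target and reusing the example file name kuang.pyc from earlier; kuang_fixed.pyc and the function name are made up for illustration.

```python
from pathlib import Path


def prepend_pyc_header(src, dst, header: bytes) -> None:
    """Write `header` followed by the contents of `src` to `dst`."""
    body = Path(src).read_bytes()
    if body[2:4] == b"\r\n":
        # Already looks like it has a magic number; copy unchanged
        Path(dst).write_bytes(body)
        return
    Path(dst).write_bytes(header + body)


if __name__ == "__main__":
    # Python 3.8 header: magic 55 0D 0D 0A plus 12 zero bytes
    header = bytes.fromhex("550D0D0A") + bytes(12)
    # "kuang.pyc" is the example file name used earlier; adjust as needed
    if Path("kuang.pyc").exists():
        prepend_pyc_header("kuang.pyc", "kuang_fixed.pyc", header)
```

Writing to a new file keeps the original extraction intact in case the chosen magic number turns out to be wrong.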


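Decompile the pyc file with uncompyle6 again. If there is no error and the program's source code is printed, decompilation succeeded and the magic number is correct. If decompilation fails, the magic number that was inserted is wrong.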
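
As an extra sanity check that does not depend on uncompyle6, the patched file can be loaded with the standard marshal module. This sketch makes two assumptions: the interpreter running it is the same minor version that produced the pyc (marshal data is not portable across versions), and the header is 16 bytes, as for Python 3.7 and later. The function names are illustrative.

```python
import dis
import marshal


def load_code_object(pyc_path, header_size=16):
    """Skip the header and unmarshal the code object stored in a pyc file.

    Assumption: the running interpreter matches the version that produced
    the pyc; otherwise marshal.loads may fail or return garbage.
    """
    with open(pyc_path, "rb") as f:
        f.seek(header_size)
        return marshal.loads(f.read())


def show_bytecode(pyc_path):
    """Print a disassembly of the module-level code object."""
    dis.dis(load_code_object(pyc_path))
```

If load_code_object raises an error, the body of the file is either damaged or belongs to a different Python version than the one running the check.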


Once decompilation is confirmed to work, save the output to a py file and the job is done

uncompyle6 xxx.pyc > xxx.py
