Deobfuscation

tiantian1980

Tags: .net, variables, integer, decompiler tools, transition, encoding

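The first thing I tried was viewing the page source, which turned up nothing. All of the pages ended in .aspx, so I went looking online for background. It turned out the site was built on .NET and written in C#, with most of the code compiled into a single DLL. After some digging I found that main DLL in the site's BIN directory, so this DLL was the target to crack.
I first checked it for a packer with PEiD, which reported "Microsoft Visual C# / Basic .NET". No packer, apparently, and I was quietly pleased (and from there fell into the .NET abyss). Loading it into W32DASM for analysis got me nowhere. Loading it into OD (OllyDbg) got me nowhere either. I went to a friend's server and used OD to trace w3wp.exe, the process that loads the DLL, until my head spun, still with no result. In frustration I spent two days cramming .NET, and came away understanding its basic execution model and what MSIL is (the assembly language of .NET).
With some .NET background in hand, I collected .NET reverse-engineering tools such as Reflector, C#, and XenoFox to study. In the end, Reflector with the File add-in decompiled the site's DLL into source code. I had expected .NET to be hard to deal with, yet a single tool produced code very close to the original. On closer inspection, though, nearly every class and method name in the output was a long string like '49691e44a7a6b9b4' that breaks the naming rules. I created a new project in VS7 and tried to compile it, and sure enough got more than 5000 errors. Looking further, every string literal was written in the form \u3e4b. I read that this is the Java-style Unicode escape. I found a format converter, but the converted text was all gibberish. Very frustrating.
Studying the code more carefully, I noticed that every \u string was passed, together with an integer, as arguments to one method of one class. I located that method's source, but it contained only a single line, "//Decomplile err": a decompilation error. Several other similar tools gave the same result. That is when I thought of IL. I used the disassembler ildasm to dump the site's DLL to an IL file, found the IL of the method used for the string encryption, and worked through the algorithm with the IL instruction table (attached at the end):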

public static string cc381ffa3ede662f(string e4115acdf4fbfccc, int 211566702b710682)
{
    locals:
        V0: char[]
        V1: int

    ldarg.0    // push argument 0 (the string)
    br.s       label_2

label_3:
    ldloc.0  // push V0 (the char array)
    ldloc.1  // push V1 (the index)
    ldloc.0  // push V0
    ldloc.1  // push V1
    ldelem.u2  // load the unsigned int16 element at index V1 onto the stack as an int32
    ldarg.1  // push argument 1
    sub
    conv.u2  // convert the top of the stack to unsigned int16, then extend it to int32
    stelem.i2  // replace the array element at the given index with the int16 value on the stack
    ldarg.1    // push argument 1
    ldc.i4     1789   // push the int32 constant 1789
    add    // add two values, returning a new value
    starg.s    1  // store back into argument 1: argument 1 += 1789
    ldloc.1    // push V1
    ldc.i4.1  // push the int32 value 1
    add
    stloc.1  // pop into V1: V1++
label_4:
    ldloc.1  // push V1
    ldloc.0  // push V0
    ldlen   // push the number of elements of the zero-based, one-dimensional array
    conv.i4  // convert the top of the stack to int32
    blt.s       label_3   // loop test: while V1 is less than the array length, branch to label_3
    ldloc.0  // push V0 (the local, not argument 0)
    newobj     String..ctor(char[])  // create a new object
    ret

label_2:
    callvirt   char[] String.ToCharArray()    // call
    stloc.0  // pop into V0
    ldc.i4.0  // push 0
    stloc.1  // pop into V1; the string is now in a local and the int counter is 0
    br.s       label_4
}
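
For readers who prefer C# to IL, the loop above can be read back roughly as follows. This is a minimal sketch, not the original source: the class name `StringDecoderSketch` and its method names are made up for illustration, `Unescape` is an assumed helper for turning the decompiler's \uXXXX escapes back into characters, and `Encode` is simply the assumed inverse, included so a round trip can be checked.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringDecoderSketch
{
    // Assumed helper: turns literal \uXXXX escapes (as printed by the
    // decompiler) into the characters they stand for.
    public static string Unescape(string text)
    {
        return Regex.Replace(text, @"\\u([0-9a-fA-F]{4})",
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }

    // Mirrors the IL loop: subtract the running key from each char,
    // truncate to 16 bits (conv.u2), then advance the key by 1789.
    public static string Decode(string encoded, int key)
    {
        char[] chars = encoded.ToCharArray();
        unchecked // IL sub/add/conv.u2 wrap silently, so do the same here
        {
            for (int i = 0; i < chars.Length; i++)
            {
                chars[i] = (char)(chars[i] - key);
                key += 1789;
            }
        }
        return new string(chars);
    }

    // Assumed inverse of Decode (add instead of subtract); hypothetical,
    // only here so the sketch can be tested against itself.
    public static string Encode(string plain, int key)
    {
        char[] chars = plain.ToCharArray();
        unchecked
        {
            for (int i = 0; i < chars.Length; i++)
            {
                chars[i] = (char)(chars[i] + key);
                key += 1789;
            }
        }
        return new string(chars);
    }
}
```

Calling `StringDecoderSketch.Decode(StringDecoderSketch.Unescape(text), key)` with the string and integer found at a call site would then be the C# counterpart of one decryption step, assuming the escapes were copied out of the decompiled source exactly.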

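With the algorithm clear, I wrote a small decryption tool in VC that mimics the decryption routine, and started decrypting strings one line at a time. The job was enormous, though: several thousand strings in the DLL were encrypted, and doing them line by line would have worn me out. Then a question came to mind: how did the author encrypt them in the first place? I went back online. This time I learned something about .NET encryption techniques and about the concept of obfuscation. Material on deobfuscation was scarce, though; all I found was an MSDN article on using Reflection to call a method that already exists in an assembly. Imitating the code decompiled from the DLL, I tried my hand at C# and, after several days, had a string deobfuscator that works on IL files (attached at the end).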
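
The Reflection idea mentioned above could look something like the following. It is only a sketch of the general technique (find a static `(string, int)` method in an assembly and let it do the decoding), not the deobfuscator attached to the post, which works on IL files. `ReflectionCallSketch`, `InvokeStringDecoder`, and the stand-in `Sample` method are hypothetical names.

```csharp
using System;
using System.Reflection;

public static class ReflectionCallSketch
{
    // Hypothetical stand-in for an obfuscated decoder, so the sketch runs
    // without any external DLL.
    public static string Sample(string text, int key)
    {
        return text + ":" + key;
    }

    // Looks up a static method taking (string, int) on the named type and
    // invokes it, returning whatever string it produces.
    public static string InvokeStringDecoder(
        Assembly assembly, string typeName, string methodName, string text, int key)
    {
        Type type = assembly.GetType(typeName, true); // throws if the type is missing
        MethodInfo method = type.GetMethod(
            methodName,
            BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Static,
            null,
            new[] { typeof(string), typeof(int) },
            null);
        if (method == null)
            throw new MissingMethodException(typeName, methodName);
        return (string)method.Invoke(null, new object[] { text, key });
    }
}
```

For a real target, the `Assembly` argument would typically come from `Assembly.LoadFrom` with the path of the DLL, and the type and method names would be the obfuscated ones seen in the decompiler output.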

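Once the strings in the program were plaintext, the function that generates the pop-up code was easy to find. In the IL I changed the first line of the function body to ret, which disables the function.
With the code readable, working out the registration-code algorithm was easy. The registration code for this site system is an encrypted Base64 string. Decrypted, it looks like "www.sohu.com  A  2006-12-2 0:00:00  2008-1-1 2:28:24  A": one string split by four tab characters into five parts. The first part is the site's domain, the two A's are the user level, and the dates give the registration period.
These nearly two weeks of study got me through the door of .NET. C#'s ease of learning and use, its flexibility, and its very high development productivity amazed me! According to some authorities on CSDN, Microsoft will before long phase out the flat win32api programming interface entirely and replace it with the .NET class library, a fully three-dimensional programming platform. (Will Windows go from supporting .NET through a virtual machine to supporting it natively?) Add .NET's language independence (the common runtime has all but unified programming languages) and platform independence (Windows Mobile is out too, and Java is bound to fall sooner or later), and the Windows camp keeps getting stronger. We have to keep pace with the times and embrace new things. I would also suggest that Kanxue (看雪) open a dedicated .NET reversing/cracking board, so that more people can join the .NET ranks.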
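
The five-part layout of the decrypted text can be illustrated with a small parser. This is a sketch under assumptions: the field names (`Domain`, `LevelA`, `ValidFrom`, `ValidTo`, `LevelB`) and the date pattern are guesses from the single sample string above, and the decryption step itself is not shown because the post does not describe it.

```csharp
using System;
using System.Globalization;

public sealed class LicenseRecord
{
    // Field names are hypothetical; the post only says "domain, two user
    // levels, and a date range".
    public string Domain;
    public string LevelA;
    public DateTime ValidFrom;
    public DateTime ValidTo;
    public string LevelB;

    // Splits an already-decrypted string on tabs into its five parts.
    public static LicenseRecord Parse(string decrypted)
    {
        string[] parts = decrypted.Split('\t');
        if (parts.Length != 5)
            throw new FormatException("expected 5 tab-separated fields");

        // Pattern inferred from "2006-12-2 0:00:00" (an assumption).
        const string pattern = "yyyy-M-d H:mm:ss";
        return new LicenseRecord
        {
            Domain = parts[0].Trim(),
            LevelA = parts[1].Trim(),
            ValidFrom = DateTime.ParseExact(parts[2].Trim(), pattern, CultureInfo.InvariantCulture),
            ValidTo = DateTime.ParseExact(parts[3].Trim(), pattern, CultureInfo.InvariantCulture),
            LevelB = parts[4].Trim()
        };
    }
}
```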

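A few commonly used .NET tools are attached (click to download):
ILASM and ILDASM (the basic .NET assembler and disassembler)
reflector (the most widely used .NET reversing tool; I find it only so-so)
Xenocode Fox 2007, cracked version (much more capable than Reflector, but it requires .NET 2.0 and can only generate VS2005 project files)
spices (its decompiled code can display Chinese directly; sadly I could not find a cracked version. If anyone has one, please send me a copy, thanks in advance)
Dis# (the most powerful tool of its kind, with a particularly impressive deobfuscation feature; sadly I could not find a cracked version of it either)
IL file string deobfuscation tool (targets XenoCode's string obfuscation; it is my first .NET program, so please go easy on the mockery and heavy on the advice)
MSIL instruction reference table (I forgot it when I posted; added now)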

********************************************************************************************

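From Simple to Deep: .NET Obfuscation Principles (Part 1)

I have been extremely busy lately and have had no time to update my blog, which I feel bad about. My day-to-day work is probably not a topic most readers care about, and on top of that framesniper pulled me into the Inside IL and CLR team, so however busy I claim to be, I still need to contribute something. With MaxtoCode 2.0 nearly ready for release, this is a good moment to take a few days and write some articles on this area.

One thing first: whatever exists has a reason to exist. Some friends say, "Your code has no value and there is no need to obfuscate it; I promise I will always be open source." I have heard this objection more than once. I don't think one case can stand for all: open source is a joyful thing, but obfuscators and protectors exist because there is demand for them. So that debate is pointless, and I won't argue it.

Heh, I digress...

Back to the topic. Many people may already know what obfuscation is and how it works, but I suspect more do not. Only by knowing how others apply obfuscation, and how they reverse it, can we better protect our own intellectual property.

I plan to cover .NET obfuscation principles in the following parts: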

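1. IL basics: what IL is

2. The simplest obfuscation

3. What control-flow obfuscation is, and its pros and cons

4. Deobfuscation in practice (principles plus tools)

5. Basic principles of MaxtoCode, a new-generation .NET code protection and encryption tool

6. Other protection measures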

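Today we start with the basics: IL in .NET.

As you probably know, whether you use C#, VB.NET, or managed C++, what comes out of the compiler is an assembly of IL.

So what is IL? It is an intermediate-language bytecode that sits between high-level languages and machine code. Its purpose is to establish a "unified" .NET runtime environment so that .NET can be cross-platform (in practice, though, MS will not allow .NET to go cross-platform, at least not for three years and perhaps longer. Cross-platform is not all that great anyway: look at Java, billed as "compile once, run anywhere", which turned into "compile once, debug everywhere"! I have not seen many good tools written in Java on Windows, though perhaps that is because I don't follow it closely!).

Sorry, off topic again; my mind keeps wandering lately, which is not a good sign. IL looks very much like assembly language. The difference is that IL is easier to read than assembly, because it can call already-encapsulated objects directly, and its execution logic follows that of the high-level language, so the two read much alike.

A systematic introduction to IL is out of scope here, so let us look at a very simple piece of C# code (start simple, and move on to the complex once it is familiar). The C# source of the method, Level3, is listed at the end of this article.

The code itself is meaningless; I wrote it this way only to add computation for stress testing. Let us see what it looks like once compiled to IL. Compared against the C# source it is fairly clear. Some readers may already be confused by ldarg, starg, ldloc, and stloc, but they become familiar quickly. They are much clearer than mov, and the variables that follow them are much clearer than registers such as EAX.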

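.method public hidebysig instance int32 Level3(int32 a) cil managed
// .method marks a method region; the body is what is inside { }
// public hidebysig instance are the method's attributes
// int32 is the return type; a VB.NET Sub would show a return type of void here
// Level3 is the method name, same as in the source
// int32 a is the parameter, same as in the source
// cil managed means this is a managed method
// One major feature of .NET is metadata, which carries a lot of program information, so IL ends up looking a lot like C#. As the saying goes, every advantage has its drawback.
{
.maxstack 2 // maximum evaluation stack depth is 2
// this value is computed from how many values the code needs on the stack at once
.locals init ([0] string s, // type declarations of the local variables, clearly visible here
// the three variables plus one compiler-generated slot for the return value are all here
[1] unsigned int8[] b,
[2] class [mscorlib]System.Text.ASCIIEncoding asii,
[3] int32 CS$00000003$00000000)
// the code section follows
IL_0000: ldstr "215dsgfdart42315s" // push the string literal
IL_0005: stloc.0 // pop it into local 0 (s)
IL_0006: newobj instance void [mscorlib]System.Text.ASCIIEncoding::.ctor()
// create a System.Text.ASCIIEncoding object
IL_000b: stloc.2 // pop it into local 2 (asii)
IL_000c: ldloc.2 // push the System.Text.ASCIIEncoding object
IL_000d: ldloc.0 // push the string
IL_000e: callvirt instance unsigned int8[] [mscorlib]System.Text.Encoding::GetBytes(string) // do the conversion
IL_0013: stloc.1 // store the result in the byte[] (b)
IL_0014: ldarg.1 // push a (argument 0 is this)
IL_0015: ldloc.1 // push the byte[]
IL_0016: ldlen // get its length
IL_0017: conv.i4 // convert the length to int32
IL_0018: add // add it to a
IL_0019: starg.s a // store the sum back into a
IL_001b: ldarg.0 // push this
IL_001c: ldarg.1 // push a
IL_001d: call instance int32 Name1.strong::Level4(int32) // call the Level4 method
IL_0022: starg.s a
IL_0024: ldarg.1
IL_0025: stloc.3
IL_0026: br.s IL_0028
// I don't know why this instruction appears here; it is completely unnecessary
IL_0028: ldloc.3
IL_0029: ret // ret ends the method; if a value is on the stack, it becomes the return value
}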
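
The same header information (.maxstack, the locals, and the raw IL bytes) can also be read at run time through Reflection. The sketch below is illustrative only: `MethodBodySketch`, `Describe`, and the cut-down `Sample` method are hypothetical names, and the exact numbers printed depend on the compiler and build settings.

```csharp
using System;
using System.Reflection;

public static class MethodBodySketch
{
    // Hypothetical, simplified stand-in for Level3.
    public static int Sample(int a)
    {
        string s = "215dsgfdart42315s";
        a = a + s.Length;
        return a;
    }

    // Summarises a method's IL header: max stack depth, number of locals,
    // and the IL bytes as hex pairs.
    public static string Describe(MethodInfo method)
    {
        MethodBody body = method.GetMethodBody();
        if (body == null)
            return method.Name + ": no IL body";

        byte[] il = body.GetILAsByteArray();
        return string.Format("{0}: maxstack={1}, locals={2}, il={3}",
            method.Name,
            body.MaxStackSize,
            body.LocalVariables.Count,
            BitConverter.ToString(il));
    }
}
```

Calling `MethodBodySketch.Describe(typeof(MethodBodySketch).GetMethod("Sample"))` returns one line of text; the final byte, 2A, is the opcode for ret.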

So, based on the above

The official descriptions of the four instructions mentioned above follow:

ldarg.<length> - load argument onto the stack

Format	Assembly Format	Description
FE 09 <unsigned int16>	ldarg num	Load argument numbered num onto stack.
0E <unsigned int8>	ldarg.s num	Load argument numbered num onto stack, short form.
02	ldarg.0	Load argument 0 onto stack
03	ldarg.1	Load argument 1 onto stack
04	ldarg.2	Load argument 2 onto stack
05	ldarg.3	Load argument 3 onto stack

Stack Transition:

… → …, value

Description:

The ldarg num instruction pushes the num’th incoming argument, where arguments are numbered 0 onwards (see Partition I) onto the evaluation stack. The ldarg instruction can be used to load a value type or a built-in value onto the stack by copying it from an incoming argument. The type of the value is the same as the type of the argument, as specified by the current method’s signature.

The ldarg.0, ldarg.1, ldarg.2, and ldarg.3 instructions are efficient encodings for loading any of the first 4 arguments. The ldarg.s instruction is an efficient encoding for loading argument numbers 4 through 255.

For procedures that take a variable-length argument list, the ldarg instructions can be used only for the initial fixed arguments, not those in the variable part of the signature. (See the arglist instruction)

Arguments that hold an integer value smaller than 4 bytes long are expanded to type int32 when they are loaded onto the stack. Floating-point values are expanded to their native size (type F).

Exceptions:

None.

Verifiability:

Correct CIL guarantees that num is a valid argument index. See Section 1.5 for more details on how verification determines the type of the value loaded onto the stack.

starg.<length> - store a value in an argument slot

Format	Assembly Format	Description
FE 0B <unsigned int16>	starg num	Store a value to the argument numbered num
10 <unsigned int8>	starg.s num	Store a value to the argument numbered num, short form

Stack Transition:

…, value → …

Description:

The starg num instruction pops a value from the stack and places it in argument slot num (see Partition I). The type of the value must match the type of the argument, as specified in the current method’s signature. The starg.s instruction provides an efficient encoding for use with the first 256 arguments.

For procedures that take a variable argument list, the starg instructions can be used only for the initial fixed arguments, not those in the variable part of the signature.

Storing into arguments that hold an integer value smaller than 4 bytes long truncates the value as it moves from the stack to the argument. Floating-point values are rounded from their native size (type F) to the size associated with the argument.

Exceptions:

None.

Verifiability:

Correct CIL requires that num is a valid argument slot.

Verification also checks that the verification type of value matches the type of the argument, as specified in the current method’s signature (verification types are less detailed than CLI types).

ldloc - load local variable onto the stack

Format	Assembly Format	Description
FE 0C <unsigned int16>	ldloc indx	Load local variable of index indx onto stack.
11 <unsigned int8>	ldloc.s indx	Load local variable of index indx onto stack, short form.
06	ldloc.0	Load local variable 0 onto stack.
07	ldloc.1	Load local variable 1 onto stack.
08	ldloc.2	Load local variable 2 onto stack.
09	ldloc.3	Load local variable 3 onto stack.

Stack Transition:

… → …, value

Description:

The ldloc indx instruction pushes the contents of the local variable number indx onto the evaluation stack, where local variables are numbered 0 onwards. Local variables are initialized to 0 before entering the method only if the initialize flag on the method is true (see Partition I). The ldloc.0, ldloc.1, ldloc.2, and ldloc.3 instructions provide an efficient encoding for accessing the first four local variables. The ldloc.s instruction provides an efficient encoding for accessing local variables 4 through 255.

The type of the value is the same as the type of the local variable, which is specified in the method header. See Partition I.

Local variables that are smaller than 4 bytes long are expanded to type int32 when they are loaded onto the stack. Floating-point values are expanded to their native size (type F).

Exceptions:

VerificationException is thrown if the “zero initialize” bit for this method has not been set, and the assembly containing this method has not been granted SecurityPermission.SkipVerification (and the CIL does not perform automatic definite-assignment analysis).

Verifiability:

Correct CIL ensures that indx is a valid local index. See Section 1.5 for more details on how verification determines the type of a local variable. For the ldloca indx instruction, indx must lie in the range 0 to 65534 inclusive (specifically, 65535 is not valid).

Rationale: The reason for excluding 65535 is pragmatic: likely implementations will use a 2-byte integer to track both a local’s index, as well as the total number of locals for a given method. If an index of 65535 had been made legal, it would require a wider integer to track the number of locals in such a method.

Also, for verifiable code, this instruction must guarantee that it is not loading an uninitialized value – whether that initialization is done explicitly by having set the “zero initialize” bit for the method, or by previous instructions (where the CLI performs definite-assignment analysis)

stloc - pop value from stack to local variable

Format	Assembly Format	Description
FE 0E <unsigned int16>	stloc indx	Pop value from stack into local variable indx.
13 <unsigned int8>	stloc.s indx	Pop value from stack into local variable indx, short form.
0A	stloc.0	Pop value from stack into local variable 0.
0B	stloc.1	Pop value from stack into local variable 1.
0C	stloc.2	Pop value from stack into local variable 2.
0D	stloc.3	Pop value from stack into local variable 3.

Stack Transition:

…, value → …

Description:

The stloc indx instruction pops the top value off the evaluation stack and moves it into local variable number indx (see Partition I), where local variables are numbered 0 onwards. The type of value must match the type of the local variable as specified in the current method’s locals signature. The stloc.0, stloc.1, stloc.2, and stloc.3 instructions provide an efficient encoding for the first four local variables; the stloc.s instruction provides an efficient encoding for local variables 4 through 255.

Storing into locals that hold an integer value smaller than 4 bytes long truncates the value as it moves from the stack to the local variable. Floating-point values are rounded from their native size (type F) to the size associated with the argument.

Exceptions:

None.

Verifiability:

Correct CIL requires that indx is a valid local index. For the stloc indx instruction, indx must lie in the range 0 to 65534 inclusive (specifically, 65535 is not valid)

Verification also checks that the verification type of value matches the type of the local, as specified in the current method’s locals signature.
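
To see the four instructions doing their work, one can emit them by hand with `System.Reflection.Emit` and run the result. This is a small sketch, not part of the official documentation quoted above; `IlDemo` and `BuildAddInPlace` are hypothetical names, and the emitted method simply mirrors the `a = a + length` step from Level3 with the length passed in as a second argument.

```csharp
using System;
using System.Reflection.Emit;

public static class IlDemo
{
    // Builds a method equivalent to:
    //   static int AddInPlace(int a, int len) { int t = a + len; a = t; return a; }
    public static Func<int, int, int> BuildAddInPlace()
    {
        var method = new DynamicMethod(
            "AddInPlace", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.DeclareLocal(typeof(int));        // local 0 (t)

        il.Emit(OpCodes.Ldarg_0);            // push a
        il.Emit(OpCodes.Ldarg_1);            // push len
        il.Emit(OpCodes.Add);                // pop both, push a + len
        il.Emit(OpCodes.Stloc_0);            // pop the sum into local 0
        il.Emit(OpCodes.Ldloc_0);            // push local 0 again
        il.Emit(OpCodes.Starg_S, (byte)0);   // pop it into argument 0 (a)
        il.Emit(OpCodes.Ldarg_0);            // push a as the return value
        il.Emit(OpCodes.Ret);                // return the value on the stack

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }
}
```

`IlDemo.BuildAddInPlace()(3, 4)` returns 7. Because the emitted method is static, argument 0 is `a`; in an instance method such as Level3, argument 0 is `this` and `a` is argument 1.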

All of the official documentation is in D:/Program Files/Microsoft Visual Studio .NET 2003/SDK/v1.1/Tool Developers Guide/docs. Interested readers may want to read through it.

The C# source of Level3:

public int Level3(int a)
{
    string s = "215dsgfdart42315s"; // define a string
    byte[] b;
    System.Text.ASCIIEncoding asii = new System.Text.ASCIIEncoding();
    b = asii.GetBytes(s); // convert the string to a byte[]
    a = a + b.Length; // take the length of the byte[], which is really the length of the string
    a = Level4(a); // call another function
    return a; // return a
}
