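Many readers probably already know what obfuscation is and how it works, but I suspect even more do not. Only by understanding how others apply obfuscation, and how they reverse it, can we do a better job of protecting our own intellectual property.
I plan to discuss how .NET obfuscation works in the following parts:
• IL basics: what is IL?
• The simplest form of obfuscation
• What control-flow obfuscation is, and its pros and cons
• Deobfuscation in practice (principles and tools)
• Basic principles of MaxtoCode, a new-generation .NET code protection and encryption tool
• Other protection techniques
Today we start with the basics: IL in .NET (only the IL instructions relevant to the example; see MSDN for the rest).
As you probably know, whether you write C#, VB.NET, or managed C++, what the compiler ultimately produces is an assembly of IL code.
So what is IL? It is an intermediate-language bytecode that sits between the high-level language and machine code. Its purpose is to provide a "uniform" .NET runtime environment, which is what would let .NET run across platforms. (In practice, though, I do not expect MS to allow .NET to go cross-platform, at least not within 3 years and possibly longer. Cross-platform is not necessarily a great thing anyway: look at Java, billed as "compile once, run anywhere", which turned into "compile once, debug everywhere"! I have not seen many good tools written in Java on Windows, though maybe that is just because I do not follow it closely.)
Sorry, I went off topic again; my mind has been wandering lately, which is not a good sign. IL looks a lot like assembly language. The difference is that IL is easier to read than assembly, because it can call methods on already-encapsulated objects directly and its execution logic follows that of the high-level language, so the two read much alike.
We cannot give a systematic introduction to IL here, so let's look at a very simple piece of C# code instead (start simple, and move on to the complex once you are familiar):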
public int Level3(int a)
{
    string s = "215dsgfdart42315s";   // define a string
    byte[] b;
    System.Text.ASCIIEncoding asii = new System.Text.ASCIIEncoding();
    b = asii.GetBytes(s);             // convert the string to a byte[]
    a = a + b.Length;                 // add the length of the byte[], which for ASCII equals the string length
    a = Level4(a);                    // call another function
    return a;                         // return a
}
This code does nothing meaningful; I wrote it this way only to add some work for a stress test. Let's see what it looks like once compiled to IL.
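.method public hidebysig instance int32 Level3(int32 a) cil managed
// .method marks this region as a method; the region is whatever sits inside {}
// public hidebysig instance are the attributes of this method
// int32 is the method's return type; a VB.NET Sub would show void here
// Level3 is the method name, the same as in the source code
// int32 a is the incoming parameter, the same as in the source code
// cil managed means this is a managed method
// One of .NET's defining features is metadata, which carries a great deal of information about the program, so the IL ends up looking much like the C#. As the old saying goes, every advantage has its drawback.
{
.maxstack 2 // maximum depth of the evaluation stack: 2
// this value is computed from how many values the code needs on the stack at once
.locals init ([0] string s, // type declarations of the local variables; easy to read here
// the three variables plus a compiler-generated temporary (used for the return value) are all here
[1] unsigned int8[] b,
[2] class [mscorlib]System.Text.ASCIIEncoding asii,
[3] int32 CS$00000003$00000000)
// the code section starts here
IL_0000: ldstr "215dsgfdart42315s" // push the string literal
IL_0005: stloc.0 // store into local s; stloc pops the value off the evaluation stack
IL_0006: newobj instance void [mscorlib]System.Text.ASCIIEncoding::.ctor()
// create a System.Text.ASCIIEncoding object
IL_000b: stloc.2 // store into local asii
IL_000c: ldloc.2 // load the System.Text.ASCIIEncoding object; ldloc pushes it onto the evaluation stack
IL_000d: ldloc.0 // load the string
IL_000e: callvirt instance unsigned int8[] [mscorlib]System.Text.Encoding::GetBytes(string) // do the conversion
IL_0013: stloc.1 // store the result into the byte[]
IL_0014: ldarg.1 // load argument a (argument 0 is this in an instance method)
IL_0015: ldloc.1 // load the byte[]
IL_0016: ldlen // get its length
IL_0017: conv.i4 // convert the length to int32
IL_0018: add // add it to a
IL_0019: starg.s a // store the sum back into a
IL_001b: ldarg.0 // load this for the instance call
IL_001c: ldarg.1 // load a as the call argument
IL_001d: call instance int32 Name1.strong::Level4(int32) // call the Level4 method
IL_0022: starg.s a // store the result into a
IL_0024: ldarg.1 // load a
IL_0025: stloc.3 // store it into the return-value temporary
IL_0026: br.s IL_0028
// I do not know why this instruction appears here; it is completely unnecessary (a branch to the very next instruction like this is typical of unoptimized compiler output)
IL_0028: ldloc.3 // load the return-value temporary
IL_0029: ret // ret ends the method; if a value is on the stack, it is used as the return value
}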
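Compared against the C# code above, the listing should be fairly clear. Some readers may already be confused by ldarg, starg, ldloc, and stloc, but they become natural once you are used to them. They are much easier to follow than mov, and the variables they name are much clearer than registers such as EAX.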
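To see the four instructions in isolation, here is a minimal sketch that emits them with System.Reflection.Emit. The class name FourOpcodes and the method name AddLength are made up for illustration, and the emitted method is static, so argument 0 is the first parameter rather than this as in the Level3 listing.

```csharp
using System;
using System.Reflection.Emit;

public static class FourOpcodes
{
    // Builds a method equivalent to:
    //   static int AddLength(int a, int n) { int tmp = a + n; a = tmp; return a; }
    public static Func<int, int, int> Build()
    {
        var dm = new DynamicMethod("AddLength", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = dm.GetILGenerator();
        il.DeclareLocal(typeof(int));       // local 0: tmp

        il.Emit(OpCodes.Ldarg_0);           // push a
        il.Emit(OpCodes.Ldarg_1);           // push n
        il.Emit(OpCodes.Add);               // pop both, push a + n
        il.Emit(OpCodes.Stloc_0);           // pop the sum into tmp
        il.Emit(OpCodes.Ldloc_0);           // push tmp
        il.Emit(OpCodes.Starg_S, (byte)0);  // pop it into argument slot 0 (a)
        il.Emit(OpCodes.Ldarg_0);           // push a again
        il.Emit(OpCodes.Ret);               // return the value on top of the stack

        return (Func<int, int, int>)dm.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Func<int, int, int> addLength = Build();
        Console.WriteLine(addLength(3, 17)); // prints 20
    }
}
```

Every ld* instruction pushes one value onto the evaluation stack and every st* instruction pops one, which is the whole difference between the two families.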
Below are the official descriptions of these four instructions:
• ldarg.<length> - load argument onto the stack
• starg.<length> - store a value in an argument slot
• ldloc - load local variable onto the stack
• stloc - pop value from stack to local variable
Format | Assembly Format | Description |
FE 09 <unsigned int16> | ldarg num | Load argument numbered num onto stack. |
0E <unsigned int8> | ldarg.s num | Load argument numbered num onto stack, short form. |
02 | ldarg.0 | Load argument 0 onto stack |
03 | ldarg.1 | Load argument 1 onto stack |
04 | ldarg.2 | Load argument 2 onto stack |
05 | ldarg.3 | Load argument 3 onto stack |
Stack Transition:
… → …, value
Description:
The ldarg num instruction pushes the num'th incoming argument, where arguments are numbered 0 onwards (see Partition I), onto the evaluation stack. The ldarg instruction can be used to load a value type or a built-in value onto the stack by copying it from an incoming argument. The type of the value is the same as the type of the argument, as specified by the current method's signature.
The ldarg.0, ldarg.1, ldarg.2, and ldarg.3 instructions are efficient encodings for loading any of the first 4 arguments. The ldarg.s instruction is an efficient encoding for loading argument numbers 4 through 255.
For procedures that take a variable-length argument list, the ldarg instructions can be used only for the initial fixed arguments, not those in the variable part of the signature. (See the arglist instruction)
Arguments that hold an integer value smaller than 4 bytes long are expanded to type int32 when they are loaded onto the stack. Floating-point values are expanded to their native size (type F).
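The three encodings in the table can be summarized as a rule for picking the shortest form. The sketch below is only an illustration: the class LdargEncoder and the method EncodeLdarg are hypothetical names, not part of any .NET API, and the two-byte operand is written low byte first, as IL operands are little-endian.

```csharp
using System;

public static class LdargEncoder
{
    // Returns the shortest byte encoding of "load argument num" per the table above.
    public static byte[] EncodeLdarg(int num)
    {
        if (num < 0 || num > 0xFFFF)
            throw new ArgumentOutOfRangeException("num");
        if (num <= 3)
            return new byte[] { (byte)(0x02 + num) };        // ldarg.0 .. ldarg.3
        if (num <= 0xFF)
            return new byte[] { 0x0E, (byte)num };           // ldarg.s num
        return new byte[] { 0xFE, 0x09, (byte)(num & 0xFF), (byte)(num >> 8) }; // ldarg num
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(EncodeLdarg(1)));   // prints 03
        Console.WriteLine(BitConverter.ToString(EncodeLdarg(4)));   // prints 0E-04
        Console.WriteLine(BitConverter.ToString(EncodeLdarg(300))); // prints FE-09-2C-01
    }
}
```

The ldloc and stloc tables follow the same three-tier pattern with different opcode bytes, while starg has only the short and long forms.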
Exceptions:
None.
Verifiability:
Correct CIL guarantees that num is a valid argument index. See Section 1.5 for more details on how verification determines the type of the value loaded onto the stack.
Format | Assembly Format | Description |
FE 0B <unsigned int16> | starg num | Store a value to the argument numbered num |
10 <unsigned int8> | starg.s num | Store a value to the argument numbered num, short form |
Stack Transition:
…, value → …
Description:
The starg num instruction pops a value from the stack and places it in argument slot num (see Partition I). The type of the value must match the type of the argument, as specified in the current method's signature. The starg.s instruction provides an efficient encoding for use with the first 256 arguments.
For procedures that take a variable argument list, the starg instructions can be used only for the initial fixed arguments, not those in the variable part of the signature.
Storing into arguments that hold an integer value smaller than 4 bytes long truncates the value as it moves from the stack to the argument. Floating-point values are rounded from their native size (type F) to the size associated with the argument.
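The widening on load and the truncation on store are easy to observe from C#. This is a small sketch of the arithmetic only, with hypothetical names (WidenTruncate, LoadAndAdd, StoreAsByte); in C# the narrowing has to be written as an explicit cast, so it illustrates the same rule rather than the starg instruction itself.

```csharp
using System;

public static class WidenTruncate
{
    // Loading: a byte-sized value takes part in arithmetic as a 32-bit integer.
    public static int LoadAndAdd(byte small, int delta)
    {
        return small + delta;
    }

    // Storing: only the low 8 bits survive when the value goes back into a byte-sized slot.
    public static byte StoreAsByte(int value)
    {
        return unchecked((byte)value);
    }

    public static void Main()
    {
        int widened = LoadAndAdd(200, 100);    // 300, which no longer fits in a byte
        byte truncated = StoreAsByte(widened); // 300 & 0xFF = 44
        Console.WriteLine(widened + " -> " + truncated); // prints 300 -> 44
    }
}
```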
Exceptions:
None.
Verifiability:
Correct CIL requires that num is a valid argument slot.
Verification also checks that the verification type of value matches the type of the argument, as specified in the current method's signature (verification types are less detailed than CLI types).
Format | Assembly Format | Description |
FE 0C <unsigned int16> | ldloc indx | Load local variable of index indx onto stack. |
11 <unsigned int8> | ldloc.s indx | Load local variable of index indx onto stack, short form. |
06 | ldloc.0 | Load local variable 0 onto stack. |
07 | ldloc.1 | Load local variable 1 onto stack. |
08 | ldloc.2 | Load local variable 2 onto stack. |
09 | ldloc.3 | Load local variable 3 onto stack. |
Stack Transition:
… → …, value
Description:
The ldloc indx instruction pushes the contents of the local variable number indx onto the evaluation stack, where local variables are numbered 0 onwards. Local variables are initialized to 0 before entering the method only if the initialize flag on the method is true (see Partition I). The ldloc.0, ldloc.1, ldloc.2, and ldloc.3 instructions provide an efficient encoding for accessing the first four local variables. The ldloc.s instruction provides an efficient encoding for accessing local variables 4 through 255.
The type of the value is the same as the type of the local variable, which is specified in the method header. See Partition I.
Local variables that are smaller than 4 bytes long are expanded to type int32 when they are loaded onto the stack. Floating-point values are expanded to their native size (type F).
Exceptions:
VerificationException is thrown if the “zero initialize” bit for this method has not been set, and the assembly containing this method has not been granted SecurityPermission.SkipVerification (and the CLI does not perform automatic definite-assignment analysis)
Verifiability:
Correct CIL ensures that indx is a valid local index. See Section 1.5 for more details on how verification determines the type of a local variable. For the ldloc indx instruction, indx must lie in the range 0 to 65534 inclusive (specifically, 65535 is not valid)
Rationale: The reason for excluding 65535 is pragmatic: likely implementations will use a 2-byte integer to track both a local's index, as well as the total number of locals for a given method. If an index of 65535 had been made legal, it would require a wider integer to track the number of locals in such a method.
Also, for verifiable code, this instruction must guarantee that it is not loading an uninitialized value – whether that initialization is done explicitly by having set the “zero initialize” bit for the method, or by previous instructions (where the CLI performs definite-assignment analysis)
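The rationale for excluding index 65535 comes down to one line of arithmetic: a method whose highest local index is k has k + 1 locals, and that count must itself fit in a 2-byte integer. A minimal sketch, with the hypothetical names LocalIndexLimit and CountFitsInUInt16:

```csharp
using System;

public static class LocalIndexLimit
{
    // True if a method whose highest local index is highestIndex can
    // still store its total number of locals in an unsigned 16-bit integer.
    public static bool CountFitsInUInt16(int highestIndex)
    {
        int count = highestIndex + 1; // indices start at 0
        return count <= ushort.MaxValue;
    }

    public static void Main()
    {
        Console.WriteLine(CountFitsInUInt16(65534)); // prints True  (65535 locals)
        Console.WriteLine(CountFitsInUInt16(65535)); // prints False (65536 locals)
    }
}
```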
Format | Assembly Format | Description |
FE 0E <unsigned int16> | stloc indx | Pop value from stack into local variable indx. |
13 <unsigned int8> | stloc.s indx | Pop value from stack into local variable indx , short form. |
0A | stloc.0 | Pop value from stack into local variable 0 . |
0B | stloc.1 | Pop value from stack into local variable 1 . |
0C | stloc.2 | Pop value from stack into local variable 2 . |
0D | stloc.3 | Pop value from stack into local variable 3 . |
Stack Transition:
…, value → …
Description:
The stloc indx instruction pops the top value off the evaluation stack and moves it into local variable number indx (see Partition I), where local variables are numbered 0 onwards. The type of value must match the type of the local variable as specified in the current method's locals signature. The stloc.0, stloc.1, stloc.2, and stloc.3 instructions provide an efficient encoding for the first four local variables; the stloc.s instruction provides an efficient encoding for local variables 4 through 255.
Storing into locals that hold an integer value smaller than 4 bytes long truncates the value as it moves from the stack to the local variable. Floating-point values are rounded from their native size (type F) to the size associated with the argument.
Exceptions:
None.
Verifiability:
Correct CIL requires that indx is a valid local index. For the stloc indx instruction, indx must lie in the range 0 to 65534 inclusive (specifically, 65535 is not valid)
Rationale: The reason for excluding 65535 is pragmatic: likely implementations will use a 2-byte integer to track both a local's index, as well as the total number of locals for a given method. If an index of 65535 had been made legal, it would require a wider integer to track the number of locals in such a method.
Verification also checks that the verification type of value matches the type of the local, as specified in the current method's locals signature.
All of the official documentation is in: D:/Program Files/Microsoft Visual Studio .NET 2003/SDK/v1.1/Tool Developers Guide/docs. Interested readers may want to read through it.
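To compare the opcode bytes in the tables above with real compiled code, the raw IL of a method can be read back through reflection. This is a sketch under assumptions: IlDump, Add, and GetIl are hypothetical names, and the exact bytes printed depend on the compiler and build settings, so no output is shown.

```csharp
using System;
using System.Reflection;

public static class IlDump
{
    // Hypothetical sample method; any small method works here.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the raw IL bytes of a public static method on this class, looked up by name.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo method = typeof(IlDump).GetMethod(methodName, BindingFlags.Public | BindingFlags.Static);
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        byte[] il = GetIl("Add");
        // In the dump, 02 is ldarg.0, 03 is ldarg.1, 58 is add and 2A is ret;
        // unoptimized builds usually add nop, stloc/ldloc and br.s around them.
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```

Because Add is static, ldarg.0 and ldarg.1 are a and b here; in an instance method such as Level3, ldarg.0 is this.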