MSIL - the language of the CLR (Part 3)

Introduction

In this third and final part of the MSIL series we look at how various high-level language constructs are represented in MSIL, through a series of practical examples.

ILDasm.exe

ILDasm.exe disassembles a managed assembly so that you can inspect its metadata and the MSIL generated for each method. It is a GUI tool and is found in the same folder as ilasm.exe.

Sample programmes

I feel like we have talked about a lot of the fluffy stuff up to this point. That background knowledge was necessary, and it will serve you well when you start to analyze managed modules yourself in your own software projects.

Just for fun (yes, for fun!) let's look at a simple program written in C#. Calling the following a program is probably overkill: it merely prints the numbers 0..9 using a for loop.

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main()
        {
            for (int i = 0; i < 10; i++)
            {
                Console.WriteLine(i);
            }
        }
    }
}


If you were to disassemble the binary for this program you would see that the generated MSIL is not especially clear, although it is still simple to follow. We will now replicate it in MSIL by hand. I have taken the liberty of making the implementation a lot clearer by splitting the code into two labels: Expr and Body. Expr tests whether i is less than 10; if it is, we branch to Body. Body prints the integer to the console, increments i, and branches back to Expr. When the test finally fails, execution falls through to ret and control is handed back to the caller.

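.assembly extern mscorlib {}

.assembly lesthan
{
    .ver 1:0:0:0
}

.module LessThan.exe

.method static void main()
{
    .entrypoint
    .maxstack 2

    .locals init (int32, int32)

    ldc.i4 10       // local 0 = 10 (the loop boundary)
    stloc.0
    ldc.i4 0        // local 1 = 0 (the counter, i)
    stloc.1

Expr:
    ldloc.1         // push i
    ldloc.0         // push the boundary
    blt Body        // branch to Body if i < boundary
    ret             // otherwise return to the caller

Body:
    ldloc.1         // push i as the argument to WriteLine
    call void [mscorlib]System.Console::WriteLine(int32)
    ldc.i4 1
    ldloc.1
    add             // i + 1
    stloc.1         // store the result back into i
    br Expr         // re-test the loop condition
}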

Hopefully that example was clear enough for you to grasp some fundamental control constructs within MSIL.
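
If the jump between the C# for loop and the hand-written MSIL feels large, it may help to see the same Expr/Body shape written back in C# using goto. This is only a sketch of the control flow in the listing above, not what the C# compiler emits; the class name LoopSketch is my own invention for the illustration.

```csharp
using System;

static class LoopSketch
{
    static void Main()
    {
        int boundary = 10;  // local 0 in the MSIL listing
        int i = 0;          // local 1 in the MSIL listing

    Expr:
        // mirrors: ldloc.1, ldloc.0, blt Body
        if (i < boundary) goto Body;
        return;             // mirrors: ret

    Body:
        Console.WriteLine(i);
        i = i + 1;          // mirrors: ldc.i4 1, ldloc.1, add, stloc.1
        goto Expr;          // mirrors: br Expr
    }
}
```

Run as a console program, this prints the numbers 0 to 9, one per line, exactly like the original for loop.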

Namespaces

Namespaces are easy to specify in MSIL: all that is required is .namespace Xxx, where Xxx is the identifier of the namespace.

.assembly extern mscorlib {}

.assembly lesthan
{
    .ver 1:0:0:0
}

.module LessThan.exe
.namespace Channel8
{

    .method static void main()
    {
        // ...
    }
}
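
To see what a namespace amounts to from the high-level side, here is a small C# sketch that asks a type for its namespace and full name at run time. The Program class is hypothetical and exists only for the illustration; the namespace name Channel8 is borrowed from the listing above.

```csharp
using System;

namespace Channel8
{
    class Program
    {
        static void Main()
        {
            Type t = typeof(Program);

            Console.WriteLine(t.Namespace);  // Channel8
            Console.WriteLine(t.Name);       // Program
            Console.WriteLine(t.FullName);   // Channel8.Program
        }
    }
}
```

The full name is simply the namespace and the type name joined by a dot, which is the form you will see throughout disassembled MSIL.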

Classes

In MSIL all types are fully qualified, that is, they are referred to by Namespace.TypeName. Inheritance from System.Object is also spelled out: in a high-level language we know that every class derives from System.Object, but we don't have to write it and we take the inheritance chain for granted. Remember that at the MSIL level we have to be a lot more atomic about what we are doing, and everything is explicit. In the constructor below, for example, the call to the System.Object constructor that C# inserts for us appears as an ordinary instruction.

The following is an example of a constructor that takes two string arguments and assigns their values to the respective private fields.

.method public hidebysig specialname rtspecialname
        instance void .ctor(string firstName,
                            string lastName) cil managed
{
    // Code size 24 (0x18)
    .maxstack 8
    IL_0000: ldarg.0
    IL_0001: call instance void [mscorlib]System.Object::.ctor()
    IL_0006: nop
    IL_0007: nop
    IL_0008: ldarg.0
    IL_0009: ldarg.1
    IL_000a: stfld string ClassLibrary1.Person::_firstName
    IL_000f: ldarg.0
    IL_0010: ldarg.2
    IL_0011: stfld string ClassLibrary1.Person::_lastName
    IL_0016: nop
    IL_0017: ret
} // end of method Person::.ctor

You may have noticed that .ctor is a special reserved name, which is why the method is also marked specialname and rtspecialname.
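
For reference, a C# class along the following lines would produce a constructor like the one above. I am reconstructing the source from the disassembly, so treat the exact shape of Person as an assumption; the CtorDemo class is hypothetical and only there to query the type through reflection.

```csharp
using System;
using System.Reflection;

namespace ClassLibrary1
{
    public class Person
    {
        private string _firstName;
        private string _lastName;

        // No base call is written here; the compiler adds the call
        // to System.Object::.ctor() that shows up in the MSIL.
        public Person(string firstName, string lastName)
        {
            _firstName = firstName;
            _lastName = lastName;
        }
    }

    static class CtorDemo
    {
        static void Main()
        {
            Type t = typeof(Person);
            Console.WriteLine(t.FullName);   // ClassLibrary1.Person
            Console.WriteLine(t.BaseType);   // System.Object

            ConstructorInfo ctor = t.GetConstructor(
                new[] { typeof(string), typeof(string) });
            Console.WriteLine(ctor.Name);           // .ctor
            Console.WriteLine(ctor.IsSpecialName);  // True
        }
    }
}
```

The compiler may warn that the two fields are assigned but never read; that is expected in such a stripped-down class.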

Methods

Methods are easy to identify as they are denoted by .method followed by the appropriate method signature.

The following method actually started life as a property in my high-level language, C#. Properties are a high-level language abstraction; their implementation in MSIL follows the pattern get_Xxx and set_Xxx, where Xxx is the name of the property.

.method public hidebysig specialname instance string
        get_LastName() cil managed
{
    // Code size 12 (0xc)
    .maxstack 1
    .locals init ([0] string CS$1$0000)
    IL_0000: nop
    IL_0001: ldarg.0
    IL_0002: ldfld string ClassLibrary1.Person::_lastName
    IL_0007: stloc.0
    IL_0008: br.s IL_000a
    IL_000a: ldloc.0
    IL_000b: ret
} // end of method Person::get_LastName

Notice the instance keyword in the signature: the get_LastName() method is associated with an instance of the Person class, and ldarg.0 loads that instance (this) before the field is read.
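
You can confirm the get_Xxx and set_Xxx naming pattern without a disassembler by asking reflection for a property's accessor methods. The sketch below assumes a simplified Person class of my own with a LastName property; the initial value "Smith" and the PropertyDemo class are made up for the illustration.

```csharp
using System;
using System.Reflection;

public class Person
{
    private string _lastName = "Smith";  // hypothetical sample value

    public string LastName
    {
        get { return _lastName; }
        set { _lastName = value; }
    }
}

static class PropertyDemo
{
    static void Main()
    {
        PropertyInfo prop = typeof(Person).GetProperty("LastName");

        Console.WriteLine(prop.GetGetMethod().Name);  // get_LastName
        Console.WriteLine(prop.GetSetMethod().Name);  // set_LastName

        // The getter is an ordinary instance method, so it can be
        // invoked directly on a Person instance.
        object value = prop.GetGetMethod().Invoke(new Person(), null);
        Console.WriteLine(value);  // Smith
    }
}
```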

Variables

In the loop example we have two variables: the first, stored at location 0, holds the value 10, and the second, stored at location 1, holds the value 0.

Giving the variables names (which is probably what I should have done) is easy. You could use the following instead of the original code:

.assembly extern mscorlib {}

.assembly lesthan
{
    .ver 1:0:0:0
}

.module LessThan.exe

.method static void main()
{
    .entrypoint
    .maxstack 2

    .locals init (int32 boundary, int32 i)

    ldc.i4 10
    stloc boundary
    ldc.i4 0
    stloc i

Expr:
    ldloc i
    ldloc boundary
    blt Body
    ret

Body:
    ldloc i
    call void [mscorlib]System.Console::WriteLine(int32)
    ldc.i4 1
    ldloc i
    add
    stloc i
    br Expr
}

The previous example uses the variable name i for the counter and boundary for the upper bound of our for loop.
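
The names are a convenience for whoever reads the MSIL; the instructions themselves address locals by slot. As a rough illustration, reflection describes each local of a method body by its index and type only. The LocalsDemo class and its Count method below are hypothetical, and the number of locals you see will depend on the compiler and build settings, since debug builds often add temporaries of their own.

```csharp
using System;
using System.Reflection;

static class LocalsDemo
{
    // A C# counterpart of the loop, kept in its own method so that
    // its locals can be inspected.
    static void Count()
    {
        int boundary = 10;
        for (int i = 0; i < boundary; i++)
        {
            Console.WriteLine(i);
        }
    }

    static void Main()
    {
        MethodInfo method = typeof(LocalsDemo).GetMethod(
            "Count", BindingFlags.Static | BindingFlags.NonPublic);
        MethodBody body = method.GetMethodBody();

        // Each local is reported as "slot index: type".
        foreach (LocalVariableInfo local in body.LocalVariables)
        {
            Console.WriteLine("{0}: {1}", local.LocalIndex, local.LocalType);
        }
    }
}
```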

Application entry point

Every executable (.exe) has a single entry point. In MSIL this is simply a case of placing the .entrypoint directive in the method that you want as the entry point for your program.

.method private hidebysig static void Main() cil managed
{
    .entrypoint
    // ...
    IL_0017: ret
} // end of method Program::Main
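
The method marked with .entrypoint can also be discovered at run time. The following is a minimal sketch, assuming it is run as an ordinary console executable; EntryPointDemo is a name I have made up, and the null check is there because some hosts have no managed entry assembly.

```csharp
using System;
using System.Reflection;

static class EntryPointDemo
{
    static void Main()
    {
        // The entry assembly is the executable that started the process.
        Assembly entry = Assembly.GetEntryAssembly();
        MethodInfo entryPoint = entry != null ? entry.EntryPoint : null;

        if (entryPoint == null)
        {
            Console.WriteLine("No managed entry point found");
            return;
        }

        // Report the method carrying the .entrypoint directive.
        Console.WriteLine("{0}::{1}",
            entryPoint.DeclaringType.FullName, entryPoint.Name);
    }
}
```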

Calling types/methods defined in external assemblies

When calling anything that is defined in another assembly you must not only fully qualify the type or method name, but also prefix the type or method with the name of the external assembly where it is defined in square brackets.

.method private hidebysig static void Main() cil managed
{
    .entrypoint
    // Code size 19 (0x13)
    .maxstack 1
    .locals init ([0] class [ClassLibrary1]ClassLibrary1.Class1 c)
    IL_0000: nop
    IL_0001: ldstr "Granville"
    IL_0006: call void [mscorlib]System.Console::WriteLine(string)
    IL_000b: nop
    IL_000c: newobj instance void [ClassLibrary1]ClassLibrary1.Class1::.ctor()
    IL_0011: stloc.0
    IL_0012: ret
} // end of method Program::Main

In the above code we prefix the call to the static method System.Console::WriteLine with [mscorlib] because that method is defined in the mscorlib assembly. Similarly, I have a trivial class called Class1 defined in an assembly called ClassLibrary1, hence the prefix [ClassLibrary1].

Remember that our manifest defines all the external assemblies that we reference, so if you examine the manifest you will see two references: one to mscorlib (always there) and the other to ClassLibrary1.

.assembly extern mscorlib
{
    .publickeytoken = (B7 7A 5C 56 19 34 E0 89 ) // .z/V.4..
    .ver 2:0:0:0
}
.assembly extern ClassLibrary1
{
    .ver 1:0:0:0
}
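
The same list of .assembly extern entries can be read programmatically. This is a small sketch using reflection; ReferencesDemo is a hypothetical name, and the references you see depend on your project and runtime, so on newer runtimes the core library may appear under a name other than mscorlib.

```csharp
using System;
using System.Reflection;

static class ReferencesDemo
{
    static void Main()
    {
        Assembly current = Assembly.GetExecutingAssembly();

        // One entry per external assembly recorded in the manifest.
        foreach (AssemblyName reference in current.GetReferencedAssemblies())
        {
            Console.WriteLine("{0}, Version={1}",
                reference.Name, reference.Version);
        }
    }
}
```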

Compiler differences in generated MSIL

Generally the MSIL generated by the VB.NET and C# compilers is very similar. C++/CLI, on the other hand, can generate MSIL that differs somewhat from the C# or VB.NET equivalent, for example:

using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main()
        {
            Console.WriteLine("Factorial of 5: {0}", Factorial(5));
        }

        public static int Factorial(int n)
        {
            if (n == 0) return 1;
            return n * Factorial(n - 1);
        }
    }
}

And:

using namespace System;

int Factorial(int n)
{
    if (n == 0) return 1;
    return n * Factorial(n - 1);
}

int main()
{
    Console::WriteLine(L"Factorial of 5: {0}", Factorial(5));
    return 0;
}

These two programs generate slightly different MSIL. With both built in debug mode, the C++/CLI version uses 14 MSIL instructions for Factorial whereas the C# version uses 22, one of which is a nop.
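
If you want a quick way to compare builds without opening ILDasm, you can ask reflection for the raw MSIL of a method. Note that this sketch reports the size of the method body in bytes, not the number of instructions, so it will not reproduce the counts quoted above; it is only a rough way to see that debug and release builds, or different compilers, produce different code.

```csharp
using System;
using System.Reflection;

static class Program
{
    public static int Factorial(int n)
    {
        if (n == 0) return 1;
        return n * Factorial(n - 1);
    }

    static void Main()
    {
        Console.WriteLine("Factorial of 5: {0}", Factorial(5));  // Factorial of 5: 120

        // Fetch the compiled body of Factorial and report its size.
        MethodBody body = typeof(Program).GetMethod("Factorial").GetMethodBody();
        byte[] il = body.GetILAsByteArray();
        Console.WriteLine("IL size: {0} bytes, max stack: {1}",
            il.Length, body.MaxStackSize);
    }
}
```

Building this once in debug and once in release and comparing the reported sizes is an easy first experiment.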

Summary

Hopefully this quick miniseries has given you the knowledge required to understand MSIL, the intermediate language used by the CLR. It would be unreasonable to expect you to program in MSIL: not only is it tedious and slow, but the high-level language compilers generally create highly optimized MSIL in release builds.

Take your knowledge forward by stripping programs down and seeing how the language constructs are implemented. A few suggestions include lambda expressions and automatic properties; both constructs look very different at the implementation level from their high-level language counterparts.
