Understanding C# async/await in Depth: Compilation
Part 1 - Understanding C# async/await in Depth: Compilation
Part 2 - Understanding C# async/await in Depth: The awaitable-awaiter pattern
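The async / await keywords are now part of C#. Like async and ! in F#, this new C# feature provides great convenience. There is already plenty of good documentation on how to use async / await in specific scenarios, such as using async methods in ASP.NET 4.5 and in ASP.NET MVC 4. This article explores the actual code behind the syntactic sugar.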
As MSDN states:
The async modifier indicates that the method, lambda expression, or anonymous method that it modifies is asynchronous.
Preparation
First, a few helper methods are needed.
using System.Linq;
using System.Net;
using System.Threading.Tasks;
internal class HelperMethods
{
private static void IO()
{
using (WebClient client = new WebClient())
{
Enumerable.Repeat("http://weblogs.asp.net/dixin", 10).Select(client.DownloadString).ToArray();
}
}
internal static int Method(int arg0, int arg1)
{
int result = arg0 + arg1;
IO(); // Do some long running IO.
return result;
}
internal static Task<int> MethodTask(int arg0, int arg1)
{
Task<int> task = new Task<int>(() => Method(arg0, arg1));
task.Start(); // Hot task (started task) should always be returned.
return task;
}
internal static void Before()
{
}
internal static void Continuation1(int arg)
{
}
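internal static void Continuation2(int arg)
{
}
// Used by the Task.Yield() examples later in this article.
internal static void Continuation(int arg)
{
}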
}
Here Method() is a long-running method that does some IO. MethodTask() wraps it in a Task and returns that Task. Nothing special here.
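The comment in MethodTask() says a hot (started) task should always be returned. As a minimal sketch of why, the following shows that a task created with the constructor stays in the Created state until Start() is called, so awaiting it before that would never finish. The names HotTaskDemo and Run are made up for this illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class HotTaskDemo
{
    // Returns the task status before Start(), the status after completion, and the result.
    public static (TaskStatus Cold, TaskStatus Finished, int Result) Run()
    {
        Task<int> task = new Task<int>(() => 1 + 2);
        TaskStatus cold = task.Status; // Created: a cold task, nothing is running yet.
        task.Start();                  // Now it is hot: scheduled to run.
        int result = task.Result;      // Blocks until the task finishes.
        return (cold, task.Status, result);
    }

    public static void Main()
    {
        var (cold, finished, result) = Run();
        Console.WriteLine($"{cold} -> {finished}, result = {result}");
    }
}
```

Task.Run() is a common shorthand that creates and starts the task in one call.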
Await something in async method
Since MethodTask() returns a Task, let's try to await it.
internal class AsyncMethods
{
internal static async Task<int> MethodAsync(int arg0, int arg1)
{
int result = await HelperMethods.MethodTask(arg0, arg1);
return result;
}
}
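Because the await keyword is used in the method body, the async modifier must be added to the method. Now the first async method is here. Following the naming convention, it has the Async suffix. Of course, as an async method, it can itself be awaited. So here is a CallMethodAsync() that calls MethodAsync():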
internal class AsyncMethods
{
internal static async Task<int> CallMethodAsync(int arg0, int arg1)
{
int result = await MethodAsync(arg0, arg1);
return result;
}
}
After compilation, MethodAsync() and CallMethodAsync() have the same logic. This is the code of MethodAsync():
internal class CompiledAsyncMethods
{
[DebuggerStepThrough]
[AsyncStateMachine(typeof(MethodAsyncStateMachine))] // async
internal static /*async*/ Task<int> MethodAsync(int arg0, int arg1)
{
MethodAsyncStateMachine methodAsyncStateMachine = new MethodAsyncStateMachine()
{
Arg0 = arg0,
Arg1 = arg1,
Builder = AsyncTaskMethodBuilder<int>.Create(),
State = -1
};
methodAsyncStateMachine.Builder.Start(ref methodAsyncStateMachine);
return methodAsyncStateMachine.Builder.Task;
}
}
The async keyword is gone. The method only creates and starts a state machine, MethodAsyncStateMachine, and all the actual logic is moved into that state machine:
[CompilerGenerated]
[StructLayout(LayoutKind.Auto)]
internal struct MethodAsyncStateMachine : IAsyncStateMachine
{
public int State;
public AsyncTaskMethodBuilder<int> Builder;
public int Arg0;
public int Arg1;
public int Result;
private TaskAwaiter<int> awaiter;
void IAsyncStateMachine.MoveNext()
{
try
{
if (this.State != 0)
{
this.awaiter = HelperMethods.MethodTask(this.Arg0, this.Arg1).GetAwaiter();
if (!this.awaiter.IsCompleted)
{
this.State = 0;
this.Builder.AwaitUnsafeOnCompleted(ref this.awaiter, ref this);
return;
}
}
else
{
this.State = -1;
}
this.Result = this.awaiter.GetResult();
}
catch (Exception exception)
{
this.State = -2;
this.Builder.SetException(exception);
return;
}
this.State = -2;
this.Builder.SetResult(this.Result);
}
[DebuggerHidden]
void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine param0)
{
this.Builder.SetStateMachine(param0);
}
}
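The generated code has been tidied up so that it is readable and can be compiled. Several things can be observed here:
- The async modifier is gone, which shows that, unlike other modifiers (such as static), there is no such thing as "async" at the IL/CLR level. It becomes an AsyncStateMachineAttribute. This is similar to how extension methods are compiled.
- The generated state machine is very similar to the state machine behind the C# yield syntactic sugar.
- The parameters and the local variable (arg0, arg1, result) are compiled into fields of the state machine.
- The real code (await HelperMethods.MethodTask(arg0, arg1)) is compiled into MoveNext() as: HelperMethods.MethodTask(this.Arg0, this.Arg1).GetAwaiter().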
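The first observation can be checked with reflection. The sketch below assumes nothing beyond the standard library; AsyncAttributeDemo, AddAsync, AddTask and GetStateMachineType are hypothetical names used only for illustration. It looks up AsyncStateMachineAttribute on an async method and on an ordinary Task-returning method.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class AsyncAttributeDemo
{
    // An async method: the compiler is expected to mark it with AsyncStateMachineAttribute.
    public static async Task<int> AddAsync(int a, int b)
    {
        await Task.Yield();
        return a + b;
    }

    // An ordinary method that happens to return a Task: no state machine is generated.
    public static Task<int> AddTask(int a, int b) => Task.FromResult(a + b);

    // Returns the compiler-generated state machine type, or null when the attribute is absent.
    public static Type GetStateMachineType(string methodName)
    {
        MethodInfo method = typeof(AsyncAttributeDemo).GetMethod(methodName);
        return method.GetCustomAttribute<AsyncStateMachineAttribute>()?.StateMachineType;
    }

    public static void Main()
    {
        Type stateMachine = GetStateMachineType(nameof(AddAsync));
        Console.WriteLine(stateMachine != null);
        Console.WriteLine(typeof(IAsyncStateMachine).IsAssignableFrom(stateMachine));
        Console.WriteLine(GetStateMachineType(nameof(AddTask)) == null);
    }
}
```

All three lines should print True. The generated type may be a struct or a class depending on the build configuration, but in both cases it implements IAsyncStateMachine.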
CallMethodAsync() creates and starts its own state machine, CallMethodAsyncStateMachine:
internal class CompiledAsyncMethods
{
[DebuggerStepThrough]
[AsyncStateMachine(typeof(CallMethodAsyncStateMachine))] // async
internal static /*async*/ Task<int> CallMethodAsync(int arg0, int arg1)
{
CallMethodAsyncStateMachine callMethodAsyncStateMachine = new CallMethodAsyncStateMachine()
{
Arg0 = arg0,
Arg1 = arg1,
Builder = AsyncTaskMethodBuilder<int>.Create(),
State = -1
};
callMethodAsyncStateMachine.Builder.Start(ref callMethodAsyncStateMachine);
return callMethodAsyncStateMachine.Builder.Task;
}
}
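CallMethodAsyncStateMachine has the same logic as MethodAsyncStateMachine above. The details of the state machine are discussed shortly. For now it is clear that:
- async/await is syntactic sugar at the C# level.
- There is no difference between awaiting an async method and awaiting a normal method. Any method that returns a Task can be awaited, or more precisely, Task objects are awaitable. What makes something awaitable is explained in Part 2.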
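As a small illustration of the second point, the sketch below awaits a plain Task-returning method and an async method in exactly the same way. AwaitAnyTaskDemo and its members are hypothetical names, not part of the article's code.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitAnyTaskDemo
{
    // A normal (non-async) method that returns a hot Task.
    public static Task<int> AddTask(int a, int b) => Task.Run(() => a + b);

    // An async method wrapping the same work.
    public static async Task<int> AddAsync(int a, int b) => await AddTask(a, b);

    // The caller awaits both the same way: what is awaited is the Task, not the method.
    public static async Task<int> SumBothAsync()
    {
        int fromNormalMethod = await AddTask(1, 2);
        int fromAsyncMethod = await AddAsync(3, 4);
        return fromNormalMethod + fromAsyncMethod;
    }

    public static void Main()
    {
        Console.WriteLine(SumBothAsync().GetAwaiter().GetResult()); // 10
    }
}
```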
State machine and continuation
To demonstrate more details of the state machine, a more complex method can be created:
internal class AsyncMethods
{
internal static async Task<int> MultiCallMethodAsync(int arg0, int arg1, int arg2, int arg3)
{
HelperMethods.Before();
int resultOfAwait1 = await MethodAsync(arg0, arg1);
HelperMethods.Continuation1(resultOfAwait1);
int resultOfAwait2 = await MethodAsync(arg2, arg3);
HelperMethods.Continuation2(resultOfAwait2);
int resultToReturn = resultOfAwait1 + resultOfAwait2;
return resultToReturn;
}
}
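In this method:
- There are multiple awaits.
- There is code before each await, and continuation code after each await.
After compilation, this multi-await method has the same shape as the single-await methods above: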
internal class CompiledAsyncMethods
{
[DebuggerStepThrough]
[AsyncStateMachine(typeof(MultiCallMethodAsyncStateMachine))] // async
internal static /*async*/ Task<int> MultiCallMethodAsync(int arg0, int arg1, int arg2, int arg3)
{
MultiCallMethodAsyncStateMachine multiCallMethodAsyncStateMachine = new MultiCallMethodAsyncStateMachine()
{
Arg0 = arg0,
Arg1 = arg1,
Arg2 = arg2,
Arg3 = arg3,
Builder = AsyncTaskMethodBuilder<int>.Create(),
State = -1
};
multiCallMethodAsyncStateMachine.Builder.Start(ref multiCallMethodAsyncStateMachine);
return multiCallMethodAsyncStateMachine.Builder.Task;
}
}
It also creates a state machine, MultiCallMethodAsyncStateMachine, with the following logic:
[CompilerGenerated]
[StructLayout(LayoutKind.Auto)]
internal struct MultiCallMethodAsyncStateMachine : IAsyncStateMachine
{
public int State;
public AsyncTaskMethodBuilder<int> Builder;
public int Arg0;
public int Arg1;
public int Arg2;
public int Arg3;
public int ResultOfAwait1;
public int ResultOfAwait2;
public int ResultToReturn;
private TaskAwaiter<int> awaiter;
void IAsyncStateMachine.MoveNext()
{
try
{
switch (this.State)
{
case -1:
HelperMethods.Before();
this.awaiter = AsyncMethods.MethodAsync(this.Arg0, this.Arg1).GetAwaiter();
if (!this.awaiter.IsCompleted)
{
this.State = 0;
this.Builder.AwaitUnsafeOnCompleted(ref this.awaiter, ref this);
break;
}
goto case 0; // Already completed: continue synchronously instead of stopping here.
case 0:
this.ResultOfAwait1 = this.awaiter.GetResult();
HelperMethods.Continuation1(this.ResultOfAwait1);
this.awaiter = AsyncMethods.MethodAsync(this.Arg2, this.Arg3).GetAwaiter();
if (!this.awaiter.IsCompleted)
{
this.State = 1;
this.Builder.AwaitUnsafeOnCompleted(ref this.awaiter, ref this);
break;
}
goto case 1; // Already completed: continue synchronously instead of stopping here.
case 1:
this.ResultOfAwait2 = this.awaiter.GetResult();
HelperMethods.Continuation2(this.ResultOfAwait2);
this.ResultToReturn = this.ResultOfAwait1 + this.ResultOfAwait2;
this.State = -2;
this.Builder.SetResult(this.ResultToReturn);
break;
}
}
catch (Exception exception)
{
this.State = -2;
this.Builder.SetException(exception);
}
}
[DebuggerHidden]
void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine stateMachine)
{
this.Builder.SetStateMachine(stateMachine);
}
}
The code above has been tidied up, but there is still a lot going on. To keep it simple, the state machine can be rewritten as:
[CompilerGenerated]
[StructLayout(LayoutKind.Auto)]
internal struct MultiCallMethodAsyncStateMachine : IAsyncStateMachine
{
// State:
// -1: The state machine begins; set in MultiCallMethodAsync.
// 0: The 1st await is done.
// 1: The 2nd await is done.
// ...
// -2: All awaits are done.
public int State;
public TaskCompletionSource<int> ResultToReturn; // int resultToReturn ...
public int Arg0; // int arg0
public int Arg1; // int arg1
public int Arg2; // int arg2
public int Arg3; // int arg3
public int ResultOfAwait1; // int resultOfAwait1 ...
public int ResultOfAwait2; // int resultOfAwait2 ...
private Task<int> currentTaskToAwait;
public void MoveNext()
{
try
{
MultiCallMethodAsyncStateMachine that = this;
switch (this.State)
{
case -1: // -1 is the beginning of the state machine.
HelperMethods.Before(); // Code before 1st await.
that.currentTaskToAwait = AsyncMethods.MethodAsync(that.Arg0, that.Arg1); // 1st task to await
// When currentTaskToAwait is done, ContinueWith calls MoveNext() again and goes to case 0.
that.State = 0;
that.currentTaskToAwait.ContinueWith(_ => { that.MoveNext(); }); // Callback
break;
case 0: // Now 1st await is done.
that.ResultOfAwait1 = that.currentTaskToAwait.Result; // Get 1st await's result.
HelperMethods.Continuation1(that.ResultOfAwait1); // Code after 1st await and before 2nd await.
that.currentTaskToAwait = AsyncMethods.MethodAsync(that.Arg2, that.Arg3); // 2nd task to await
// When this.currentTaskToAwait is done, run this.MoveNext() and go to case 1.
that.State = 1;
that.currentTaskToAwait.ContinueWith(_ => { that.MoveNext(); }); // Callback
break;
case 1: // Now 2nd await is done.
this.ResultOfAwait2 = this.currentTaskToAwait.Result; // Get 2nd await's result.
HelperMethods.Continuation2(this.ResultOfAwait2); // Code after 2nd await.
int resultToReturn = this.ResultOfAwait1 + this.ResultOfAwait2; // Code after 2nd await.
// Return resultToReturn. MoveNext() will not be called again.
this.State = -2; // -2 is end.
this.ResultToReturn.SetResult(resultToReturn);
break;
}
}
catch (Exception exception)
{
// End with exception.
this.State = -2; // -2 is end. An exception also ends the execution of the state machine.
this.ResultToReturn.SetException(exception);
}
}
/// <summary>
/// Configures the state machine with a heap-allocated replica.
/// </summary>
/// <param name="stateMachine">The heap-allocated replica.</param>
[DebuggerHidden]
public void SetStateMachine(IAsyncStateMachine stateMachine) // Must be public to implement the interface implicitly.
{
// No core logic.
}
}
This modified version involves only Task and TaskCompletionSource. MultiCallMethodAsync() can be simplified as well:
[DebuggerStepThrough]
[AsyncStateMachine(typeof(MultiCallMethodAsyncStateMachine))] // async
internal static /*async*/ Task<int> MultiCallMethodAsync_(int arg0, int arg1, int arg2, int arg3)
{
MultiCallMethodAsyncStateMachine multiCallMethodAsyncStateMachine = new MultiCallMethodAsyncStateMachine()
{
Arg0 = arg0,
Arg1 = arg1,
Arg2 = arg2,
Arg3 = arg3,
ResultToReturn = new TaskCompletionSource<int>(),
// -1: Begin
// 0: 1st await is done
// 1: 2nd await is done
// ...
// -2: End
State = -1
};
(multiCallMethodAsyncStateMachine as IAsyncStateMachine).MoveNext(); // The original code is in this method.
return multiCallMethodAsyncStateMachine.ResultToReturn.Task;
}
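Now the whole state machine becomes very clear. It is all about callbacks:
- The original code is split into fragments by "await", and each fragment becomes a "case" in the state machine. Here the 2 awaits split the code into 3 fragments, so there are 3 "case"s.
- These fragments are chained through callbacks, via Builder.AwaitUnsafeOnCompleted(callback) in the compiled code, or currentTaskToAwait.ContinueWith(callback) in the simplified code.
- A fragment ends by producing a Task (the one being awaited). When that Task completes, it calls back the next fragment.
- The state of the state machine works together with the "case"s to make sure the fragments execute one after another, in order.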
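The points above can be sketched as a small state machine that actually runs. This is an illustration rather than what the compiler emits: it is a class instead of a struct, so the lambdas can capture this directly (lambdas inside a struct cannot, which is why the simplified struct copies itself into a local first), and TwoAwaitStateMachine, Completion and Trace are hypothetical names. It mirrors: int a = await Task.Run(() => 1); int b = await Task.Run(() => 2); return a + b;

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class TwoAwaitStateMachine
{
    private readonly TaskCompletionSource<int> completion = new TaskCompletionSource<int>();
    private readonly List<string> trace = new List<string>();
    private int state = -1;
    private int first;
    private Task<int> current;

    public Task<int> Completion => this.completion.Task;
    public IReadOnlyList<string> Trace => this.trace;

    public void MoveNext()
    {
        try
        {
            switch (this.state)
            {
                case -1: // Fragment before the 1st await.
                    this.trace.Add("case -1");
                    this.current = System.Threading.Tasks.Task.Run(() => 1);
                    this.state = 0; // Set the next state before registering the callback.
                    this.current.ContinueWith(_ => this.MoveNext());
                    break;
                case 0: // Fragment between the 1st and 2nd await.
                    this.trace.Add("case 0");
                    this.first = this.current.Result;
                    this.current = System.Threading.Tasks.Task.Run(() => 2);
                    this.state = 1;
                    this.current.ContinueWith(_ => this.MoveNext());
                    break;
                case 1: // Fragment after the 2nd await.
                    this.trace.Add("case 1");
                    this.state = -2;
                    this.completion.SetResult(this.first + this.current.Result);
                    break;
            }
        }
        catch (Exception exception)
        {
            this.state = -2;
            this.completion.SetException(exception);
        }
    }
}

public static class TwoAwaitProgram
{
    public static void Main()
    {
        var machine = new TwoAwaitStateMachine();
        machine.MoveNext(); // Runs the first fragment; the rest is driven by callbacks.
        Console.WriteLine(machine.Completion.Result);        // 3
        Console.WriteLine(string.Join(", ", machine.Trace)); // case -1, case 0, case 1
    }
}
```

Each fragment is added to the trace exactly once and in order, because a fragment only runs after the task produced by the previous fragment has completed.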
It is like callbacks
Since it is like callbacks, the simplification can go even further: the whole state machine can be replaced with Task.ContinueWith(). MultiCallMethodAsync() then becomes:
internal static Task<int> MultiCallMethodAsync(int arg0, int arg1, int arg2, int arg3)
{
TaskCompletionSource<int> taskCompletionSource = new TaskCompletionSource<int>();
try
{
HelperMethods.Before();
MethodAsync(arg0, arg1).ContinueWith(await1 =>
{
try
{
int resultOfAwait1 = await1.Result;
HelperMethods.Continuation1(resultOfAwait1);
MethodAsync(arg2, arg3).ContinueWith(await2 =>
{
try
{
int resultOfAwait2 = await2.Result;
HelperMethods.Continuation2(resultOfAwait2);
int resultToReturn = resultOfAwait1 + resultOfAwait2;
taskCompletionSource.SetResult(resultToReturn);
}
catch (Exception exception)
{
taskCompletionSource.SetException(exception);
}
});
}
catch (Exception exception)
{
taskCompletionSource.SetException(exception);
}
});
}
catch (Exception exception)
{
taskCompletionSource.SetException(exception);
}
return taskCompletionSource.Task;
}
For comparison, here is the original async/await code:
internal static async Task<int> MultiCallMethodAsync(int arg0, int arg1, int arg2, int arg3)
{
HelperMethods.Before();
int resultOfAwait1 = await MethodAsync(arg0, arg1);
HelperMethods.Continuation1(resultOfAwait1);
int resultOfAwait2 = await MethodAsync(arg2, arg3);
HelperMethods.Continuation2(resultOfAwait2);
int resultToReturn = resultOfAwait1 + resultOfAwait2;
return resultToReturn;
}
For easier reading, the callback code above can be reformatted to line up with the original code:
internal static Task<int> MultiCallMethodAsync(int arg0, int arg1, int arg2, int arg3)
{
TaskCompletionSource<int> taskCompletionSource = new TaskCompletionSource<int>(); try {
// Original code begins.
HelperMethods.Before();
// int resultOfAwait1 = await MethodAsync(arg0, arg1);
MethodAsync(arg0, arg1).ContinueWith(await1 => { try { int resultOfAwait1 = await1.Result;
HelperMethods.Continuation1(resultOfAwait1);
// int resultOfAwait2 = await MethodAsync(arg2, arg3);
MethodAsync(arg2, arg3).ContinueWith(await2 => { try { int resultOfAwait2 = await2.Result;
HelperMethods.Continuation2(resultOfAwait2);
int resultToReturn = resultOfAwait1 + resultOfAwait2;
// return resultToReturn;
taskCompletionSource.SetResult(resultToReturn);
// Original code ends.
} catch (Exception exception) { taskCompletionSource.SetException(exception); }});
} catch (Exception exception) { taskCompletionSource.SetException(exception); }});
} catch (Exception exception) { taskCompletionSource.SetException(exception); }
return taskCompletionSource.Task;
}
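Yes, that is the magic of C# async/await:
- await literally pretends to wait. If the awaited Task has not completed yet, the async method returns a Task to its caller immediately, so the calling thread is not blocked. The code that follows is compiled into a callback of the awaited Task.
- When that Task completes, the continuation code is executed.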
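The "pretend to wait" behavior can be observed directly. In the sketch below (AwaitReturnsEarlyDemo, WorkAsync and Run are hypothetical names), a TaskCompletionSource controls when the awaited Task completes, so the order of events is deterministic: the async method returns to its caller at the await, and the continuation runs only after the awaited Task is completed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class AwaitReturnsEarlyDemo
{
    private static async Task WorkAsync(Task<int> pending, ConcurrentQueue<string> log)
    {
        log.Enqueue("before await");
        int value = await pending; // pending is not completed yet, so the method returns here.
        log.Enqueue("continuation got " + value);
    }

    // Returns the recorded events in the order they happened.
    public static string[] Run()
    {
        var log = new ConcurrentQueue<string>();
        var source = new TaskCompletionSource<int>();
        Task work = WorkAsync(source.Task, log);
        log.Enqueue("caller resumed, work completed: " + work.IsCompleted);
        source.SetResult(42); // Completing the awaited Task triggers the continuation.
        work.Wait();
        return log.ToArray();
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
        // before await
        // caller resumed, work completed: False
        // continuation got 42
    }
}
```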
However, the callback code above has a context handling problem at runtime, which will be explained and fixed in Part 3.
Even so, this compile-time transformation is what lets asynchronous code be written in a synchronous-looking flow without blocking the calling thread.
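To give a rough idea of what the context problem is about, the sketch below installs a custom SynchronizationContext that counts how many callbacks are posted to it. It relies on documented default behavior (await captures SynchronizationContext.Current, while ContinueWith() uses the current TaskScheduler) and assumes a console-style environment where blocking with Wait() is safe. CountingContext, ContextDemo and CountPosts are hypothetical names, and this is an illustration, not the fix described in Part 3.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class CountingContext : SynchronizationContext
{
    private int postCount;

    public int PostCount => Volatile.Read(ref this.postCount);

    public override void Post(SendOrPostCallback d, object state)
    {
        Interlocked.Increment(ref this.postCount);
        base.Post(d, state); // Default behavior: queue the callback to the thread pool.
    }
}

public static class ContextDemo
{
    public static async Task AwaitVersionAsync()
    {
        await Task.Delay(50); // The continuation is posted back to the captured context.
    }

    public static Task ContinueWithVersion()
    {
        var completion = new TaskCompletionSource<object>();
        Task.Delay(50).ContinueWith(_ => completion.SetResult(null)); // Ignores the context.
        return completion.Task;
    }

    // Runs the method under a CountingContext and returns how many callbacks were posted to it.
    public static int CountPosts(Func<Task> method)
    {
        SynchronizationContext previous = SynchronizationContext.Current;
        var context = new CountingContext();
        SynchronizationContext.SetSynchronizationContext(context);
        try
        {
            method().Wait();
            return context.PostCount;
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    public static void Main()
    {
        Console.WriteLine("await: " + CountPosts(AwaitVersionAsync));
        Console.WriteLine("ContinueWith: " + CountPosts(ContinueWithVersion));
    }
}
```

Under these assumptions the await version is expected to report 1 post and the ContinueWith() version 0, which is the gap between real await and the hand-written callbacks.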
Use Task.Yield()
Task.Yield() is an interesting built-in API:
You can use await Task.Yield(); in an asynchronous method to force the method to complete asynchronously.
For example:
internal static void NoYield()
{
HelperMethods.Before();
HelperMethods.Continuation(0);
// Returns after HelperMethods.Continuation(0) finishes execution.
}
internal static async Task YieldAsync()
{
HelperMethods.Before();
await Task.Yield(); // Returns without waiting for continuation code to execute.
HelperMethods.Continuation(0);
}
Here, await Task.Yield(); means the compiler compiles the following HelperMethods.Continuation(0); as a callback. So, in the same way, it can be rewritten as:
internal static Task YieldAsync()
{
TaskCompletionSource<object> taskCompletionSource = new TaskCompletionSource<object>();
try
{
HelperMethods.Before();
Task yieldTask = new Task(() => { });
yieldTask.Start();
yieldTask.ContinueWith(_ =>
{
try
{
HelperMethods.Continuation(0);
taskCompletionSource.SetResult(null);
}
catch (Exception exception)
{
taskCompletionSource.SetException(exception);
}
});
}
catch (Exception exception)
{
taskCompletionSource.SetException(exception);
}
return taskCompletionSource.Task;
}
TaskCompletionSource<object> is used here because .NET Framework does not provide a non-generic TaskCompletionSource class (one was added later, in .NET 5).
Similarly, the above can be reformatted as:
internal static Task YieldAsync()
{
TaskCompletionSource<object> taskCompletionSource = new TaskCompletionSource<object>(); try {
// Original code begins.
HelperMethods.Before();
// await Task.Yield();
Task yieldTask = new Task(() => { }); yieldTask.Start(); yieldTask.ContinueWith(_ => { try {
HelperMethods.Continuation(0);
// Original code ends.
taskCompletionSource.SetResult(null);
} catch (Exception exception) { taskCompletionSource.SetException(exception); }});
} catch (Exception exception) { taskCompletionSource.SetException(exception); }
return taskCompletionSource.Task;
}
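In other words, Task.Yield() makes the method return right here and schedules the continuation code to run asynchronously, which gives other tasks a chance to be scheduled first. This is similar in concept to setTimeout() in JavaScript: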
var sync = function () {
before();
continuation();
// Returns after continuation finishes execution.
};
var async = function () {
before();
setTimeout(continuation, 0);
// Returns immediately (after setTimeout finishes execution).
};
The difference is that JavaScript uses a single-threaded model.
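Back in C#, the effect of Task.Yield() can be made visible with a small experiment. The sketch below assumes a console-style environment (no single-threaded synchronization context) and .NET 4.6 or later for Task.CompletedTask; YieldDemo and its members are hypothetical names. A gate holds the continuation back until the caller has checked whether it ran, which keeps the result deterministic.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class YieldDemo
{
    public static async Task WithoutYieldAsync(Action continuation)
    {
        await Task.CompletedTask; // Already completed, so execution simply continues synchronously.
        continuation();
    }

    public static async Task WithYieldAsync(Action continuation)
    {
        await Task.Yield(); // Never reports completion, so the rest is always scheduled as a callback.
        continuation();
    }

    // Returns whether the continuation had already run when the async method returned to its caller.
    public static bool ContinuationRanBeforeReturn(bool useYield)
    {
        using (var gate = new ManualResetEventSlim(false))
        {
            bool ran = false;
            Action continuation = () =>
            {
                if (useYield)
                {
                    gate.Wait(); // Hold the callback until the caller has taken its snapshot.
                }
                ran = true;
            };
            Task task = useYield ? WithYieldAsync(continuation) : WithoutYieldAsync(continuation);
            bool ranBeforeReturn = ran; // Snapshot taken right after the method returned.
            gate.Set();
            task.Wait();
            return ranBeforeReturn;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ContinuationRanBeforeReturn(useYield: false)); // True
        Console.WriteLine(ContinuationRanBeforeReturn(useYield: true));  // False
    }
}
```

Without Task.Yield() the whole method body runs before the caller gets control back. With it, the caller resumes first and the continuation runs later on another thread.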
Again, the ContinueWith() callback code above has the same context handling problem at runtime, which will be explained and fixed in Part 3.