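When a long-lived connection carries request/response traffic, the vast majority of packets are small, and each small message consumes one IP packet at a cost of roughly 20-30 us. An ordinary gigabit NIC tops out at around 600,000 pps, so to improve the performance of IO-intensive applications on long-lived connections you need to coalesce packets. The underlying technique is known as scatter/gather IO or vectored IO.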
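On Linux, readv/writev exist for exactly this purpose: fewer system calls, fewer packets per second, and higher NIC throughput. For how readv speeds up reads, see how Chen Shuo's muduo uses it: it places a 64 KB array on the stack as the second buffer passed to readv, so that a single call can drain as much of the socket buffer as possible (see reference 5). I won't go into that here; the focus is on how to do a Gathering Write with DotNetty.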
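The muduo trick can be sketched in C# as well, because `Socket.Receive` has an overload that takes a list of buffers, which plays the role of readv in .NET. What follows is a minimal sketch under assumptions, not muduo's code: `ScatterReadDemo`, the 16-byte primary buffer, and the loopback setup are made up for illustration, and the 64 KB scratch buffer lives on the heap here rather than on the stack.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Threading;

public static class ScatterReadDemo
{
    // One Receive call fills the small application buffer first and spills
    // whatever does not fit into the large scratch buffer.
    public static (int InPrimary, int InOverflow) ReceiveOnce(
        Socket socket, byte[] primary, byte[] overflow)
    {
        var segments = new List<ArraySegment<byte>>
        {
            new ArraySegment<byte>(primary),
            new ArraySegment<byte>(overflow),
        };
        int total = socket.Receive(segments);
        int inPrimary = Math.Min(total, primary.Length);
        return (inPrimary, total - inPrimary);
    }

    // Hypothetical driver: sends 24 bytes over loopback and reads them back
    // with a 16-byte primary buffer plus a 64 KB overflow buffer.
    public static (int InPrimary, int InOverflow) Run()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        using var client = new TcpClient();
        client.Connect(IPAddress.Loopback, ((IPEndPoint)listener.LocalEndpoint).Port);
        using Socket server = listener.AcceptSocket();
        listener.Stop();

        client.Client.Send(new byte[24]);

        // Wait until all 24 bytes are queued so the single read sees them all.
        while (server.Available < 24)
        {
            Thread.Sleep(1);
        }

        return ReceiveOnce(server, new byte[16], new byte[64 * 1024]);
    }
}
```

In this sketch `ScatterReadDemo.Run()` reports 16 bytes in the primary buffer and 8 in the overflow buffer; a real reader would then append the overflow bytes to its growable buffer.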
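First you need a Channel to buffer outgoing writes (the message queue exposed as this.queue in the code below, not DotNetty's IChannel), so that business code can focus on business logic and networking code on networking. If every piece of business code calls WriteAndFlushAsync itself, there is little chance of coalescing sends.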
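As an illustration of such a write buffer, here is a minimal sketch built on `System.Threading.Channels`. It assumes messages are plain `byte[]` payloads, and `OutboundQueue`, `Enqueue`, and `DrainAsync` are hypothetical names, not part of DotNetty or of the author's code. It only records batch sizes where a real loop would write and flush.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class OutboundQueue
{
    private readonly Channel<byte[]> queue = Channel.CreateUnbounded<byte[]>(
        new UnboundedChannelOptions { SingleReader = true });

    // Business code calls this and returns immediately; it never touches the socket.
    public bool Enqueue(byte[] message) => this.queue.Writer.TryWrite(message);

    // Signals that no more messages will be enqueued.
    public void Complete() => this.queue.Writer.Complete();

    // The network side drains up to maxBatch queued messages per wake-up
    // and returns the size of every batch it formed.
    public async Task<List<int>> DrainAsync(int maxBatch)
    {
        var batchSizes = new List<int>();
        ChannelReader<byte[]> reader = this.queue.Reader;
        while (await reader.WaitToReadAsync())
        {
            int count = 0;
            while (count < maxBatch && reader.TryRead(out byte[] message))
            {
                count++; // a real loop would write `message` to the connection here
            }
            batchSizes.Add(count); // a real loop would flush here
        }
        return batchSizes;
    }
}
```

With 10 messages enqueued before the queue is completed, `DrainAsync(4)` forms batches of 4, 4, and 2, which is the grouping a send loop can then turn into three flushes instead of ten.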
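Then comes the main SendingLoop, which keeps calling TryRead to pull packets from the Channel, passes each one to WriteAsync, and flushes once every few packets. Orleans networking follows a similar idea.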
 1  public void RunSendLoopAsync(IChannel channel)
 2  {
 3      var allocator = channel.Allocator;
 4      var reader = this.queue.Reader;
 5      Task.Run(async () =>
 6      {
 7          while (!this.stop)
 8          {
 9              var more = await reader.WaitToReadAsync();
10              if (!more)
11              {
12                  break;
13              }
14
15              IOutboundMessage message = default;
16              var number = 0;
17              try
18              {
19                  while (number < 4 && reader.TryRead(out message))
20                  {
21                      Interlocked.Decrement(ref this.queueCount);
22                      var msg = message.Inner as IMessage;
23                      var buffer = msg.ToByteBuffer(allocator);
24                      await channel.WriteAsync(buffer);
25                      number++;
26                  }
27                  channel.Flush();
28                  number = 0;
29              }
30              catch (Exception e) when (message != default)
31              {
32                  logger.LogError("SendOutboundMessage Fail, SessionID:{0}, Exception:{1}",
33                      this.sessionID, e.Message);
34                  this.messageCenter.OnMessageFail(message);
35              }
36          }
37          this.logger.LogInformation("SessionID:{0}, SendingLoop Exit", this.sessionID);
38      });
39  }
Lines 19-27 are the key part: the inner loop writes up to 4 packets and then flushes, and the flush triggers DotNetty's DoWrite:
 1  protected override void DoWrite(ChannelOutboundBuffer input)
 2  {
 3      List<ArraySegment<byte>> sharedBufferList = null;
 4      try
 5      {
 6          while (true)
 7          {
 8              int size = input.Size;
 9              if (size == 0)
10              {
11                  // All written
12                  break;
13              }
14              long writtenBytes = 0;
15              bool done = false;
16
17              // Ensure the pending writes are made of ByteBufs only.
18              int maxBytesPerGatheringWrite = ((TcpSocketChannelConfig)this.config).GetMaxBytesPerGatheringWrite();
19              sharedBufferList = input.GetSharedBufferList(1024, maxBytesPerGatheringWrite);
20              int nioBufferCnt = sharedBufferList.Count;
21              long expectedWrittenBytes = input.NioBufferSize;
22              Socket socket = this.Socket;
23
24              List<ArraySegment<byte>> bufferList = sharedBufferList;
25              // Always use nioBuffers() to workaround data-corruption.
26              // See https://github.com/netty/netty/issues/2761
27              switch (nioBufferCnt)
28              {
29                  case 0:
30                      // We have something else beside ByteBuffers to write so fallback to normal writes.
31                      base.DoWrite(input);
32                      return;
33                  default:
34                      for (int i = this.Configuration.WriteSpinCount - 1; i >= 0; i--)
35                      {
36                          long localWrittenBytes = socket.Send(bufferList, SocketFlags.None, out SocketError errorCode);
37                          if (errorCode != SocketError.Success && errorCode != SocketError.WouldBlock)
38                          {
39                              throw new SocketException((int)errorCode);
40                          }
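This is the DoWrite method of DotNetty's TcpSocketChannel class. Line 19 obtains the list of buffer segments currently held in the ChannelOutboundBuffer, and line 36 passes them all to a single Socket.Send call; that is the heart of the Gathering Write. With this in place, there is no need to use CompositeByteBuffer at the business layer.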
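The gathering step itself can be tried outside DotNetty with the same `Socket.Send` overload that takes a list of `ArraySegment<byte>`. The sketch below assumes a blocking loopback connection, and `GatheringWriteDemo` and the three sample payloads are hypothetical. It shows one send call covering several separately encoded messages; how the OS splits the bytes into TCP segments is outside its control.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class GatheringWriteDemo
{
    public static string Run()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        using var client = new TcpClient();
        client.Connect(IPAddress.Loopback, ((IPEndPoint)listener.LocalEndpoint).Port);
        using Socket server = listener.AcceptSocket();
        listener.Stop();

        // Three separately encoded messages, as they might sit in an outbound buffer.
        var segments = new List<ArraySegment<byte>>
        {
            new ArraySegment<byte>(Encoding.ASCII.GetBytes("msg1;")),
            new ArraySegment<byte>(Encoding.ASCII.GetBytes("msg2;")),
            new ArraySegment<byte>(Encoding.ASCII.GetBytes("msg3;")),
        };

        // One call hands every segment to the OS together, without first
        // copying them into a single contiguous array.
        int sent = client.Client.Send(segments, SocketFlags.None, out SocketError error);
        if (error != SocketError.Success)
        {
            throw new SocketException((int)error);
        }

        // Read back exactly as many bytes as were sent.
        var received = new byte[sent];
        int offset = 0;
        while (offset < sent)
        {
            int n = server.Receive(received, offset, sent - offset, SocketFlags.None);
            if (n == 0)
            {
                break; // peer closed early
            }
            offset += n;
        }
        return Encoding.ASCII.GetString(received, 0, offset);
    }
}
```

`GatheringWriteDemo.Run()` returns `msg1;msg2;msg3;`: the receiver sees one contiguous byte stream even though the sender never concatenated the buffers, which is why a composite buffer at the business layer becomes unnecessary.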
In fact, the networking optimizations in Orleans 3.x follow a similar idea:
 1  private async Task ProcessOutgoing()
 2  {
 3      await Task.Yield();
 4
 5      Exception error = default;
 6      PipeWriter output = default;
 7      var serializer = this.serviceProvider.GetRequiredService<IMessageSerializer>();
 8      try
 9      {
10          output = this.Context.Transport.Output;
11          var reader = this.outgoingMessages.Reader;
12          if (this.Log.IsEnabled(LogLevel.Information))
13          {
14              this.Log.LogInformation(
15                  "Starting to process messages from local endpoint {Local} to remote endpoint {Remote}",
16                  this.LocalEndPoint,
17                  this.RemoteEndPoint);
18          }
19
20          while (true)
21          {
22              var more = await reader.WaitToReadAsync();
23              if (!more)
24              {
25                  break;
26              }
27
28              Message message = default;
29              try
30              {
31                  while (inflight.Count < inflight.Capacity && reader.TryRead(out message) && this.PrepareMessageForSend(message))
32                  {
33                      inflight.Add(message);
34                      var (headerLength, bodyLength) = serializer.Write(ref output, message);
35                      MessagingStatisticsGroup.OnMessageSend(this.MessageSentCounter, message, headerLength + bodyLength, headerLength, this.ConnectionDirection);
36                  }
37              }
38              catch (Exception exception) when (message != default)
39              {
40                  this.OnMessageSerializationFailure(message, exception);
41              }
42
43              var flushResult = await output.FlushAsync();
44              if (flushResult.IsCompleted || flushResult.IsCanceled)
45              {
46                  break;
47              }
48
49              inflight.Clear();
50          }
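The core is line 31, where messages are written, and line 43, where the flush happens. The difference is that Orleans uses pipelines IO (System.IO.Pipelines), whereas DotNetty uses the traditional model.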
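The write-many, flush-once pattern of the pipelines model can be reproduced with an in-memory `Pipe`. This is a minimal sketch, not Orleans code: `PipelineBatchDemo`, `Append`, and the sample payloads are hypothetical, and an in-memory pipe stands in for the connection's transport.

```csharp
using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Text;
using System.Threading.Tasks;

public static class PipelineBatchDemo
{
    // Copies one encoded message into the writer's buffer without flushing.
    private static void Append(PipeWriter output, string message)
    {
        byte[] payload = Encoding.ASCII.GetBytes(message);
        Span<byte> destination = output.GetSpan(payload.Length);
        payload.AsSpan().CopyTo(destination);
        output.Advance(payload.Length);
    }

    public static async Task<string> RunAsync()
    {
        var pipe = new Pipe();
        PipeWriter output = pipe.Writer;

        // Write several messages back to back; nothing is visible to the reader yet.
        foreach (string message in new[] { "msg1;", "msg2;", "msg3;" })
        {
            Append(output, message);
        }

        // A single flush publishes the whole batch.
        await output.FlushAsync();

        ReadResult result = await pipe.Reader.ReadAsync();
        string received = Encoding.ASCII.GetString(result.Buffer.ToArray());
        pipe.Reader.AdvanceTo(result.Buffer.End);
        return received;
    }
}
```

`PipelineBatchDemo.RunAsync()` yields `msg1;msg2;msg3;` from one read, since all three messages became available with the one flush.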
Doing this lets you sustain higher throughput within a limited pps budget.
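A rough back-of-the-envelope calculation makes the gain concrete. The sketch below uses the 600,000 pps figure from the start of the article; the 100-byte message size, the 1460-byte MSS, and the assumption that every flush leaves the host as its own run of TCP segments are illustrative simplifications, and `PpsBudget` is a hypothetical helper.

```csharp
using System;

public static class PpsBudget
{
    // Upper bound on messages per second when `batch` messages share one flush,
    // assuming each flush is sent as ceil(bytes / mss) packets.
    public static long MaxMessagesPerSecond(long ppsBudget, int messageBytes, int batch, int mss = 1460)
    {
        long bytesPerFlush = (long)messageBytes * batch;
        long packetsPerFlush = (bytesPerFlush + mss - 1) / mss; // integer ceiling
        long flushesPerSecond = ppsBudget / packetsPerFlush;
        return flushesPerSecond * batch;
    }
}
```

Under these assumptions, 100-byte messages sent one per flush are capped at 600,000 messages per second, while batching 4 per flush raises the bound to 2,400,000. Once a batch no longer fits in one segment the gain shrinks: four 1000-byte messages need 3 packets, giving a bound of 800,000.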
Personally, I find DotNetty a bit nicer to use.
References:
1. https://github.com/Azure/DotNetty/blob/dev/src/DotNetty.Transport/Channels/Sockets/TcpSocketChannel.cs#L271-L288
2. https://github.com/dotnet/orleans/blob/master/src/Orleans.Core/Networking/Connection.cs#L282-L294
3. https://docs.microsoft.com/zh-cn/windows/win32/winsock/scatter-gather-i-o-2
4. https://linux.die.net/man/2/writev
5. https://github.com/chenshuo/muduo/blob/d980315dc054b122612f423ee2e1316cb14bd3b5/muduo/net/Buffer.cc#L28-38
Source: www.cnblogs.com; author: egmkang. Copyright belongs to the original author; please contact the author before reposting.
Original link: https://www.cnblogs.com/egmkang/p/DotNetty-Gathering-Write.html