In this article I demonstrate a number of ideas and techniques:
- How to write a simple non-blocking NIO client/server
- The effects of coordinated omission
- How to measure latency in percentiles (as opposed to a simple average)
- How to time loopback latency on your machine
I've recently been developing a low-latency benchmark for a client-server application. Initially, I'm simulating the benchmark on a single machine over loopback TCP. The first metric I want to quantify is how much of the recorded latency to allow for simple loopback latency. That gives me a clearer picture of the latency added by the actual application.
To do this, I created a program (code at the end of the article) that ping-pongs a single byte from the client to the server and back again. This is repeated and the results are processed.
The program is written using non-blocking Java NIO so that the loopback latency is as optimised as possible.
More important than recording the average time is recording the latency percentiles. (See my previous article for a discussion of how to measure latency.) Critically, the code corrects for coordinated omission. (See Gil Tene's material to learn more about this.) In short, rather than timing each request from when it actually started, you time it from when it should have started.
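To make the coordinated-omission correction concrete, here is a minimal sketch (the class, method, and the 1 ms pacing interval are my own, not from the article's benchmark): each operation is paced to an intended start time, and latency is measured from that intended start, so a single stall also penalises the requests queued up behind it.

```java
// A sketch of timing from the *intended* start of each operation,
// which is what avoids coordinated omission. Names are illustrative.
public class IntendedTimeDemo {

    // Runs 100 paced operations at ~1,000 ops/sec and returns the worst
    // latency in ns, measured from each operation's intended start time.
    public static long measureWorst() {
        long interval = 1_000_000;            // 1 ms between intended starts
        long intended = System.nanoTime();
        long worst = 0;
        for (int i = 0; i < 100; i++) {
            intended += interval;             // when this operation should begin
            while (System.nanoTime() < intended)
                ;                             // busy-wait until the scheduled time
            if (i == 50) {                    // simulate a single 5 ms stall
                long stallUntil = System.nanoTime() + 5_000_000;
                while (System.nanoTime() < stallUntil)
                    ;
            }
            // Measuring from 'intended' rather than from "just before the
            // call" charges the stall to every delayed request.
            long latency = System.nanoTime() - intended;
            worst = Math.max(worst, latency);
        }
        return worst;
    }

    public static void main(String[] args) {
        System.out.println("worst latency ns: " + measureWorst());
    }
}
```

If you instead reset the clock just before each call, the 5 ms stall shows up only once and every other sample looks fast, which is exactly the distortion in the second set of results below.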
These are the results from my 2-year-old MBP.
Starting latency test rate: 80000
Average time 2513852
Loop back echo latency was 2514247.3/3887258.6 4,196,487/4,226,913 4,229,987/4230294 4,230,294 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 70000
Average time 2327041
Loop back echo latency was 2339701.6/3666542.5 3,957,860/3,986,626 3,989,404/3989763 3,989,763 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 50000
Average time 1883303
Loop back echo latency was 1881621.0/2960104.0 3,203,771/3,229,260 3,231,809/3232046 3,232,046 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 30000
Average time 1021576
Loop back echo latency was 1029566.5/1599881.0 1,726,326/1,739,626 1,741,098/1741233 1,741,233 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 20000
Average time 304
Loop back echo latency was 65.6/831.2 3,632/4,559 4,690/4698 4,698 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 10000
Average time 50
Loop back echo latency was 47.8/57.9 89/120 152/182 182 us for 50/90 99/99.9 99.99/99.999 worst %tile
Compare these results to those where I did not correct for coordinated omission:
Starting latency test rate: 80000
Average time 45
Loop back echo latency was 44.1/48.8 71/105 124/374 374 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 70000
Average time 45
Loop back echo latency was 44.1/48.9 76/106 145/358 358 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 50000
Average time 45
Loop back echo latency was 43.9/48.8 74/105 123/162 162 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 30000
Average time 45
Loop back echo latency was 44.0/48.8 73/104 129/147 147 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 20000
Average time 45
Loop back echo latency was 44.7/49.6 78/107 135/311 311 us for 50/90 99/99.9 99.99/99.999 worst %tile
Starting latency test rate: 10000
Average time 46
Loop back echo latency was 45.1/50.8 81/112 144/184 184 us for 50/90 99/99.9 99.99/99.999 worst %tile
As you can see, the effects of throughput are completely hidden! It looks as though, even running at 80,000 messages a second, your 99.99th percentile is 374us, when in reality it is far, far greater.
In fact, you only achieve your target latencies once throughput drops to nearer 10,000 per second. As you would intuitively expect, there is a trade-off between throughput and latency.
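The percentile figures in these results are read straight out of a sorted array of sample times, counting back from the top. A minimal sketch of that indexing scheme (the class and method names here are my own):

```java
import java.util.Arrays;

// Percentiles by the "count back from the top of a sorted array" rule:
// e.g. the 99.9th percentile is sorted[n - n/1000].
public class PercentileDemo {

    // denom selects the percentile: 100 -> 99%, 1000 -> 99.9%, and so on;
    // the index is clamped so tiny sample counts don't run off the end.
    public static long percentile(long[] sorted, int denom) {
        int n = sorted.length;
        return sorted[Math.min(n - 1, n - n / denom)];
    }

    public static void main(String[] args) {
        long[] times = new long[100_000];
        for (int i = 0; i < times.length; i++)
            times[i] = i;                  // synthetic "latencies" 0..99,999 ns
        Arrays.sort(times);
        System.out.println("50%ile   " + times[times.length / 2]);
        System.out.println("99%ile   " + percentile(times, 100));
        System.out.println("99.9%ile " + percentile(times, 1000));
    }
}
```

With 100,000 samples, the 99.999th percentile is only the 10th-worst sample, which is why the benchmark below records at least 100,000 iterations per rate.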
The code for the test is below:
package util;

import java.io.EOFException;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Arrays;

/**
 * Created by daniel on 02/07/2015.
 * Simple program to test loopback speeds and latencies.
 */
public class LoopBackPingPong {
    public static final int PORT = 8007;

    public void runServer(int port) throws IOException {
        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.bind(new InetSocketAddress(port));
        System.out.println("listening on " + ssc);
        final SocketChannel socket = ssc.accept();
        socket.socket().setTcpNoDelay(true);
        socket.configureBlocking(false);

        new Thread(() -> {
            long totalTime = 0;
            int count = 0;
            try {
                System.out.println("Connected " + socket);
                ByteBuffer bb = ByteBuffer.allocateDirect(1);
                int length;
                while ((length = socket.read(bb)) >= 0) {
                    if (length > 0) {
                        long time = System.nanoTime();
                        bb.flip();
                        bb.position(0);
                        count++;
                        // echo the byte straight back
                        if (socket.write(bb) < 0)
                            throw new EOFException();
                        bb.clear();
                        totalTime += System.nanoTime() - time;
                    }
                }
            } catch (IOException ignored) {
            } finally {
                System.out.println("Total server time " + (totalTime / count) / 1000);
                System.out.println("... disconnected " + socket);
                try {
                    socket.close();
                } catch (IOException ignored) {
                }
            }
        }).start();
    }

    public void testLatency(int targetThroughput, SocketChannel socket) throws IOException {
        System.out.println("Starting latency test rate: " + targetThroughput);
        int tests = Math.min(18 * targetThroughput, 100_000);
        long[] times = new long[tests];
        int count = 0;
        long now = System.nanoTime();
        long rate = (long) (1e9 / targetThroughput);

        ByteBuffer bb = ByteBuffer.allocateDirect(4);
        bb.putInt(0, 0x12345678);
        for (int i = -20000; i < tests; i++) {
            // To correct for coordinated omission, pace the requests and time
            // each one from when it *should* have started: uncomment the two
            // lines below and remove the reset of 'now' that follows them.
            //now += rate;
            //while (System.nanoTime() < now) ;
            now = System.nanoTime();

            bb.position(0);
            while (bb.remaining() > 0)
                if (socket.write(bb) < 0)
                    throw new EOFException();

            bb.position(0);
            while (bb.remaining() > 0)
                if (socket.read(bb) < 0)
                    throw new EOFException();

            if (bb.getInt(0) != 0x12345678)
                throw new AssertionError("read error");

            // the first 20,000 iterations are warm-up and are not recorded
            if (i >= 0)
                times[count++] = System.nanoTime() - now;
        }
        System.out.println("Average time " + (Arrays.stream(times).sum() / times.length) / 1000);
        Arrays.sort(times);
        System.out.printf("Loop back echo latency was %.1f/%.1f %,d/%,d %,d/%,d %,d us for 50/90 99/99.9 99.99/99.999 worst %%tile%n",
                times[times.length / 2] / 1e3,
                times[times.length * 9 / 10] / 1e3,
                times[times.length - times.length / 100] / 1000,
                times[times.length - times.length / 1000] / 1000,
                times[times.length - times.length / 10000] / 1000,
                times[times.length - times.length / 100000] / 1000,
                times[times.length - 1] / 1000);
    }

    public static void main(String... args) throws Exception {
        int port = args.length < 1 ? PORT : Integer.parseInt(args[0]);
        LoopBackPingPong lbpp = new LoopBackPingPong();
        new Thread(() -> {
            try {
                lbpp.runServer(port);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }).start();

        // give the server a chance to start
        Thread.sleep(1000);

        SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", port));
        socket.socket().setTcpNoDelay(true);
        socket.configureBlocking(false);
        for (int i : new int[]{80_000, 70_000, 50_000, 30_000, 20_000, 10_000})
            lbpp.testLatency(i, socket);
        System.exit(0);
    }
}