Test with each of the following four buffers in turn:
ByteBuf outBuf = UnpooledByteBufAllocator.DEFAULT.heapBuffer();
//ByteBuf outBuf = PooledByteBufAllocator.DEFAULT.heapBuffer();
//ByteBuf outBuf = PooledByteBufAllocator.DEFAULT.buffer();
//ByteBuf outBuf = UnpooledByteBufAllocator.DEFAULT.buffer();
outBuf=outBuf.order(ByteOrder.LITTLE_ENDIAN);
outBuf.writeInt(17);
outBuf.writeInt(112);
outBuf.writeShort(11);
outBuf.writeShort(5);
outBuf.writeBytes(bytes);
ctx.channel().writeAndFlush(outBuf);
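To make the resulting byte layout concrete, here is a minimal sketch that writes the same header with the JDK's java.nio.ByteBuffer. It is only a stand-in so the example runs without Netty on the classpath, not the code used in the test. The class name FrameLayout, the method encode, and the two-byte payload are hypothetical, and the source does not say what the four header fields mean, so they appear only as the values written.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FrameLayout {
    // Produces the same byte sequence as the writeInt/writeShort/writeBytes calls,
    // using java.nio.ByteBuffer in place of Netty's ByteBuf (an assumption made
    // so the sketch has no external dependency).
    static byte[] encode(byte[] payload) {
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + 2 + 2 + payload.length)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(17);
        buf.putInt(112);
        buf.putShort((short) 11);
        buf.putShort((short) 5);
        buf.put(payload);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] frame = encode(new byte[] {0x41, 0x42}); // hypothetical payload
        StringBuilder hex = new StringBuilder();
        for (byte b : frame) {
            hex.append(String.format("%02x ", b));
        }
        // prints: 11 00 00 00 70 00 00 00 0b 00 05 00 41 42
        System.out.println(hex.toString().trim());
    }
}
```

With little-endian order the low byte comes first, so the int 17 appears as 11 00 00 00 and the header always takes 12 bytes before the payload.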
UnpooledByteBufAllocator.DEFAULT.heapBuffer()
Throughput: 30144/s
PooledByteBufAllocator.DEFAULT.heapBuffer()
Throughput: 30327/s
PooledByteBufAllocator.DEFAULT.buffer()
Throughput: 32082/s
UnpooledByteBufAllocator.DEFAULT.buffer()
Throughput: 32292/s
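The PooledByteBufAllocator.DEFAULT.heapBuffer() case turns out to use the most heap memory, 72.7 MB. In principle a memory pool reuses memory and should need the least, but a closer look leads to the following conclusion: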
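When the program starts, it is configured with four threads:
final DefaultEventExecutorGroup executorGroup = new DefaultEventExecutorGroup(4, new DefaultThreadFactory("decode-worker-thread-pool"));
To keep threads from contending for the memory pool, Netty gives every thread a PoolThreadCache, a memory pool local to that thread: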
final class PoolThreadCache {
    final PoolArena<byte[]> heapArena;
    final PoolArena<ByteBuffer> directArena;
    // TODO: Test if adding padding helps under contention
    //private long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;
    PoolThreadCache(PoolArena<byte[]> heapArena, PoolArena<ByteBuffer> directArena) {
        this.heapArena = heapArena;
        this.directArena = directArena;
    }
}
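The idea of a per-thread pool can be illustrated with a much simpler sketch than Netty's real implementation. The class below is hypothetical and built only on ThreadLocal from the JDK. It shows that a buffer released by one thread is reused by that same thread and is never visible to another thread, which is why no locking is needed.

```java
import java.util.ArrayDeque;

public class ThreadLocalPoolSketch {
    static final int BUFFER_SIZE = 1024; // hypothetical fixed buffer size

    // One free list per thread, so acquire/release never touch shared state.
    private static final ThreadLocal<ArrayDeque<byte[]>> CACHE =
            ThreadLocal.withInitial(ArrayDeque::new);

    static byte[] acquire() {
        byte[] buf = CACHE.get().pollFirst();
        return buf != null ? buf : new byte[BUFFER_SIZE];
    }

    static void release(byte[] buf) {
        CACHE.get().addFirst(buf);
    }

    public static void main(String[] args) throws InterruptedException {
        byte[] first = acquire();
        release(first);
        System.out.println(acquire() == first); // prints true: same thread reuses its buffer
        release(first);

        final byte[][] seen = new byte[1][];
        Thread other = new Thread(() -> seen[0] = acquire());
        other.start();
        other.join();
        System.out.println(seen[0] == first); // prints false: the other thread has its own pool
    }
}
```

The cost of this design is the one observed in the measurement: memory held in one thread's pool cannot serve another thread, so each thread reserves memory of its own.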
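A PoolChunk inside a PoolArena is initialized at 16 MB, and there are four threads, so roughly 4 × 16 MB = 64 MB of heap is held by the pool, which is what the figure shows.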
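The arithmetic behind this can be checked with a short sketch. It assumes a page size of 8192 bytes and a maxOrder of 11, which produce the 16 MB chunk mentioned here; these defaults may differ between Netty versions. It also assumes each of the four threads ends up holding exactly one chunk. The class and method names are hypothetical.

```java
public class PooledHeapEstimate {
    // Assumed defaults: 8 KiB pages and maxOrder 11. Verify against your Netty version.
    static final int PAGE_SIZE = 8192;
    static final int MAX_ORDER = 11;

    // A chunk is sized as pageSize << maxOrder.
    static long chunkSizeBytes() {
        return (long) PAGE_SIZE << MAX_ORDER;
    }

    // Assumes every thread holds exactly one chunk of its own.
    static long reservedBytes(int threads) {
        return threads * chunkSizeBytes();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println(chunkSizeBytes() / mib);  // prints 16
        System.out.println(reservedBytes(4) / mib);  // prints 64
    }
}
```

Under these assumptions the pool alone accounts for 64 MB of the 72.7 MB observed, with the rest presumably being ordinary heap usage by the application.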