A batch of notes that may be a bit rough; I will keep refining them as I learn more.
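I. DMA (Direct Memory Access)
II. Zero-Copy
Zero-copy is defined from the CPU's point of view: it covers techniques that let the CPU perform as few data copies as possible during I/O.
At the operating-system level, consider this scenario: read a file from disk and send it over the network. Traditional I/O (read/write) goes through the following steps:
- The user program issues a read request; the CPU switches from user mode to kernel mode and starts a DMA transfer, and the DMA engine copies the data into a kernel buffer.
- When the DMA transfer finishes it raises an interrupt; the CPU copies the data from the kernel buffer into the application buffer and switches from kernel mode back to user mode.
- The user program then calls socket.send(); the CPU switches from user mode to kernel mode, copies the application buffer into the kernel socket buffer, starts a DMA transfer, then returns the result and switches back to user mode.
- The DMA engine copies the data from the kernel socket buffer to the NIC buffer, and the data is sent out on the socket.
This path costs 4 context switches and 4 copies in total: 2 CPU copies and 2 DMA copies.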
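The read-then-write pattern described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the destination is a second file rather than a socket (so it runs without a network peer), and the class and method names (`ReadWriteCopy`, `copy`, `roundTrip`) are made up for this note. The point is that every chunk passes through a buffer owned by the application.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ReadWriteCopy {
    // Copies src to dst through an application-level buffer:
    // read() fills the buffer from the kernel, write() hands it back to the kernel.
    static long copy(Path src, Path dst) {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.allocate(8192); // the "application buffer"
            long total = 0;
            while (in.read(buf) != -1) {
                buf.flip();                  // switch from filling to draining
                while (buf.hasRemaining()) {
                    total += out.write(buf);
                }
                buf.clear();                 // make room for the next read
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes data to a temp file, copies it, reads the copy back.
    static byte[] roundTrip(byte[] data) {
        try {
            Path src = Files.createTempFile("rw-src", ".bin");
            Path dst = Files.createTempFile("rw-dst", ".bin");
            src.toFile().deleteOnExit();
            dst.toFile().deleteOnExit();
            Files.write(src, data);
            copy(src, dst);
            return Files.readAllBytes(dst);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] copied = roundTrip(new byte[20_000]);
        System.out.println(copied.length + " bytes copied");
    }
}
```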
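Optimizations provided by Linux:
1. mmap
mmap maps the application buffer onto the kernel buffer, so the data does not have to be copied into the application buffer; user space holds a reference to the kernel buffer and can operate on it directly. Steps 2 and 3 of the example then collapse into a single CPU copy between the two kernel buffers.
This path costs 4 context switches and 3 copies: 1 CPU copy and 2 DMA copies.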
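2. sendfile
The example scenario does not modify the data; it sends the file out untouched. The file therefore does not need to be read into the program at all: the program simply issues a transfer request. As of Linux 2.1 the flow becomes:
- The user program issues a file transfer request; the CPU switches from user mode to kernel mode and starts a DMA transfer, and the DMA engine copies the data into a kernel buffer.
- When the DMA transfer finishes it raises an interrupt; the CPU copies the data into the socket buffer, starts another DMA transfer, and switches back to user mode to return the result.
- The DMA engine copies the data from the socket buffer to the NIC buffer.
This path costs 2 context switches and 3 copies: 1 CPU copy and 2 DMA copies.
A further optimization: the copy inside the kernel is unnecessary, and Linux 2.4 removed it. In step 2 the data is no longer copied; instead, a descriptor holding the location and length of the data is appended to the socket buffer. In step 3 the DMA engine uses that descriptor to copy the data straight from the kernel buffer to the NIC buffer. At this point the CPU performs no data copy at all, which is "zero-copy" in the strict sense. This approach cannot be used when the data has to be modified before it is sent.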
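3. Zero-copy in the broad sense
The zero-copy achieved by sendfile in item 2 above is zero-copy in the true, operating-system sense. In the broad sense, zero-copy means avoiding unnecessary copies of the data wherever possible; any reduction in copying counts.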
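III. Zero-Copy in Java
1. MappedByteBuffer
The mmap-based (memory-mapped) zero-copy mechanism that Java provides. DirectByteBuffer, the off-heap buffer in NIO, is its implementation class.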
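A minimal sketch of reading a file through a memory mapping, using only `FileChannel.map` from the JDK. The class and method names (`MmapRead`, `readViaMap`) and the temp-file setup are assumptions made for this example; the bytes are read out of the mapping rather than through an explicit read() call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MmapRead {
    // Writes data to a temp file, maps the file read-only, and reads it back via the mapping.
    static byte[] readViaMap(byte[] data) {
        try {
            Path file = Files.createTempFile("mmap-demo", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, data);
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                // map() returns a MappedByteBuffer covering [0, size) of the file
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
                byte[] out = new byte[map.remaining()];
                map.get(out); // bulk read straight from the mapped region
                return out;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] out = readViaMap("hello mmap".getBytes());
        System.out.println(new String(out));
    }
}
```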
2. FileChannel
transferFrom() and transferTo() are two abstract methods that are backed by sendfile where the platform supports it; map() is backed by mmap and returns a MappedByteBuffer.
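A small sketch of `transferTo`, again with a file as the target so that it runs standalone (with a socket channel as the target the call shape is the same). The names `TransferToCopy`, `transfer`, and `roundTrip` are hypothetical. Because `transferTo` may move fewer bytes than requested, the call is wrapped in a loop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferToCopy {
    // Moves the whole file from src to dst without an application-level buffer.
    static long transfer(Path src, Path dst) {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long pos = 0;
            while (pos < size) {
                // transferTo returns the number of bytes actually transferred
                pos += in.transferTo(pos, size - pos, out);
            }
            return pos;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: writes data to a temp file, transfers it, reads the result back.
    static byte[] roundTrip(byte[] data) {
        try {
            Path src = Files.createTempFile("tt-src", ".bin");
            Path dst = Files.createTempFile("tt-dst", ".bin");
            src.toFile().deleteOnExit();
            dst.toFile().deleteOnExit();
            Files.write(src, data);
            transfer(src, dst);
            return Files.readAllBytes(dst);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new byte[100_000]).length + " bytes transferred");
    }
}
```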
IV. ByteBuffer (Java)
public abstract class ByteBuffer extends Buffer implements Comparable<ByteBuffer>
First, look at the fields of the parent class Buffer.
1. Buffer
public abstract class Buffer {
// Invariants: mark <= position <= limit <= capacity
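// A spare field that saves the value of position: mark() stores position, and reset() restores it
private int mark = -1;
// Index of the next element to read or write
private int position = 0;
// Index of the first element that must not be read or written; e.g. with capacity 512 and limit 10, only the first 10 positions are usable
private int limit;
// Capacity: the maximum number of elements the buffer can hold; fixed after construction
private int capacity;
// Used only by direct buffers (including mmap-backed ones)
// NOTE: hoisted here for speed in JNI GetDirectBufferAddress
long address;
Buffer(int mark, int pos, int lim, int cap) { // package-private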
if (cap < 0)
throw new IllegalArgumentException("Negative capacity: " + cap);
this.capacity = cap;
limit(lim);
position(pos);
if (mark >= 0) {
if (mark > pos)
throw new IllegalArgumentException("mark > position: ("
+ mark + " > " + pos + ")");
this.mark = mark;
}
}
// Switch from write mode to read mode
public final Buffer flip() {
limit = position;
position = 0;
mark = -1;
return this;
}
}
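To make the mark/position/limit/capacity fields concrete, here is a small walk-through using the public `ByteBuffer` API. The class name `BufferStateDemo` and its helpers are made up for this note; each snapshot records {position, limit, capacity}.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class BufferStateDemo {
    // Snapshot of {position, limit, capacity}
    static int[] state(ByteBuffer b) {
        return new int[]{b.position(), b.limit(), b.capacity()};
    }

    static int[][] walk() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        int[] fresh = state(buf);          // {0, 8, 8}: empty, writable up to capacity
        buf.put((byte) 1).put((byte) 2).put((byte) 3);
        int[] afterPut = state(buf);       // {3, 8, 8}: position advanced by three writes
        buf.flip();
        int[] afterFlip = state(buf);      // {0, 3, 8}: limit = old position, position = 0
        buf.get();                         // position becomes 1
        buf.mark();                        // remember position 1
        buf.get();
        buf.get();                         // position becomes 3
        buf.reset();                       // position goes back to the mark
        int[] afterReset = state(buf);     // {1, 3, 8}
        return new int[][]{fresh, afterPut, afterFlip, afterReset};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.deepToString(walk()));
    }
}
```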
Buffer's subclasses are ByteBuffer, CharBuffer, DoubleBuffer, FloatBuffer, IntBuffer, LongBuffer, and ShortBuffer: every primitive type except boolean has a matching buffer implementation. These notes look only at ByteBuffer.
2. ByteBuffer
public abstract class ByteBuffer extends Buffer implements Comparable<ByteBuffer>
{
// Non-null only for heap buffers (hb is null for direct buffers)
final byte[] hb;
final int offset;
boolean isReadOnly; // Valid only for heap buffers
ByteBuffer(int mark, int pos, int lim, int cap) { // package-private
this(mark, pos, lim, cap, null, 0);
}
ByteBuffer(int mark, int pos, int lim, int cap, // package-private
byte[] hb, int offset)
{
super(mark, pos, lim, cap);
this.hb = hb;
this.offset = offset;
}
// Allocate an off-heap (direct) buffer
public static ByteBuffer allocateDirect(int capacity) {
return new DirectByteBuffer(capacity);
}
// Allocate an on-heap buffer
public static ByteBuffer allocate(int capacity) {
if (capacity < 0)
throw new IllegalArgumentException();
return new HeapByteBuffer(capacity, capacity);
}
// Wrap an array as a buffer
public static ByteBuffer wrap(byte[] array) {
return wrap(array, 0, array.length);
}
public static ByteBuffer wrap(byte[] array, int offset, int length)
{
try {
return new HeapByteBuffer(array, offset, length);
} catch (IllegalArgumentException x) {
throw new IndexOutOfBoundsException();
}
}
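// Creates a new buffer whose capacity is the remaining space of this buffer; it is direct if this buffer is direct, and read-only if this buffer is read-only
// Javadoc: Creates a new byte buffer whose content is a shared subsequence of this buffer's content.
// The new buffer is a shared subsequence of this one: for HeapByteBuffer the underlying byte[] is the same object, but position, limit, etc. differ
public abstract ByteBuffer slice();
// Javadoc: Creates a new byte buffer that shares this buffer's content.
// Creates a new buffer with independent position, limit, etc. that shares the same byte[] object
public abstract ByteBuffer duplicate();
// Reads the byte at the current position; the most basic get. The heap implementation reads from the backing array, the direct implementation reads the memory address through Unsafe
public abstract byte get();
// Reads the byte at the given index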
public abstract byte get(int index);
public ByteBuffer get(byte[] dst) {
return get(dst, 0, dst.length);
}
// Reads from the buffer into dst[offset, offset + length)
public ByteBuffer get(byte[] dst, int offset, int length) {
checkBounds(offset, length, dst.length);
if (length > remaining())
throw new BufferUnderflowException();
int end = offset + length;
for (int i = offset; i < end; i++)
dst[i] = get();
return this;
}
// Writes a byte at the current position
public abstract ByteBuffer put(byte b);
// Writes a byte at the given index
public abstract ByteBuffer put(int index, byte b);
public final ByteBuffer put(byte[] src) {
return put(src, 0, src.length);
}
// Writes src[offset, offset + length) into this buffer
public ByteBuffer put(byte[] src, int offset, int length) {
checkBounds(offset, length, src.length);
if (length > remaining())
throw new BufferOverflowException();
int end = offset + length;
for (int i = offset; i < end; i++)
this.put(src[i]);
return this;
}
}
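The bulk `get`/`put` methods above check `remaining()` before copying. A short sketch of that behavior with the public API follows; `BulkGetPutDemo` and its method names are invented for illustration.

```java
import java.nio.BufferOverflowException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.util.Arrays;

class BulkGetPutDemo {
    // Reads two bytes starting at the buffer's position into dst[1] and dst[2].
    static byte[] copyMiddle() {
        ByteBuffer buf = ByteBuffer.wrap(new byte[]{10, 20, 30, 40, 50});
        buf.position(1);
        byte[] dst = new byte[4];
        buf.get(dst, 1, 2);   // copies 20 and 30
        return dst;           // {0, 20, 30, 0}
    }

    // Asking for more bytes than remaining() triggers BufferUnderflowException.
    static boolean underflows() {
        ByteBuffer buf = ByteBuffer.allocate(2);
        try {
            buf.get(new byte[3]);
            return false;
        } catch (BufferUnderflowException e) {
            return true;
        }
    }

    // Writing more bytes than remaining() triggers BufferOverflowException.
    static boolean overflows() {
        ByteBuffer buf = ByteBuffer.allocate(2);
        try {
            buf.put(new byte[3]);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(copyMiddle()));
        System.out.println(underflows() + " " + overflows());
    }
}
```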
3. HeapByteBuffer
Data is stored in the byte[] array (hb) declared in ByteBuffer.
class HeapByteBuffer extends ByteBuffer
{
// Initializes the underlying byte[] array
HeapByteBuffer(int cap, int lim) { // package-private
super(-1, 0, lim, cap, new byte[cap], 0);
}
// Implementations of the two methods discussed above: the array object is unchanged, while position, limit, and mark are set afresh
public ByteBuffer slice() {
return new HeapByteBuffer(hb,
-1,
0,
this.remaining(),
this.remaining(),
this.position() + offset);
}
public ByteBuffer duplicate() {
return new HeapByteBuffer(hb,
this.markValue(),
this.position(),
this.limit(),
this.capacity(),
offset);
}
}
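The sharing behavior of `slice()` and `duplicate()` can be observed directly. The sketch below uses only the public `ByteBuffer` API; the class name `SliceDuplicateDemo` is made up for this note.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class SliceDuplicateDemo {
    // Returns {base[2], slice capacity, slice arrayOffset, duplicate capacity, base position}
    static int[] demo() {
        ByteBuffer base = ByteBuffer.allocate(8);
        base.position(2);
        base.limit(6);
        ByteBuffer slice = base.slice();     // covers base indices 2..5, own position/limit
        ByteBuffer dup = base.duplicate();   // same range as base, own position/limit
        slice.put(0, (byte) 42);             // lands at base index 2: the array is shared
        dup.position(5);                     // does not move base's position
        return new int[]{
                base.get(2),                 // 42, written through the slice
                slice.capacity(),            // 4 = base.remaining() at slice time
                slice.arrayOffset(),         // 2 = base.position() + base offset
                dup.capacity(),              // 8, same as base
                base.position()              // still 2
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```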
4. MappedByteBuffer
public abstract class MappedByteBuffer extends ByteBuffer
{
// File descriptor: roughly, a non-negative integer index into the table of open files that the kernel maintains for each process
private final FileDescriptor fd;
// This should only be invoked by the DirectByteBuffer constructors
//
MappedByteBuffer(int mark, int pos, int lim, int cap, // package-private
FileDescriptor fd)
{
super(mark, pos, lim, cap);
this.fd = fd;
}
}
5. DirectByteBuffer
public interface DirectBuffer {
long address();
Object attachment();
// Off-heap memory is released through this
Cleaner cleaner();
}
class DirectByteBuffer extends MappedByteBuffer implements DirectBuffer
{
DirectByteBuffer(int cap) { // package-private
super(-1, 0, cap, cap);
boolean pa = VM.isDirectMemoryPageAligned();
int ps = Bits.pageSize();
long size = Math.max(1L, (long)cap + (pa ? ps : 0));
Bits.reserveMemory(size, cap);
long base = 0;
try {
// Allocate native memory through Unsafe (like C's malloc)
base = unsafe.allocateMemory(size);
} catch (OutOfMemoryError x) {
Bits.unreserveMemory(size, cap);
throw x;
}
unsafe.setMemory(base, size, (byte) 0);
if (pa && (base % ps != 0)) {
// Round up to page boundary
address = base + ps - (base & (ps - 1));
} else {
address = base;
}
// DirectByteBuffer relies on a Cleaner to free its memory
cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
att = null;
}
// Reads the byte at the current position; unsafe is Unsafe.getUnsafe(), which reads through a native method
public byte get() {
return ((unsafe.getByte(ix(nextGetIndex()))));
}
public byte get(int i) {
return ((unsafe.getByte(ix(checkIndex(i)))));
}
}
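From the outside, the difference between heap and direct buffers shows up in `isDirect()` and `hasArray()`. A minimal check with the public API (the class name `DirectBufferDemo` is hypothetical):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

class DirectBufferDemo {
    // Returns {heap.isDirect, heap.hasArray, direct.isDirect, direct.hasArray}
    static boolean[] flags() {
        ByteBuffer heap = ByteBuffer.allocate(16);
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        return new boolean[]{heap.isDirect(), heap.hasArray(), direct.isDirect(), direct.hasArray()};
    }

    // Absolute put/get on a direct buffer; the bytes live outside the Java heap.
    static byte roundTrip(byte value) {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        direct.put(3, value);
        return direct.get(3);
    }

    // A fresh direct buffer is zero-filled (the constructor above calls setMemory with 0).
    static boolean freshIsZeroed() {
        ByteBuffer direct = ByteBuffer.allocateDirect(16);
        for (int i = 0; i < direct.capacity(); i++) {
            if (direct.get(i) != 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(flags()));
    }
}
```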
Cleaner.create(this, new Deallocator(base, size, cap));
1) new Deallocator(base, size, cap)
Deallocator is a Runnable that frees the memory through Unsafe.
private static class Deallocator
implements Runnable
{
private static Unsafe unsafe = Unsafe.getUnsafe();
private long address;
private long size;
private int capacity;
private Deallocator(long address, long size, int capacity) {
assert (address != 0);
this.address = address;
this.size = size;
this.capacity = capacity;
}
public void run() {
if (address == 0) {
// Paranoia
return;
}
unsafe.freeMemory(address);
address = 0;
Bits.unreserveMemory(size, capacity);
}
}
2) Cleaner.create
public class Cleaner extends PhantomReference<Object> {
//
private static final ReferenceQueue<Object> dummyQueue = new ReferenceQueue();
// Static list head; every Cleaner is added at the head of the list
private static Cleaner first = null;
// Doubly linked list
private Cleaner next = null;
private Cleaner prev = null;
// The cleanup task, i.e. the Deallocator object
private final Runnable thunk;
public static Cleaner create(Object var0, Runnable var1) {
// Create a new Cleaner and add it to the head of the list
return var1 == null ? null : add(new Cleaner(var0, var1));
}
private Cleaner(Object var1, Runnable var2) {
super(var1, dummyQueue);
this.thunk = var2;
}
// Add to the head of the Cleaner list
private static synchronized Cleaner add(Cleaner var0) {
if (first != null) {
var0.next = first;
first.prev = var0;
}
first = var0;
return var0;
}
// Called to perform the cleanup
public void clean() {
if (remove(this)) {
try {
this.thunk.run();
} catch (final Throwable var2) {
// ...
}
}
}
}
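The internal Cleaner shown above is not a public API. As a rough analogue, the sketch below uses the public `java.lang.ref.Cleaner` (available since Java 9), which is a different class with a similar idea: a cleanup `Runnable` is registered for an owner object and runs at most once. `CleanerDemo` and `FreeAction` are hypothetical names, and the counter stands in for the real call that frees native memory.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

class CleanerDemo {
    private static final Cleaner CLEANER = Cleaner.create();

    // Stand-in for Deallocator. It must not hold a reference to the owner object,
    // otherwise the owner could never become unreachable.
    static final class FreeAction implements Runnable {
        private final AtomicInteger freed;

        FreeAction(AtomicInteger freed) {
            this.freed = freed;
        }

        @Override
        public void run() {
            freed.incrementAndGet(); // a real action would free native memory here
        }
    }

    // Registers a cleanup action and triggers it explicitly, twice.
    static int cleanTwice() {
        AtomicInteger freed = new AtomicInteger();
        Object owner = new Object();
        Cleaner.Cleanable cleanable = CLEANER.register(owner, new FreeAction(freed));
        cleanable.clean();  // runs the action
        cleanable.clean();  // no effect: the action is invoked at most once
        return freed.get();
    }

    public static void main(String[] args) {
        System.out.println("action ran " + cleanTwice() + " time(s)");
    }
}
```

The explicit `clean()` call mirrors the `clean()` method in the excerpt, where `remove(this)` ensures the thunk runs only once; without an explicit call, the action runs some time after the owner becomes unreachable.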
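V. ByteBuf (Netty)
public abstract class ByteBuf implements ReferenceCounted, Comparable<ByteBuf> {
// Almost every method is abstract
// It defines the basic operations, which mean much the same as those in ByteBuffer
}
1. ReferenceCounted
// Reference-counting interface: the memory is released when the count reaches 0, similar to reference counting in GC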
public interface ReferenceCounted {
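// Current reference count
int refCnt();
// Increments the reference count by one
ReferenceCounted retain();
// Decrements the reference count by one; the memory is released when it reaches 0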
boolean release();
}
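The retain/release contract can be sketched without Netty. The class below is a hypothetical, JDK-only illustration of the idea (count starts at 1, `release()` returns true when the count reaches 0); it is not Netty's implementation, and the `freed` flag stands in for actually returning memory.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RefCountedResource {
    private final AtomicInteger refCnt = new AtomicInteger(1); // a new object starts with one reference
    private volatile boolean freed;

    int refCnt() {
        return refCnt.get();
    }

    // Increments the count; fails if the resource was already released.
    RefCountedResource retain() {
        for (;;) {
            int current = refCnt.get();
            if (current <= 0) {
                throw new IllegalStateException("already released");
            }
            if (refCnt.compareAndSet(current, current + 1)) {
                return this;
            }
        }
    }

    // Decrements the count; returns true only for the call that drops it to 0.
    boolean release() {
        for (;;) {
            int current = refCnt.get();
            if (current <= 0) {
                throw new IllegalStateException("already released");
            }
            if (refCnt.compareAndSet(current, current - 1)) {
                if (current == 1) {
                    freed = true; // placeholder for freeing or recycling the memory
                    return true;
                }
                return false;
            }
        }
    }

    boolean isFreed() {
        return freed;
    }

    public static void main(String[] args) {
        RefCountedResource r = new RefCountedResource();
        r.retain();
        System.out.println(r.release() + " " + r.release());
    }
}
```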
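2. ByteBufAllocator
The utility interface for instantiating ByteBuf; it defines several core methods.
// Only some of them are listed
public interface ByteBufAllocator {
// Default implementation: a static initializer picks a ByteBufAllocator based on the system property io.netty.allocator.type
// Either UnpooledByteBufAllocator or PooledByteBufAllocator, i.e. unpooled or pooled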
ByteBufAllocator DEFAULT = ByteBufUtil.DEFAULT_ALLOCATOR;
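// Returns a buffer of the given size; whether it is direct or heap is up to the implementation
ByteBuf buffer(int initialCapacity);
// Returns a buffer of the given size, preferably a direct buffer suitable for I/O
ByteBuf ioBuffer(int initialCapacity);
// Returns a heap buffer
ByteBuf heapBuffer(int initialCapacity);
// Returns a direct (off-heap) buffer
ByteBuf directBuffer(int initialCapacity);
// CompositeByteBuf is a combination of ByteBufs, backed by a ByteBuf[]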
CompositeByteBuf compositeBuffer(int maxNumComponents);
CompositeByteBuf compositeHeapBuffer(int maxNumComponents);
CompositeByteBuf compositeDirectBuffer(int maxNumComponents);
}
3. AbstractByteBufAllocator
The core methods newHeapBuffer and newDirectBuffer are both abstract, so the next step is to see how PooledByteBufAllocator and UnpooledByteBufAllocator implement them.
public abstract class AbstractByteBufAllocator implements ByteBufAllocator {
static final int DEFAULT_INITIAL_CAPACITY = 256;
static final int DEFAULT_MAX_CAPACITY = Integer.MAX_VALUE;
static final int DEFAULT_MAX_COMPONENTS = 16;
static final int CALCULATE_THRESHOLD = 1048576 * 4; // 4 MiB page
private final boolean directByDefault;
private final ByteBuf emptyBuf;
// A true argument means direct buffers are the preferred buffer type
// If direct is preferred && the platform supports Unsafe, buffer() returns a direct buffer
protected AbstractByteBufAllocator(boolean preferDirect) {
directByDefault = preferDirect && PlatformDependent.hasUnsafe();
emptyBuf = new EmptyByteBuf(this);
}
// The methods below also have overloads that supply default arguments
@Override
public ByteBuf buffer(int initialCapacity, int maxCapacity) {
if (directByDefault) {
return directBuffer(initialCapacity, maxCapacity);
}
return heapBuffer(initialCapacity, maxCapacity);
}
@Override
public ByteBuf ioBuffer(int initialCapacity, int maxCapacity) {
if (PlatformDependent.hasUnsafe()) {
return directBuffer(initialCapacity, maxCapacity);
}
return heapBuffer(initialCapacity, maxCapacity);
}
@Override
public ByteBuf heapBuffer(int initialCapacity, int maxCapacity) {
if (initialCapacity == 0 && maxCapacity == 0) {
return emptyBuf;
}
validate(initialCapacity, maxCapacity);
return newHeapBuffer(initialCapacity, maxCapacity);
}
@Override
public ByteBuf directBuffer(int initialCapacity, int maxCapacity) {
if (initialCapacity == 0 && maxCapacity == 0) {
return emptyBuf;
}
validate(initialCapacity, maxCapacity);
return newDirectBuffer(initialCapacity, maxCapacity);
}
}
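4. UnpooledByteBufAllocator
The implementation classes below fall into two families, heap and direct, and each family has an unsafe and a non-unsafe variant.
If the current platform supports Unsafe, the unsafe variant is returned, whose getByte(index) is implemented with unsafe.getByte(byte[], index).
The heap variants all extend UnpooledHeapByteBuf: the unsafe variant implements getByte() through Unsafe, while the non-unsafe variant returns byte[index] directly.
On the direct side, UnpooledDirectByteBuf and UnpooledUnsafeDirectByteBuf have no inheritance relationship in the version read here. Both are backed by ByteBuffer.allocateDirect(), and both getByte() implementations end up in Unsafe (native) calls, so I could not really tell them apart; I need more experience here. One visible difference seems to be that the unsafe variant keeps the buffer's native memory address and reads from that address directly rather than calling ByteBuffer.get(index).
Instrumented only means there is a field recording how much heap or off-heap memory the allocator has handed out so far; it is updated when an array is allocated and when it is freed.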
@Override
protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
return PlatformDependent.hasUnsafe() ?
new InstrumentedUnpooledUnsafeHeapByteBuf(this, initialCapacity, maxCapacity) :
new InstrumentedUnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
}
@Override
protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
final ByteBuf buf;
if (PlatformDependent.hasUnsafe()) {
buf = noCleaner ? new InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf(this, initialCapacity, maxCapacity) :
new InstrumentedUnpooledUnsafeDirectByteBuf(this, initialCapacity, maxCapacity);
} else {
buf = new InstrumentedUnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
}
return disableLeakDetector ? buf : toLeakAwareBuffer(buf);
}
4.1. UnpooledHeapByteBuf
public class UnpooledHeapByteBuf extends AbstractReferenceCountedByteBuf {
private final ByteBufAllocator alloc;
byte[] array;
private ByteBuffer tmpNioBuf;
public UnpooledHeapByteBuf(ByteBufAllocator alloc, int initialCapacity, int maxCapacity) {
super(maxCapacity);
this.alloc = alloc;
setArray(allocateArray(initialCapacity));
setIndex(0, 0);
}
// The backing array object
protected byte[] allocateArray(int initialCapacity) {
return new byte[initialCapacity];
}
@Override
protected byte _getByte(int index) {
return HeapByteBufUtil.getByte(array, index);
}
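// Low-level method that frees the buffer's memory
@Override
protected void deallocate() {
// freeArray is an empty implementation
freeArray(array);
// Reset the array reference to an empty array {}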
array = EmptyArrays.EMPTY_BYTES;
}
}
final class HeapByteBufUtil {
static byte getByte(byte[] memory, int index) {
return memory[index];
}
}
4.2. UnpooledDirectByteBuf
public class UnpooledDirectByteBuf extends AbstractReferenceCountedByteBuf {
private ByteBuffer buffer;
public UnpooledDirectByteBuf(ByteBufAllocator alloc, int initialCapacity, int maxCapacity) {
super(maxCapacity);
this.alloc = alloc;
setByteBuffer(allocateDirect(initialCapacity));
}
protected ByteBuffer allocateDirect(int initialCapacity) {
return ByteBuffer.allocateDirect(initialCapacity);
}
@Override
protected byte _getByte(int index) {
// buffer is a DirectByteBuffer, whose get is implemented with native Unsafe calls
return buffer.get(index);
}
}
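5. PooledByteBufAllocator
- A PooledByteBufAllocator holds two PoolArena arrays, one for heap and one for direct; each PoolArena is a pooled region of buffer memory.
- The allocator assigns each thread a set of PoolArena instances kept in a ThreadLocal; newHeapBuffer/newDirectBuffer fetch that PoolArena and call it to allocate the buffer (PooledDirectByteBuf/PooledHeapByteBuf).
- The structure of the PoolArena pool and its allocation strategy are another layer; they are out of scope here and will be covered in detail next time.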
public class PooledByteBufAllocator extends AbstractByteBufAllocator implements ByteBufAllocatorMetricProvider {
private final PoolThreadLocalCache threadCache = new PoolThreadLocalCache(useCacheForAllThreads);
@Override
protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
// threadCache: think of it as a ThreadLocal; each thread has its own pool space
PoolThreadCache cache = threadCache.get();
// Get the heap pool (HeapArena) of that pool space
PoolArena<byte[]> heapArena = cache.heapArena;
final ByteBuf buf;
if (heapArena != null) {
// The pool exists, so ask it for a new buf
buf = heapArena.allocate(cache, initialCapacity, maxCapacity);
} else {
// Otherwise fall back to an unpooled buf
buf = PlatformDependent.hasUnsafe() ?
new UnpooledUnsafeHeapByteBuf(this, initialCapacity, maxCapacity) :
new UnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
}
return toLeakAwareBuffer(buf);
}
@Override
protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
PoolThreadCache cache = threadCache.get();
// Get the off-heap pool, DirectArena
PoolArena<ByteBuffer> directArena = cache.directArena;
final ByteBuf buf;
if (directArena != null) {
buf = directArena.allocate(cache, initialCapacity, maxCapacity);
} else {
buf = PlatformDependent.hasUnsafe() ?
UnsafeByteBufUtil.newUnsafeDirectByteBuf(this, initialCapacity, maxCapacity) :
new UnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
}
return toLeakAwareBuffer(buf);
}
}
6. PooledHeapByteBuf and PooledDirectByteBuf
Allocation takes buffer space from the PoolArena pool; release returns it to the pool to be reused next time.
//io.netty.buffer.PoolArena#allocate(io.netty.buffer.PoolThreadCache, int, int)
PooledByteBuf<T> allocate(PoolThreadCache cache, int reqCapacity, int maxCapacity) {
// Get a reusable PooledByteBuf object from the current thread's stack
PooledByteBuf<T> buf = newByteBuf(maxCapacity);
// Allocate space for it according to the requested size
allocate(cache, buf, reqCapacity);
return buf;
}
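This cannot be studied thoroughly in one sitting and needs careful thought; the next chapter will look at the pool's allocation and recycling code.
For now, just a rough understanding, with a small test: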
public static void main(String[] args) {
// Direct implementations are used by default; setting this to true selects heap implementations
System.setProperty("io.netty.noPreferDirect", "true");
// The default is PooledByteBufAllocator
ByteBufAllocator byalloc = ByteBufAllocator.DEFAULT;
// Create a 16-byte buffer
ByteBuf buffer = byalloc.buffer(16);
// Release the buffer. The heap/direct PooledByteBuf implementations park this object in a thread-local so the next allocation can reuse it
buffer.release();
// Allocate another buffer, this time 32 bytes
ByteBuf buffer1 = byalloc.buffer(32);
// Prints true: both allocations return the same object
System.out.println(buffer == buffer1);
}
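The object reuse observed in the test above can be imitated with a tiny thread-local pool. This is a deliberately simplified, hypothetical sketch (`TinyPool`, `Buf`, and the per-thread `ArrayDeque` are invented here); Netty's recycler and arena are far more elaborate, and this sketch allocates a fresh array each time instead of carving memory out of a pooled chunk.

```java
import java.util.ArrayDeque;

class TinyPool {
    // The wrapper object that gets recycled; data is re-attached on every allocation.
    static final class Buf {
        byte[] data;
    }

    // One stack of recycled wrapper objects per thread.
    private static final ThreadLocal<ArrayDeque<Buf>> STACK =
            ThreadLocal.withInitial(ArrayDeque::new);

    static Buf allocate(int capacity) {
        Buf buf = STACK.get().pollFirst();  // reuse a recycled object if one is available
        if (buf == null) {
            buf = new Buf();
        }
        buf.data = new byte[capacity];      // simplification: no pooled memory behind it
        return buf;
    }

    static void release(Buf buf) {
        buf.data = null;                    // drop the memory, keep the wrapper
        STACK.get().addFirst(buf);
    }

    public static void main(String[] args) {
        Buf first = allocate(16);
        release(first);
        Buf second = allocate(32);
        System.out.println(first == second); // same wrapper object, new capacity
    }
}
```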
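VI. Zero-Copy in Netty
1. ByteBuf
For both heap and direct buffers, view operations such as slice() leave the underlying array or DirectByteBuffer untouched: the derived ByteBuf shares the memory region with the original, while capacity, reader/writer indices, and similar attributes are independent.
2. CompositeByteBuf
Combines multiple ByteBufs into one logical ByteBuf; reads and writes walk through the component ByteBufs in order.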
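The idea of one logical buffer over several components can be sketched with plain `java.nio.ByteBuffer`. `CompositeView` is a hypothetical class written for this note, not Netty's CompositeByteBuf: it keeps the components as shared slices and maps a logical index to (component, local index) without copying any bytes.

```java
import java.nio.ByteBuffer;

class CompositeView {
    private final ByteBuffer[] parts;
    private final int[] starts;   // logical start index of each component
    private final int length;

    CompositeView(ByteBuffer... buffers) {
        parts = new ByteBuffer[buffers.length];
        starts = new int[buffers.length];
        int total = 0;
        for (int i = 0; i < buffers.length; i++) {
            parts[i] = buffers[i].slice(); // shares content, independent position/limit
            starts[i] = total;
            total += parts[i].remaining();
        }
        length = total;
    }

    int length() {
        return length;
    }

    // Reads the byte at a logical index by locating the component that contains it.
    byte getByte(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int i = parts.length - 1;
        while (starts[i] > index) {
            i--;                           // walk back to the component covering index
        }
        return parts[i].get(index - starts[i]);
    }

    public static void main(String[] args) {
        byte[] header = {1, 2};
        byte[] body = {3, 4, 5};
        CompositeView view = new CompositeView(ByteBuffer.wrap(header), ByteBuffer.wrap(body));
        System.out.println(view.length() + " bytes, index 2 -> " + view.getByte(2));
    }
}
```

Because the components are slices over the original arrays, a later change to `header` or `body` is visible through the view, which is the "no copy" property the text describes.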
3. FileRegion
Transfers files through FileChannel.transferTo; I have not reached this part of the source yet.