If you are in a hurry, feel free to jump straight to Sections 4 and 5.
Evolving from Blocking IO to Netty
I. IO Models
0. Preliminaries
We will use input as the running example to explain the five IO models. An input operation normally has two distinct phases:
1) waiting for the data to become ready
2) copying the data from the kernel to the process
For an input operation on a socket, the first phase usually means waiting for data to arrive from the network. When the awaited packet arrives, it is copied into a buffer inside the kernel. The second phase is copying that data from the kernel buffer into the application process's buffer.
1. Blocking IO
With blocking IO, the process blocks during both phases: waiting for the data, and copying it from the kernel to user space.
The process calls recvfrom, and the system call does not return until the datagram has arrived and been copied into the application buffer, or an error occurs (most commonly, the call is interrupted by a signal). The process is blocked for the entire time from calling recvfrom until it returns.
2. Non-blocking IO
With non-blocking IO, the process does not block while waiting for data; instead it repeatedly polls the kernel, asking whether the operation is ready yet. This, however, burns a lot of CPU.
By setting a socket to non-blocking, the process tells the kernel: when a requested IO operation could only be completed by putting me to sleep, do not put me to sleep; return an error instead.
3. IO Multiplexing (select and poll)
With IO multiplexing, we call select or poll and block inside one of those two system calls, rather than blocking inside the actual IO system call.
We block in select, waiting for the data socket to become readable. When select reports that the socket is readable, we call recvfrom to copy the datagram into the application buffer.
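Java's NIO Selector exposes exactly this readiness model. Below is a minimal sketch of ours (not from the article): it uses an in-process Pipe instead of a network socket so it runs anywhere, blocks in select() until the channel is readable, and only then performs the actual read, mirroring the select/recvfrom split.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectorSketch {
    // Block in select() until a channel is readable, then do the actual
    // read -- the two steps the select/recvfrom pair performs.
    public static int readWhenReady() throws IOException {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);   // selectable channels must be non-blocking
        Selector selector = Selector.open();
        pipe.source().register(selector, SelectionKey.OP_READ);

        pipe.sink().write(ByteBuffer.wrap("hi".getBytes())); // make the source readable

        selector.select();                        // blocks until at least one channel is ready
        ByteBuffer buf = ByteBuffer.allocate(16);
        int n = pipe.source().read(buf);          // the "recvfrom" step: copy the data out

        selector.close();
        pipe.source().close();
        pipe.sink().close();
        return n;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes read after select(): " + readWhenReady());
    }
}
```

One select call can of course watch many channels at once, which is the whole point of multiplexing.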
4. Signal-driven IO
With signal-driven IO, we have the kernel send us a SIGIO signal when the descriptor becomes ready.
We first enable the socket's signal-driven IO facility and install a signal handler via the sigaction system call. That call returns immediately and our process keeps running, i.e. it is not blocked. When the datagram is ready to be read, the kernel generates a SIGIO for the process. We can then either call recvfrom from the signal handler to read the data and notify the main loop that it is ready to be processed, or simply notify the main loop and let it read the datagram itself.
However we handle SIGIO, the advantage of this model is that the process is not blocked while waiting for the datagram to arrive. The main loop keeps running, merely waiting for a notification from the signal handler: either that the data is ready to be processed, or that the datagram is ready to be read.
5. Asynchronous IO (the POSIX aio functions)
With asynchronous IO, we tell the kernel to start an operation and to notify us when the entire operation, including copying the data from the kernel into our own buffer, has completed.
The key difference from signal-driven IO: signal-driven IO has the kernel tell us when we may start an IO operation, whereas asynchronous IO has the kernel tell us when an IO operation has finished.
We call aio_read (the POSIX asynchronous IO functions are prefixed aio_ or lio_), passing the kernel a descriptor, a buffer pointer, a buffer size, and a file offset, and telling it how to notify us when the whole operation completes. The call returns immediately, and our process is not blocked while waiting for the IO to complete.
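Java has no direct POSIX aio binding, but AsynchronousFileChannel follows the same contract: the read call returns immediately, and we are only notified (here via a Future) once the whole operation, data copy included, is done. A small sketch of ours using a temp file:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class AsyncReadSketch {
    // Hand over a buffer and let the read complete in the background;
    // we only block when we actually ask for the result via Future.get().
    public static String readAsync() throws IOException, InterruptedException, ExecutionException {
        Path path = Files.createTempFile("aio", ".txt");
        Files.write(path, "hello".getBytes());
        try (AsynchronousFileChannel ch =
                     AsynchronousFileChannel.open(path, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(16);
            Future<Integer> pending = ch.read(buf, 0); // returns immediately
            int n = pending.get();                     // wait for completion
            return new String(buf.array(), 0, n);
        } finally {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readAsync());
    }
}
```

A CompletionHandler callback could be used instead of the Future, which is even closer in spirit to the kernel "notifying" us.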
6. Comparing the Five IO Models
The first four models differ mainly in the first phase, because their second phase is identical: while the data is being copied from the kernel to the caller's buffer, the process is blocked in recvfrom. The asynchronous IO model blocks in neither phase.
Synchronous vs. asynchronous and blocking vs. non-blocking describe only the IO itself, not what happens after the read or write completes:
- Synchronous: the application itself moves the data from the kernel buffer into the application buffer.
- Asynchronous: the kernel moves the data from the kernel buffer into the application buffer.
- Blocking vs. non-blocking: when the kernel buffer has no data ready yet, does the process block there, or does the call return immediately?
- Synchronous blocking: the program reads the kernel buffer into the application buffer itself, and the system call waits until it has a usable result.
- Synchronous non-blocking: the program reads the kernel buffer into the application buffer itself, and the system call answers in an instant whether anything is readable (the program must decide on its own when to try reading again).
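The synchronous non-blocking case can be seen directly in Java: on a channel configured as non-blocking, read() returns 0 immediately when nothing is ready, instead of putting the caller to sleep. A small sketch of ours, using an in-process Pipe so it needs no network:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class NonBlockingSketch {
    // On a non-blocking channel, read() returns 0 ("nothing yet") instead of
    // sleeping; the caller decides when to poll again.
    public static int[] pollThenRead() throws IOException {
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);

        ByteBuffer buf = ByteBuffer.allocate(16);
        int first = pipe.source().read(buf);   // no data yet -> returns 0 immediately

        pipe.sink().write(ByteBuffer.wrap("ok".getBytes()));
        int second = pipe.source().read(buf);  // data has arrived -> returns 2

        pipe.sink().close();
        pipe.source().close();
        return new int[]{first, second};
    }

    public static void main(String[] args) throws IOException {
        int[] r = pollThenRead();
        System.out.println("first read: " + r[0] + ", second read: " + r[1]);
    }
}
```

The "program must decide when to read again" cost is exactly what select/poll and the Selector remove: they let the kernel tell us which channels are worth reading.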
7. References
"UNIX Network Programming, Volume 1: The Sockets Networking API" (3rd edition), Chapter 6 "I/O Multiplexing: The select and poll Functions", Section 6.2 "I/O Models"
II. Comparing IO Performance
1. Why buffered writes outperform basic writes
static byte[] data = "123456789\n".getBytes();
static String path = "/root/oy/testfileio/out.txt";

// the most basic file write
public static void testBasicFileIO() throws Exception {
    File file = new File(path);
    FileOutputStream out = new FileOutputStream(file);
    while (true) {
        out.write(data);
    }
}

// buffered file IO: the JVM accumulates ~8 KB, then issues one write syscall with an 8 KB byte[]
public static void testBufferedFileIO() throws Exception {
    File file = new File(path);
    BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(file));
    while (true) {
        out.write(data);
    }
}
Why is the buffered write faster than the basic write? Because the buffered write fills an 8 KB byte array inside the JVM before making a single write system call, whereas the basic write issues a write system call every time. The buffered write reduces the number of system calls, which is why it performs better.
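The batching can be observed without strace by counting calls into the underlying stream. CountingOutputStream below is our own stand-in for the OS boundary (each call to it represents one would-be write syscall), and 8192 bytes is BufferedOutputStream's default buffer size:

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class WriteCounter {
    // Counts how many times the underlying stream's write() is invoked,
    // standing in for the number of write syscalls the OS would see.
    static class CountingOutputStream extends OutputStream {
        int calls = 0;
        @Override public void write(int b) { calls++; }
        @Override public void write(byte[] b, int off, int len) { calls++; }
    }

    public static int countedWrites(boolean buffered) throws IOException {
        byte[] data = "123456789\n".getBytes();
        CountingOutputStream sink = new CountingOutputStream();
        OutputStream out = buffered ? new BufferedOutputStream(sink) : sink;
        for (int i = 0; i < 1000; i++) {
            out.write(data);          // 1000 writes of 10 bytes each
        }
        out.flush();
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("unbuffered: " + countedWrites(false) + " underlying writes");
        System.out.println("buffered:   " + countedWrites(true) + " underlying writes");
    }
}
```

The 1000 ten-byte writes collapse to just 2 calls on the underlying stream: one when the next record no longer fits the 8192-byte buffer (after 8190 bytes), and one for the remainder on the final flush.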
2. Measuring buffered write vs. basic write
2.1 Setup
In /root/oy/testfileio/ there are two files, mysh.sh and OSFileIO.java, with the following contents:
#!/bin/sh
rm -rf *out*
/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/bin/javac OSFileIO.java
strace -ff -o out /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/bin/java OSFileIO $1
Line 1 declares this as an sh script.
Line 2 deletes every file in the current directory whose name contains "out".
Line 3 compiles OSFileIO into a class file.
Line 4 runs the OSFileIO program under strace, recording every system call it makes into out.* files (one per thread, suffixed with the thread ID).
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class OSFileIO {

    static byte[] data = "123456789\n".getBytes();
    static String path = "/root/oy/testfileio/out.txt";

    public static void main(String[] args) throws Exception {
        switch (args[0]) {
            case "0":
                testBasicFileIO();
                break;
            case "1":
                testBufferedFileIO();
                break;
            default:
        }
    }

    // the most basic file write
    public static void testBasicFileIO() throws Exception {
        File file = new File(path);
        FileOutputStream out = new FileOutputStream(file);
        while (true) {
            out.write(data);
        }
    }

    // buffered file IO: the JVM accumulates ~8 KB, then issues one write syscall with an 8 KB byte[]
    public static void testBufferedFileIO() throws Exception {
        File file = new File(path);
        BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(file));
        while (true) {
            out.write(data);
        }
    }
}
2.2 Observing the basic write
Run the basic write:
[root@localhost testfileio]# ./mysh.sh 0
After a while, stop it with ctrl+c, then list the current directory:
[root@localhost testfileio]# ll -h
-rwxr-xr-x 1 root root 208 10月 13 20:41 mysh.sh
-rw-r--r-- 1 root root 2.9K 10月 14 08:33 OSFileIO.class
-rw-r--r-- 1 root root 3.4K 10月 13 21:26 OSFileIO.java
-rw-r--r-- 1 root root 14K 10月 14 08:33 out.89746
-rw-r--r-- 1 root root 29M 10月 14 08:33 out.89747
-rw-r--r-- 1 root root 902 10月 14 08:33 out.89748
-rw-r--r-- 1 root root 902 10月 14 08:33 out.89749
-rw-r--r-- 1 root root 4.8K 10月 14 08:33 out.89750
-rw-r--r-- 1 root root 1.1K 10月 14 08:33 out.89751
-rw-r--r-- 1 root root 1.1K 10月 14 08:33 out.89752
-rw-r--r-- 1 root root 2.4K 10月 14 08:33 out.89753
-rw-r--r-- 1 root root 5.5K 10月 14 08:33 out.89754
-rw-r--r-- 1 root root 3.7K 10月 14 08:33 out.89755
-rw-r--r-- 1 root root 862 10月 14 08:33 out.89756
-rw-r--r-- 1 root root 58K 10月 14 08:33 out.89757
-rw-r--r-- 1 root root 2.0K 10月 14 08:33 out.89773
-rw-r--r-- 1 root root 6.3M 10月 14 08:33 out.txt
The largest file is out.89747. Open it in vim and search for 123456789: every write is its own write system call, writing the 10 bytes of 123456789\n each time:
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
write(4, "123456789\n", 10) = 10
2.3 Observing the buffered write
Run the buffered write:
[root@localhost testfileio]# ./mysh.sh 1
After a while, stop it with ctrl+c, then list the current directory:
[root@localhost testfileio]# ll -h
-rwxr-xr-x 1 root root 208 10月 13 20:41 mysh.sh
-rw-r--r-- 1 root root 2.9K 10月 14 08:39 OSFileIO.class
-rw-r--r-- 1 root root 3.4K 10月 13 21:26 OSFileIO.java
-rw-r--r-- 1 root root 14K 10月 14 08:39 out.90096
-rw-r--r-- 1 root root 1.5M 10月 14 08:39 out.90097
-rw-r--r-- 1 root root 902 10月 14 08:39 out.90098
-rw-r--r-- 1 root root 902 10月 14 08:39 out.90099
-rw-r--r-- 1 root root 2.4K 10月 14 08:39 out.90100
-rw-r--r-- 1 root root 1.1K 10月 14 08:39 out.90101
-rw-r--r-- 1 root root 1.1K 10月 14 08:39 out.90102
-rw-r--r-- 1 root root 2.2K 10月 14 08:39 out.90103
-rw-r--r-- 1 root root 4.8K 10月 14 08:39 out.90104
-rw-r--r-- 1 root root 4.3K 10月 14 08:39 out.90105
-rw-r--r-- 1 root root 961 10月 14 08:39 out.90106
-rw-r--r-- 1 root root 7.9K 10月 14 08:39 out.90107
-rw-r--r-- 1 root root 1.9K 10月 14 08:39 out.90110
-rw-r--r-- 1 root root 169M 10月 14 08:39 out.txt
The largest file is out.90097. Open it in vim and search for 123456789: the stream accumulates 8190 bytes (819 complete 10-byte records; BufferedOutputStream's internal buffer is 8192 bytes) before making a single write system call. The buffered write reduces system calls, which is why it performs better:
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
futex(0x7f93d811bb54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f93d811bb50, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
futex(0x7f93d811bb54, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x7f93d811bb50, FUTEX_OP_SET<<28|0<<12|FUTEX_OP_CMP_GT<<24|0x1) = 1
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
3. MappedByteBuffer put is not a system call
As shown above, the buffered write reduces system calls but still needs them to get the program's data into the kernel page cache. MappedByteBuffer.put, by contrast, is not a system call, yet the data still reaches the kernel page cache.
Let's verify this:
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class OSFileIO {

    static byte[] data = "123456789\n".getBytes();
    static String path = "/root/oy/testfileio/out.txt";

    public static void main(String[] args) throws Exception {
        switch (args[0]) {
            case "0":
                testBasicFileIO();
                break;
            case "1":
                testBufferedFileIO();
                break;
            case "2":
                testRandomAccessFileWrite();
                break;
            default:
        }
    }

    // the most basic file write
    public static void testBasicFileIO() throws Exception {
        File file = new File(path);
        FileOutputStream out = new FileOutputStream(file);
        while (true) {
            out.write(data);
        }
    }

    // buffered file IO: the JVM accumulates ~8 KB, then issues one write syscall with an 8 KB byte[]
    public static void testBufferedFileIO() throws Exception {
        File file = new File(path);
        BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(file));
        while (true) {
            out.write(data);
        }
    }

    public static void testRandomAccessFileWrite() throws Exception {
        // a numeric file descriptor appears, pointing at the file at path
        RandomAccessFile raf = new RandomAccessFile(path, "rw");
        System.out.println("RandomAccessFile created------------");
        System.in.read(); // pause here so we can inspect the file descriptors

        raf.write("hello world\n".getBytes());
        System.out.println("write------------");
        System.in.read(); // pause here so we can inspect the file descriptors

        FileChannel rafchannel = raf.getChannel();
        // a "mem" entry appears in lsof, mapping the file at path; the file size becomes 4096
        MappedByteBuffer map = rafchannel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
        System.out.println("MappedByteBuffer created------------");
        System.in.read(); // pause here so we can inspect the file descriptors

        map.put("@@@".getBytes()); // not a system call, yet the data reaches the kernel page cache
        System.out.println("map--put--------");
        System.in.read(); // pause here so we can inspect the file descriptors

        //map.force(); // flush the page cache to disk
    }
}
Back in the familiar directory (cd /root/oy/testfileio), run the testRandomAccessFileWrite method:
[root@localhost testfileio]# ./mysh.sh 2
RandomAccessFile created------------
Because of System.in.read(), the program stops here waiting for input. Open another window, cd to the same directory, and check the newly created file's size and the process's file descriptors:
[root@localhost ~]# cd /root/oy/testfileio/
[root@localhost testfileio]# ll -h
总用量 320K
-rwxr-xr-x 1 root root 208 10月 13 20:41 mysh.sh
-rw-r--r-- 1 root root 2.4K 10月 15 09:07 OSFileIO.class
-rw-r--r-- 1 root root 3.3K 10月 15 08:54 OSFileIO.java
-rw-r--r-- 1 root root 14K 10月 15 09:07 out.7206
-rw-r--r-- 1 root root 169K 10月 15 09:07 out.7207
-rw-r--r-- 1 root root 871 10月 15 09:07 out.7208
-rw-r--r-- 1 root root 871 10月 15 09:07 out.7209
-rw-r--r-- 1 root root 4.1K 10月 15 09:07 out.7210
-rw-r--r-- 1 root root 1.1K 10月 15 09:07 out.7211
-rw-r--r-- 1 root root 1.2K 10月 15 09:07 out.7212
-rw-r--r-- 1 root root 985 10月 15 09:07 out.7213
-rw-r--r-- 1 root root 3.8K 10月 15 09:07 out.7214
-rw-r--r-- 1 root root 2.7K 10月 15 09:07 out.7215
-rw-r--r-- 1 root root 831 10月 15 09:07 out.7216
-rw-r--r-- 1 root root 59K 10月 15 09:07 out.7217
-rw-r--r-- 1 root root 0 10月 15 09:07 out.txt
[root@localhost testfileio]# jps
7219 Jps
7206 OSFileIO
[root@localhost testfileio]# lsof -op 7206
COMMAND PID USER FD TYPE DEVICE OFFSET NODE NAME
java 7206 root cwd DIR 253,0 100715659 /root/oy/testfileio
java 7206 root rtd DIR 253,0 64 /
java 7206 root txt REG 253,0 763141 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/bin/java
java 7206 root mem REG 253,0 33589367 /usr/lib/locale/locale-archive
java 7206 root mem REG 253,0 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7206 root mem REG 253,0 739094 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libzip.so
java 7206 root mem REG 253,0 33814081 /usr/lib64/libnss_files-2.17.so
java 7206 root mem REG 253,0 739076 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libjava.so
java 7206 root mem REG 253,0 739093 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libverify.so
java 7206 root mem REG 253,0 33594518 /usr/lib64/librt-2.17.so
java 7206 root mem REG 253,0 34209721 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
java 7206 root mem REG 253,0 33594503 /usr/lib64/libm-2.17.so
java 7206 root mem REG 253,0 33595574 /usr/lib64/libstdc++.so.6.0.19
java 7206 root mem REG 253,0 100721148 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/server/libjvm.so
java 7206 root mem REG 253,0 33594493 /usr/lib64/libc-2.17.so
java 7206 root mem REG 253,0 33594500 /usr/lib64/libdl-2.17.so
java 7206 root mem REG 253,0 34621259 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/lib/amd64/jli/libjli.so
java 7206 root mem REG 253,0 33595568 /usr/lib64/libz.so.1.2.7
java 7206 root mem REG 253,0 33814089 /usr/lib64/libpthread-2.17.so
java 7206 root mem REG 253,0 33589366 /usr/lib64/ld-2.17.so
java 7206 root mem REG 253,0 100715660 /tmp/hsperfdata_root/7206
java 7206 root 0u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 1u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 2u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 3r REG 253,0 0t65091162 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7206 root 4u REG 253,0 0t0 101657177 /root/oy/testfileio/out.txt
The freshly created out.txt has size 0, and one file descriptor (4u; u means read-write) points at /root/oy/testfileio/out.txt.
Back in the window running the code, press Enter to let the program continue:
[root@localhost testfileio]# ./mysh.sh 2
RandomAccessFile created------------
write------------
At this point raf.write("hello world\n".getBytes()) has executed, i.e. 12 bytes have been written to the file. Switch to the other window and check:
[root@localhost testfileio]# ll -h
总用量 4.6M
-rwxr-xr-x 1 root root 208 10月 13 20:41 mysh.sh
-rw-r--r-- 1 root root 2.4K 10月 15 09:07 OSFileIO.class
-rw-r--r-- 1 root root 3.3K 10月 15 08:54 OSFileIO.java
-rw-r--r-- 1 root root 14K 10月 15 09:07 out.7206
-rw-r--r-- 1 root root 169K 10月 15 09:20 out.7207
-rw-r--r-- 1 root root 871 10月 15 09:07 out.7208
-rw-r--r-- 1 root root 871 10月 15 09:07 out.7209
-rw-r--r-- 1 root root 149K 10月 15 09:21 out.7210
-rw-r--r-- 1 root root 1.1K 10月 15 09:07 out.7211
-rw-r--r-- 1 root root 1.2K 10月 15 09:07 out.7212
-rw-r--r-- 1 root root 985 10月 15 09:07 out.7213
-rw-r--r-- 1 root root 33K 10月 15 09:21 out.7214
-rw-r--r-- 1 root root 32K 10月 15 09:21 out.7215
-rw-r--r-- 1 root root 831 10月 15 09:07 out.7216
-rw-r--r-- 1 root root 2.9M 10月 15 09:21 out.7217
-rw-r--r-- 1 root root 12 10月 15 09:20 out.txt
[root@localhost testfileio]# lsof -op 7206
COMMAND PID USER FD TYPE DEVICE OFFSET NODE NAME
java 7206 root cwd DIR 253,0 100715659 /root/oy/testfileio
java 7206 root rtd DIR 253,0 64 /
java 7206 root txt REG 253,0 763141 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/bin/java
java 7206 root mem REG 253,0 33589367 /usr/lib/locale/locale-archive
java 7206 root mem REG 253,0 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7206 root mem REG 253,0 739094 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libzip.so
java 7206 root mem REG 253,0 33814081 /usr/lib64/libnss_files-2.17.so
java 7206 root mem REG 253,0 739076 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libjava.so
java 7206 root mem REG 253,0 739093 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libverify.so
java 7206 root mem REG 253,0 33594518 /usr/lib64/librt-2.17.so
java 7206 root mem REG 253,0 34209721 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
java 7206 root mem REG 253,0 33594503 /usr/lib64/libm-2.17.so
java 7206 root mem REG 253,0 33595574 /usr/lib64/libstdc++.so.6.0.19
java 7206 root mem REG 253,0 100721148 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/server/libjvm.so
java 7206 root mem REG 253,0 33594493 /usr/lib64/libc-2.17.so
java 7206 root mem REG 253,0 33594500 /usr/lib64/libdl-2.17.so
java 7206 root mem REG 253,0 34621259 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/lib/amd64/jli/libjli.so
java 7206 root mem REG 253,0 33595568 /usr/lib64/libz.so.1.2.7
java 7206 root mem REG 253,0 33814089 /usr/lib64/libpthread-2.17.so
java 7206 root mem REG 253,0 33589366 /usr/lib64/ld-2.17.so
java 7206 root mem REG 253,0 100715660 /tmp/hsperfdata_root/7206
java 7206 root 0u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 1u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 2u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 3r REG 253,0 0t65091162 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7206 root 4u REG 253,0 0t12 101657177 /root/oy/testfileio/out.txt
out.txt is now 12 bytes, and file descriptor 4u's offset is 0t12.
Back in the window running the program, press Enter to let it continue:
[root@localhost testfileio]# ./mysh.sh 2
RandomAccessFile created------------
write------------
MappedByteBuffer created------------
At this point MappedByteBuffer map = rafchannel.map(FileChannel.MapMode.READ_WRITE, 0, 4096) has executed: a "mem" file descriptor appears, mapping /root/oy/testfileio/out.txt, and the file size becomes 4096 (4K):
[root@localhost testfileio]# ll -h
总用量 4.8M
-rwxr-xr-x 1 root root 208 10月 13 20:41 mysh.sh
-rw-r--r-- 1 root root 2.4K 10月 15 09:07 OSFileIO.class
-rw-r--r-- 1 root root 3.3K 10月 15 08:54 OSFileIO.java
-rw-r--r-- 1 root root 14K 10月 15 09:07 out.7206
-rw-r--r-- 1 root root 191K 10月 15 09:23 out.7207
-rw-r--r-- 1 root root 871 10月 15 09:07 out.7208
-rw-r--r-- 1 root root 871 10月 15 09:07 out.7209
-rw-r--r-- 1 root root 171K 10月 15 09:23 out.7210
-rw-r--r-- 1 root root 1.1K 10月 15 09:07 out.7211
-rw-r--r-- 1 root root 1.2K 10月 15 09:07 out.7212
-rw-r--r-- 1 root root 985 10月 15 09:07 out.7213
-rw-r--r-- 1 root root 38K 10月 15 09:23 out.7214
-rw-r--r-- 1 root root 37K 10月 15 09:23 out.7215
-rw-r--r-- 1 root root 831 10月 15 09:07 out.7216
-rw-r--r-- 1 root root 3.3M 10月 15 09:23 out.7217
-rw-r--r-- 1 root root 4.0K 10月 15 09:23 out.txt
[root@localhost testfileio]# lsof -op 7206
COMMAND PID USER FD TYPE DEVICE OFFSET NODE NAME
java 7206 root cwd DIR 253,0 100715659 /root/oy/testfileio
java 7206 root rtd DIR 253,0 64 /
java 7206 root txt REG 253,0 763141 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/bin/java
java 7206 root mem REG 253,0 33589367 /usr/lib/locale/locale-archive
java 7206 root mem REG 253,0 739087 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libnio.so
java 7206 root mem REG 253,0 739086 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libnet.so
java 7206 root mem REG 253,0 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7206 root mem REG 253,0 739094 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libzip.so
java 7206 root mem REG 253,0 33814081 /usr/lib64/libnss_files-2.17.so
java 7206 root mem REG 253,0 739076 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libjava.so
java 7206 root mem REG 253,0 739093 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libverify.so
java 7206 root mem REG 253,0 33594518 /usr/lib64/librt-2.17.so
java 7206 root mem REG 253,0 34209721 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
java 7206 root mem REG 253,0 33594503 /usr/lib64/libm-2.17.so
java 7206 root mem REG 253,0 33595574 /usr/lib64/libstdc++.so.6.0.19
java 7206 root mem REG 253,0 100721148 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/server/libjvm.so
java 7206 root mem REG 253,0 33594493 /usr/lib64/libc-2.17.so
java 7206 root mem REG 253,0 33594500 /usr/lib64/libdl-2.17.so
java 7206 root mem REG 253,0 34621259 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/lib/amd64/jli/libjli.so
java 7206 root mem REG 253,0 33595568 /usr/lib64/libz.so.1.2.7
java 7206 root mem REG 253,0 33814089 /usr/lib64/libpthread-2.17.so
java 7206 root mem REG 253,0 33589366 /usr/lib64/ld-2.17.so
java 7206 root mem REG 253,0 100715660 /tmp/hsperfdata_root/7206
java 7206 root mem REG 253,0 101657177 /root/oy/testfileio/out.txt
java 7206 root 0u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 1u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 2u CHR 136,0 0t0 3 /dev/pts/0
java 7206 root 3r REG 253,0 0t65034674 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7206 root 4u REG 253,0 0t12 101657177 /root/oy/testfileio/out.txt
java 7206 root 5u unix 0xffff9c9bb7986800 0t0 37771 socket
4. ByteBuffer vs. MappedByteBuffer
When a Java process starts on Linux, it has a text (code) segment, a data segment, a heap, a stack, and mem memory mappings. The heap here is the process heap that Linux allocates; the JVM heap occupies part of it.
Calling ByteBuffer.allocate() (which creates a HeapByteBuffer) allocates a byte array on the JVM heap; calling ByteBuffer.allocateDirect() (which creates a DirectByteBuffer) allocates a byte array outside the JVM heap but still inside the Java process heap. Either way, moving data into or out of the kernel page cache requires a system call.
The difference between the two: a DirectByteBuffer's backing memory is created through native OS code, so creating and destroying one is more expensive for Java, while a HeapByteBuffer is backed by an ordinary Java array and is cheaper to create. A byte array from ByteBuffer.allocate must first be copied out of the JVM heap before a write system call can push it to the kernel page cache; reads likewise land outside the JVM heap first and are then copied into it.
MappedByteBuffer is different. When FileChannel.map() creates a MappedByteBuffer, an mmap system call maps the buffer's address range onto the kernel page cache. After the mapping is created, the process's file descriptors show a "mem" entry pointing at the file (we saw this above). From then on, writing through MappedByteBuffer.put needs no system call, yet the data lands directly in the kernel page cache.
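A compact sketch of that mmap path (our own example; it uses a generated temp file rather than the article's /root/oy/testfileio/out.txt): map() grows the file to the mapped length, and put() makes the bytes visible to ordinary reads without any explicit write call.

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class MmapSketch {
    // map() wires the buffer's address range to the kernel page cache;
    // put() is then a plain memory store, not a write() syscall.
    public static String mapAndPut() throws Exception {
        Path path = Files.createTempFile("mmap", ".txt");
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            map.put("@@@".getBytes());        // memory store into the page cache
            byte[] head = new byte[3];
            try (RandomAccessFile reader = new RandomAccessFile(path.toFile(), "r")) {
                reader.read(head);            // visible to ordinary reads at once
            }
            return new String(head);
        } finally {
            Files.delete(path);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(mapAndPut());
    }
}
```

Without map.force() or a kernel writeback, the bytes live only in the page cache; the read still sees them because reads go through the same page cache.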
III. Socket IO
1. A few questions
- When we new a ServerSocket and bind it to a port, what actually happens?
- When we new a Socket, it can establish a connection with the ServerSocket. Is that where the three-way handshake happens? If the ServerSocket never calls accept, can the connection still be established? Will messages sent by the client reach the server?
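Before reproducing the two-VM experiment below, these questions can be probed in a single process: bind a ServerSocket, connect a client while nobody is accepting, and send data. A sketch of ours on the loopback interface:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class AcceptLaterSketch {
    public static String connectBeforeAccept() throws IOException {
        ServerSocket server = new ServerSocket();
        server.bind(new InetSocketAddress("127.0.0.1", 0), 2); // port 0: let the OS pick one

        // The client connects while nobody is calling accept(): the kernel
        // completes the handshake and parks the connection in the backlog.
        Socket client = new Socket("127.0.0.1", server.getLocalPort());
        client.getOutputStream().write("hi".getBytes());       // sent before accept()

        Socket accepted = server.accept();                     // merely dequeues the connection
        byte[] buf = new byte[2];
        int n = 0;
        while (n < 2) {                                        // the early bytes are already waiting
            int r = accepted.getInputStream().read(buf, n, 2 - n);
            if (r < 0) break;
            n += r;
        }

        client.close();
        accepted.close();
        server.close();
        return new String(buf, 0, n);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("received: " + connectBeforeAccept());
    }
}
```

The connect succeeds because the kernel finishes the three-way handshake on its own and queues the completed connection in the listen backlog; accept() only hands it to the process, and bytes sent before accept sit in the kernel receive buffer until then.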
2. Setup
To verify these questions, we need some preparation.
2.1 Server code
One virtual machine (mine has IP 192.168.220.158) acts as the server. In its /root/oy/testsocket directory, place a SocketIOPropertites.java file with the following contents:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.StandardSocketOptions;

/**
 * BIO, one thread per connection
 */
public class SocketIOPropertites {

    //server socket listen property:
    private static final int RECEIVE_BUFFER = 10;
    private static final int SO_TIMEOUT = 0;
    private static final boolean REUSE_ADDR = false;
    private static final int BACK_LOG = 2;

    //client socket listen property on server endpoint:
    private static final boolean CLI_KEEPALIVE = false;
    private static final boolean CLI_OOB = false;
    private static final int CLI_REC_BUF = 20;
    private static final boolean CLI_REUSE_ADDR = false;
    private static final int CLI_SEND_BUF = 20;
    private static final boolean CLI_LINGER = true;
    private static final int CLI_LINGER_N = 0;
    private static final int CLI_TIMEOUT = 0;
    private static final boolean CLI_NO_DELAY = false;

    /*
    StandardSocketOptions.TCP_NODELAY
    StandardSocketOptions.SO_KEEPALIVE
    StandardSocketOptions.SO_LINGER
    StandardSocketOptions.SO_RCVBUF
    StandardSocketOptions.SO_SNDBUF
    StandardSocketOptions.SO_REUSEADDR
    */

    public static void main(String[] args) {
        ServerSocket server = null;
        try {
            server = new ServerSocket();
            server.bind(new InetSocketAddress(9090), BACK_LOG);
            server.setReceiveBufferSize(RECEIVE_BUFFER);
            server.setReuseAddress(REUSE_ADDR);
            server.setSoTimeout(SO_TIMEOUT);
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("server up use 9090!");

        while (true) {
            try {
                System.in.read(); // block here on purpose, so we can observe the state before accept runs
                Socket client = server.accept();
                System.out.println("client port: " + client.getPort());
                client.setKeepAlive(CLI_KEEPALIVE);
                client.setOOBInline(CLI_OOB);
                client.setReceiveBufferSize(CLI_REC_BUF);
                client.setReuseAddress(CLI_REUSE_ADDR);
                client.setSendBufferSize(CLI_SEND_BUF);
                client.setSoLinger(CLI_LINGER, CLI_LINGER_N);
                client.setSoTimeout(CLI_TIMEOUT);
                client.setTcpNoDelay(CLI_NO_DELAY);

                new Thread(
                        () -> {
                            while (true) {
                                try {
                                    InputStream in = client.getInputStream();
                                    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
                                    char[] data = new char[1024];
                                    int num = reader.read(data);
                                    if (num > 0) {
                                        System.out.println("client read some data is :" + num + " val :" + new String(data, 0, num));
                                    } else if (num == 0) {
                                        System.out.println("client readed nothing!");
                                        continue;
                                    } else {
                                        System.out.println("client readed -1...");
                                        client.close();
                                        break;
                                    }
                                } catch (IOException e) {
                                    e.printStackTrace();
                                }
                            }
                        }
                ).start();
            } catch (IOException e) {
                e.printStackTrace();
            }
            // note: don't close the ServerSocket here; closing it inside the
            // accept loop would make every accept after the first one fail
        }
    }
}
2.2 Client code
Another virtual machine (mine has IP 192.168.220.157) acts as the client. In its /root/oy/testsocket directory, place a SocketClient.java file with the following contents:
import java.io.*;
import java.net.Socket;

public class SocketClient {

    public static void main(String[] args) {
        try {
            // my server VM's IP is 192.168.220.158, hence this address
            Socket client = new Socket("192.168.220.158", 9090);
            client.setSendBufferSize(20);
            client.setTcpNoDelay(true);
            OutputStream out = client.getOutputStream();
            InputStream in = System.in;
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            while (true) {
                String line = reader.readLine();
                if (line != null) {
                    byte[] bb = line.getBytes();
                    for (byte b : bb) {
                        out.write(b);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
3. Three windows on the server
3.1 Window 1: packet capture
Before running the server code, start tcpdump:
[root@localhost network-scripts]# tcpdump -nn -i ens33 port 9090
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
ens33 is the network interface configured for the VM. You can find your own interface name under /etc/sysconfig/network-scripts/; as shown below, ifcfg-ens33 has DEVICE set to ens33:
root@localhost testsocket]# cd /etc/sysconfig/network-scripts/
[root@localhost network-scripts]# ls
ifcfg-ens33 ifdown-bnep ifdown-ipv6 ifdown-ppp ifdown-Team ifup ifup-eth ifup-isdn ifup-post ifup-sit ifup-tunnel network-functions
ifcfg-lo ifdown-eth ifdown-isdn ifdown-routes ifdown-TeamPort ifup-aliases ifup-ippp ifup-plip ifup-ppp ifup-Team ifup-wireless network-functions-ipv6
ifdown ifdown-ippp ifdown-post ifdown-sit ifdown-tunnel ifup-bnep ifup-ipv6 ifup-plusb ifup-routes ifup-TeamPort init.ipv6-global
[root@localhost network-scripts]# vim ifcfg-ens33
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="static"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="7b70e54a-e4ce-4c43-ba0a-b55f5439e617"
DEVICE="ens33"
ONBOOT="yes"
IPADDR="192.168.220.158"
NERMASK="255.255.255.0"
GATEWAY="192.168.220.2"
DNS1="202.96.134.133"
DNS2="202.96.128.68"
DNS3="114.114.114.114"
3.2 Window 2: network connections and file descriptors
Before running the server code, check the current network connections:
[root@localhost ~]# netstat -natp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 6537/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 6767/master
tcp 0 0 192.168.220.158:22 192.168.220.34:63430 ESTABLISHED 7465/sshd: root@pts
tcp 0 0 192.168.220.158:22 192.168.220.34:63611 ESTABLISHED 7497/sshd: root@pts
tcp 0 52 192.168.220.158:22 192.168.220.34:63615 ESTABLISHED 7516/sshd: root@pts
tcp6 0 0 :::22 :::* LISTEN 6537/sshd
tcp6 0 0 ::1:25 :::* LISTEN 6767/master
As the server code runs, we will keep coming back here to check the network connections and file descriptors.
3.3 Window 3: run the server code
[root@localhost testsocket]# javac SocketIOPropertites.java && java SocketIOPropertites
server up use 9090!
The server is now stopped at System.in.read(), which lets us observe the state before accept executes.
3.4 Back to window 2: after the server starts
Check the network connections:
[root@localhost ~]# netstat -natp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 6537/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 6767/master
tcp 0 0 192.168.220.158:22 192.168.220.34:63430 ESTABLISHED 7465/sshd: root@pts
tcp 0 0 192.168.220.158:22 192.168.220.34:63611 ESTABLISHED 7497/sshd: root@pts
tcp 0 52 192.168.220.158:22 192.168.220.34:63615 ESTABLISHED 7516/sshd: root@pts
tcp6 0 0 :::22 :::* LISTEN 6537/sshd
tcp6 0 0 ::1:25 :::* LISTEN 6767/master
tcp6 0 0 :::9090 :::* LISTEN 7954/java
There is now one more TCP connection in the LISTEN state, listening on local port 9090 with any remote address, so it is the ServerSocket. Its PID is 7954; let's check whether 7954 is our Java process:
[root@localhost ~]# jps
7954 SocketIOPropertites
7970 Jps
7954 is indeed the server process we started. Now let's look at its file descriptors:
[root@localhost ~]# lsof -op 7954
COMMAND PID USER FD TYPE DEVICE OFFSET NODE NAME
java 7954 root cwd DIR 253,0 100715660 /root/oy/testsocket
java 7954 root rtd DIR 253,0 64 /
java 7954 root txt REG 253,0 67673920 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/bin/java
java 7954 root mem REG 253,0 33589367 /usr/lib/locale/locale-archive
java 7954 root mem REG 253,0 739086 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libnet.so
java 7954 root mem REG 253,0 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7954 root mem REG 253,0 739094 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libzip.so
java 7954 root mem REG 253,0 33814081 /usr/lib64/libnss_files-2.17.so
java 7954 root mem REG 253,0 739076 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libjava.so
java 7954 root mem REG 253,0 739093 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libverify.so
java 7954 root mem REG 253,0 33594518 /usr/lib64/librt-2.17.so
java 7954 root mem REG 253,0 34209721 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
java 7954 root mem REG 253,0 33594503 /usr/lib64/libm-2.17.so
java 7954 root mem REG 253,0 33595574 /usr/lib64/libstdc++.so.6.0.19
java 7954 root mem REG 253,0 100721148 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/server/libjvm.so
java 7954 root mem REG 253,0 33594493 /usr/lib64/libc-2.17.so
java 7954 root mem REG 253,0 33594500 /usr/lib64/libdl-2.17.so
java 7954 root mem REG 253,0 67673931 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/jli/libjli.so
java 7954 root mem REG 253,0 33595568 /usr/lib64/libz.so.1.2.7
java 7954 root mem REG 253,0 33814089 /usr/lib64/libpthread-2.17.so
java 7954 root mem REG 253,0 33589366 /usr/lib64/ld-2.17.so
java 7954 root mem REG 253,0 101657157 /tmp/hsperfdata_root/7954
java 7954 root 0u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 1u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 2u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 3r REG 253,0 0t69120089 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7954 root 4u unix 0xffff9c9b9faa8c00 0t0 102521 socket
java 7954 root 5u IPv6 102523 0t0 TCP *:websm (LISTEN)
File descriptor 5u points at a TCP connection in the LISTEN state, i.e. our ServerSocket.
3.5 Back to window 1: the capture so far
We have only started the server, not the client, so there should be nothing to capture. And indeed, tcpdump is still sitting there, having captured nothing:
[root@localhost network-scripts]# clear
[root@localhost network-scripts]# tcpdump -nn -i ens33 port 9090
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
4. Run the client
[root@localhost testsocket]# javac SocketClient.java && java SocketClient
It starts successfully and also waits for our input, because its InputStream was deliberately wired to System.in.
5. The three-way handshake
After the client starts, the server's capture window shows the three-way handshake. While only listening, tcpdump was stopped at line 3; lines 4 onward appeared automatically once the client started. The handshake completed successfully:
[root@localhost network-scripts]# tcpdump -nn -i ens33 port 9090
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
11:53:11.325001 IP 192.168.220.157.43668 > 192.168.220.158.9090: Flags [S], seq 3502353609, win 29200, options [mss 1460,sackOK,TS val 11453809 ecr 0,nop,wscale 7], length 0
11:53:11.325023 IP 192.168.220.158.9090 > 192.168.220.157.43668: Flags [S.], seq 314809214, ack 3502353610, win 1152, options [mss 1460,sackOK,TS val 50937643 ecr 11453809,nop,wscale 0], length 0
11:53:11.325257 IP 192.168.220.157.43668 > 192.168.220.158.9090: Flags [.], ack 1, win 229, options [nop,nop,TS val 11453810 ecr 50937643], length 0
6. Network connections after the handshake
[root@localhost ~]# netstat -natp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 6537/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 6767/master
tcp 0 0 192.168.220.158:22 192.168.220.34:63430 ESTABLISHED 7465/sshd: root@pts
tcp 0 0 192.168.220.158:22 192.168.220.34:63611 ESTABLISHED 7497/sshd: root@pts
tcp 0 52 192.168.220.158:22 192.168.220.34:63615 ESTABLISHED 7516/sshd: root@pts
tcp6 0 0 :::22 :::* LISTEN 6537/sshd
tcp6 0 0 ::1:25 :::* LISTEN 6767/master
tcp6 1 0 :::9090 :::* LISTEN 7954/java
tcp6 0 0 192.168.220.158:9090 192.168.220.157:43668 ESTABLISHED -
可以看到新增了一个状态为ESTABLISHED的网络连接,只是PID那一列还空着。因为三次握手成功就可以建立连接,所以新增了这条网络连接;但是ServerSocket还没有调用accept,这条网络连接还没有被分配给进程,所以PID这一列是空的。
我们来看看此时文件描述符的情况
[root@localhost ~]# lsof -op 7954
COMMAND PID USER FD TYPE DEVICE OFFSET NODE NAME
java 7954 root cwd DIR 253,0 100715660 /root/oy/testsocket
java 7954 root rtd DIR 253,0 64 /
java 7954 root txt REG 253,0 67673920 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/bin/java
java 7954 root mem REG 253,0 33589367 /usr/lib/locale/locale-archive
java 7954 root mem REG 253,0 739086 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libnet.so
java 7954 root mem REG 253,0 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7954 root mem REG 253,0 739094 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libzip.so
java 7954 root mem REG 253,0 33814081 /usr/lib64/libnss_files-2.17.so
java 7954 root mem REG 253,0 739076 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libjava.so
java 7954 root mem REG 253,0 739093 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libverify.so
java 7954 root mem REG 253,0 33594518 /usr/lib64/librt-2.17.so
java 7954 root mem REG 253,0 34209721 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
java 7954 root mem REG 253,0 33594503 /usr/lib64/libm-2.17.so
java 7954 root mem REG 253,0 33595574 /usr/lib64/libstdc++.so.6.0.19
java 7954 root mem REG 253,0 100721148 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/server/libjvm.so
java 7954 root mem REG 253,0 33594493 /usr/lib64/libc-2.17.so
java 7954 root mem REG 253,0 33594500 /usr/lib64/libdl-2.17.so
java 7954 root mem REG 253,0 67673931 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/jli/libjli.so
java 7954 root mem REG 253,0 33595568 /usr/lib64/libz.so.1.2.7
java 7954 root mem REG 253,0 33814089 /usr/lib64/libpthread-2.17.so
java 7954 root mem REG 253,0 33589366 /usr/lib64/ld-2.17.so
java 7954 root mem REG 253,0 101657157 /tmp/hsperfdata_root/7954
java 7954 root 0u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 1u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 2u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 3r REG 253,0 0t69120089 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7954 root 4u unix 0xffff9c9b9faa8c00 0t0 102521 socket
java 7954 root 5u IPv6 102523 0t0 TCP *:websm (LISTEN)
可以发现7954这个java进程,确实没有文件描述符指向新建立的连接。那么此时客户端向服务端发送数据的话,服务端能收到吗?
回到客户端窗口,输入1111。再回到服务端窗口,查看网络连接情况
[root@localhost ~]# netstat -natp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 6537/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 6767/master
tcp 0 0 192.168.220.158:22 192.168.220.34:63430 ESTABLISHED 7465/sshd: root@pts
tcp 0 0 192.168.220.158:22 192.168.220.34:63611 ESTABLISHED 7497/sshd: root@pts
tcp 0 52 192.168.220.158:22 192.168.220.34:63615 ESTABLISHED 7516/sshd: root@pts
tcp6 0 0 :::22 :::* LISTEN 6537/sshd
tcp6 0 0 ::1:25 :::* LISTEN 6767/master
tcp6 1 0 :::9090 :::* LISTEN 7954/java
tcp6 4 0 192.168.220.158:9090 192.168.220.157:43668 ESTABLISHED -
可以看到接收缓冲区Recv-Q 已经缓存了4个字节的数据,所以我们可以说建立连接之后就可以传输数据了。我们现在回到服务端执行的窗口,按下回车,让服务端执行accept及之后的代码
[root@localhost testsocket]# javac SocketIOPropertites.java && java SocketIOPropertites
server up use 9090!
client port: 43668
client read some data is :4 val :1111
可以看到缓存区中的数据1111,被读取到了。我们再来看看此时的网络情况
[root@localhost ~]# netstat -natp
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 6537/sshd
tcp 0 0 127.0.0.1:25 0.0.0.0:* LISTEN 6767/master
tcp 0 0 192.168.220.158:22 192.168.220.34:63430 ESTABLISHED 7465/sshd: root@pts
tcp 0 0 192.168.220.158:22 192.168.220.34:63611 ESTABLISHED 7497/sshd: root@pts
tcp 0 52 192.168.220.158:22 192.168.220.34:63615 ESTABLISHED 7516/sshd: root@pts
tcp6 0 0 :::22 :::* LISTEN 6537/sshd
tcp6 0 0 ::1:25 :::* LISTEN 6767/master
tcp6 0 0 192.168.220.158:9090 192.168.220.157:43668 ESTABLISHED 7954/java
可以看到状态为ESTABLISHED的网络连接,PID这一列也被填充了。(因为执行了ServerSocket.accept方法嘛)我们再来看看文件描述符的情况
[root@localhost ~]# lsof -op 7954
COMMAND PID USER FD TYPE DEVICE OFFSET NODE NAME
java 7954 root cwd DIR 253,0 100715660 /root/oy/testsocket
java 7954 root rtd DIR 253,0 64 /
java 7954 root txt REG 253,0 67673920 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/bin/java
java 7954 root mem REG 253,0 33589367 /usr/lib/locale/locale-archive
java 7954 root mem REG 253,0 739086 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libnet.so
java 7954 root mem REG 253,0 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7954 root mem REG 253,0 739094 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libzip.so
java 7954 root mem REG 253,0 33814081 /usr/lib64/libnss_files-2.17.so
java 7954 root mem REG 253,0 739076 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libjava.so
java 7954 root mem REG 253,0 739093 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/libverify.so
java 7954 root mem REG 253,0 33594518 /usr/lib64/librt-2.17.so
java 7954 root mem REG 253,0 34209721 /usr/lib64/libgcc_s-4.8.5-20150702.so.1
java 7954 root mem REG 253,0 33594503 /usr/lib64/libm-2.17.so
java 7954 root mem REG 253,0 33595574 /usr/lib64/libstdc++.so.6.0.19
java 7954 root mem REG 253,0 100721148 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/server/libjvm.so
java 7954 root mem REG 253,0 33594493 /usr/lib64/libc-2.17.so
java 7954 root mem REG 253,0 33594500 /usr/lib64/libdl-2.17.so
java 7954 root mem REG 253,0 67673931 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/amd64/jli/libjli.so
java 7954 root mem REG 253,0 33595568 /usr/lib64/libz.so.1.2.7
java 7954 root mem REG 253,0 33814089 /usr/lib64/libpthread-2.17.so
java 7954 root mem REG 253,0 33589366 /usr/lib64/ld-2.17.so
java 7954 root mem REG 253,0 101657157 /tmp/hsperfdata_root/7954
java 7954 root 0u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 1u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 2u CHR 136,0 0t0 3 /dev/pts/0
java 7954 root 3r REG 253,0 0t31610475 100721168 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.212.b04-0.el7_6.x86_64/jre/lib/rt.jar
java 7954 root 4u unix 0xffff9c9b9faa8c00 0t0 102521 socket
java 7954 root 6u IPv6 103908 0t0 TCP localhost.localdomain:websm->192.168.220.157:43668 (ESTABLISHED)
可以看到新增文件描述符6u,指向一个状态为ESTABLISHED的TCP连接
7、Socket总结
在TCP的三次握手之后,服务端、客户端之间建立了连接,然后进行了资源的分配。比如说发送缓冲区Send-Q、接收缓冲区Recv-Q 等。建立连接之后,服务端、客户端都会出现一个socket,它是一个内核级的对象。在没有调用ServerSocket的accept方法之前,它就存在了。
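上面"accept之前,socket和接收缓冲区就已经存在"的结论,可以用一段可运行的小示例验证(示意代码,类名AcceptLaterDemo为假设):客户端在服务端调用accept之前就发送数据,数据会先停留在内核的接收缓冲区(Recv-Q),accept之后仍然能读到。

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// 示意代码:验证"三次握手完成后,即使服务端还没accept,数据也会先进入内核接收缓冲区"
public class AcceptLaterDemo {
    public static String demo() throws Exception {
        ServerSocket server = new ServerSocket(0); // 0表示由系统分配一个空闲端口
        int port = server.getLocalPort();

        // 客户端先连接并发送数据,此时服务端还没有调用accept
        Socket client = new Socket("127.0.0.1", port);
        OutputStream out = client.getOutputStream();
        out.write("1111".getBytes());
        out.flush();
        Thread.sleep(200); // 等数据进入服务端socket的接收缓冲区(Recv-Q)

        // 服务端此时才accept,依然能读到之前发送的数据
        Socket accepted = server.accept();
        InputStream in = accepted.getInputStream();
        byte[] buf = new byte[4];
        int n = 0;
        while (n < 4) { // 循环读满4个字节
            int r = in.read(buf, n, 4 - n);
            if (r == -1) break;
            n += r;
        }
        String received = new String(buf, 0, n);

        accepted.close();
        client.close();
        server.close();
        return received;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // 1111
    }
}
```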
socket是什么呢?socket是一个四元组(服务端IP、服务端端口号、客户端IP、客户端端口号),它唯一地标识了一个连接。假设服务端的linux系统限制了最大的socket连接数为65535,它上面运行着两个进程,端口号分别为8080、9090,那这个上限是指多个进程加起来最多65535个socket,还是每个进程各自最多65535个socket?答案是每个进程各自最多65535个socket
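下面用一段示意代码(Quad类为假设的类名)说明四元组如何唯一标识连接:同一客户端IP用不同的客户端端口连同一个服务端端口,是两条不同的连接;四元组完全相同,则还是同一条连接。

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// 示意代码:用四元组(客户端IP、客户端端口、服务端IP、服务端端口)标识一条连接
public class QuadDemo {
    static final class Quad {
        final String srcIp; final int srcPort; final String dstIp; final int dstPort;
        Quad(String si, int sp, String di, int dp) { srcIp = si; srcPort = sp; dstIp = di; dstPort = dp; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Quad)) return false;
            Quad q = (Quad) o;
            return srcPort == q.srcPort && dstPort == q.dstPort
                && srcIp.equals(q.srcIp) && dstIp.equals(q.dstIp);
        }
        @Override public int hashCode() { return Objects.hash(srcIp, srcPort, dstIp, dstPort); }
    }

    public static int count() {
        Set<Quad> conns = new HashSet<>();
        // 同一客户端IP,不同的客户端端口,连同一个服务端端口:两条不同的连接
        conns.add(new Quad("192.168.220.157", 43668, "192.168.220.158", 9090));
        conns.add(new Quad("192.168.220.157", 43669, "192.168.220.158", 9090));
        // 四元组完全相同:还是同一条连接,不会新增
        conns.add(new Quad("192.168.220.157", 43668, "192.168.220.158", 9090));
        return conns.size();
    }

    public static void main(String[] args) {
        System.out.println(count()); // 2
    }
}
```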
四、IO模型的性能
1、BIO模型的性能瓶颈
public static void main(String[] args) throws Exception {
ServerSocket server = new ServerSocket(9090,20);
System.out.println("step1: new ServerSocket(9090) ");
while (true) {
Socket client = server.accept(); //阻塞1
System.out.println("step2:client\t" + client.getPort());
new Thread(() -> {
InputStream in = null;
try {
in = client.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(in));
while(true){
String dataline = reader.readLine(); //阻塞2
if(null != dataline){
System.out.println(dataline);
}else{
client.close();
break;
}
}
System.out.println("客户端断开");
} catch (IOException e) {
e.printStackTrace();
}
}).start();
}
}
在BIO模型中,ServerSocket.accept会产生一次系统调用accept,并阻塞在那,直到有客户端连接到达。客户端连接到达之后,会产生一个中断,内核就知道连接到达了,然后内核会创建一个socket,accept系统调用返回一个文件描述符指向这个socket,java进程可以通过这个文件描述符操作socket。socket.read方法也会产生一次系统调用recv,且阻塞在那,这样while循环里后续的accept就没办法执行,其他客户端也就没办法与服务端建立连接了。为了解决这个问题,代码为每个连接新建一个线程(也会产生系统调用clone),在线程里调用socket.read方法,不影响主线程的accept。
BIO的瓶颈在于:
-
accept、recv、clone系统调用,比较消耗时间,且会阻塞
-
线程消耗资源
2、NIO模型的性能瓶颈
public static void main(String[] args) throws Exception {
LinkedList<SocketChannel> clients = new LinkedList<>();
ServerSocketChannel ss = ServerSocketChannel.open();
ss.bind(new InetSocketAddress(9090));
ss.configureBlocking(false); //重点 OS NONBLOCKING!!!
ss.setOption(StandardSocketOptions.TCP_NODELAY, false);
while (true) {
Thread.sleep(1000);
SocketChannel client = ss.accept(); //不会阻塞,没有连接到达时内核返回-1,java封装为null
if (client == null) {
System.out.println("null.....");
} else {
client.configureBlocking(false);
int port = client.socket().getPort();
System.out.println("client...port: " + port);
clients.add(client);
}
ByteBuffer buffer = ByteBuffer.allocateDirect(4096); //可以在堆里 堆外
for (SocketChannel c : clients) { //串行化!!!! 多线程!!
int num = c.read(buffer); // >0 -1 0 //不会阻塞
if (num > 0) {
buffer.flip();
byte[] aaa = new byte[buffer.limit()];
buffer.get(aaa);
String b = new String(aaa);
System.out.println(c.socket().getPort() + " : " + b);
buffer.clear();
}
}
}
}
在NIO模型中,虽然ServerSocketChannel.accept也会产生一次系统调用accept,但是由于配置了ServerSocketChannel.configureBlocking(false),所以当客户端没有到达时,内核的accept方法不会阻塞,而是立刻返回-1(java对内核的返回值进行了包装,返回null)。有个while循环,所以它会产生一次次的系统调用accept去询问是否有客户端到达。
客户端连接到达之后,会产生一个中断,内核就知道连接到达了,然后内核会创建一个socket。下一次的系统调用accept到来,内核就立刻分配一个文件描述符指向这个socket,java进程可以通过这个文件描述符操作socket。
虽然SocketChannel.read也会产生一次系统调用recv,但是由于配置了SocketChannel.configureBlocking(false),所以当服务端没有接受到客户端的数据时,recv也不会阻塞,而是立刻返回-1。有个while循环,所以它会产生一次次的系统调用recv去询问是否有数据到达。
NIO模型的瓶颈在于:
-
accept、recv等系统调用比较耗时,而且轮询会产生大量不必要的系统调用(没有连接或数据到达时也在反复调用)
相对于BIO模型来说,NIO优势是:不消耗多余的线程资源,也不产生clone系统调用
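非阻塞accept的行为可以用一段最小示例验证(示意代码,类名为假设):没有客户端到达时,accept立刻返回null,而不是阻塞在那。

```java
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// 示意代码:非阻塞模式下,没有客户端到达时accept立刻返回null
public class NonBlockingAcceptDemo {
    public static boolean demo() throws Exception {
        ServerSocketChannel ss = ServerSocketChannel.open();
        ss.bind(new InetSocketAddress(0)); // 0表示由系统分配端口
        ss.configureBlocking(false);       // 对应fcntl设置O_NONBLOCK
        SocketChannel client = ss.accept();// 内核accept返回-1,java封装为null
        ss.close();
        return client == null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // true
    }
}
```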
3、多路复用器的性能瓶颈
一个IO当成一条路,NIO是全量遍历每一条路询问IO的状态,会产生N次系统调用。多路复用器,则是一次系统调用,询问所有的IO的状态,所以才被称作多路复用。
3.1 select、poll的性能瓶颈
多路复用器,有select、poll、epoll模型。我们先来看看select
[root@localhost ~]# man 2 select
如果显示没有man命令或select查不到,那么需要
[root@localhost ~]# yum install man man-pages
下面是对select函数说明的一些截取,它需要传入一些文件描述符(fd)的集合,告知内核去遍历那些fd对应的IO的状态。不过传入的fd的数量有限制。
int select(int nfds, fd_set *readfds, fd_set *writefds,
fd_set *exceptfds, struct timeval *timeout);
select() allows a program to monitor multiple file descriptors, waiting until one or more of the file descriptors become "ready" for some class of I/O operation (e.g., input possible). A file descriptor is considered ready if it is possible to perform the corresponding I/O operation (e.g., read(2)) without blocking.
An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor
Three independent sets of file descriptors are watched. Those listed in readfds will be watched to see if characters become available for reading (more precisely, to see if a read will not block; in particular, a file descriptor is also ready on end-of-file), those in writefds will be watched to see if a write will not block, and those in exceptfds will be watched for exceptions. On exit, the sets are modified in place to indicate which file descriptors actually changed status. Each of the three file descriptor sets may be specified as NULL if no file descriptors are to be watched for the corresponding class of events.
Four macros are provided to manipulate the sets. FD_ZERO() clears a set. FD_SET() and FD_CLR() respectively add and remove a given file descriptor from a set.
FD_ISSET() tests to see if a file descriptor is part of the set; this is useful after select() returns.
nfds is the highest-numbered file descriptor in any of the three sets, plus 1.
The timeout argument specifies the minimum interval that select() should block waiting for a file descriptor to become ready. (This interval will be rounded up to the system clock granularity, and kernel scheduling delays mean that the blocking interval may overrun by a small amount.) If both fields of the timeval structure are zero, then select() returns immediately. (This is useful for polling.) If timeout is NULL (no timeout), select() can block indefinitely.
我们再来看看poll模型
[root@localhost ~]# man 2 poll
下面是对poll函数说明的一些截取,相较于select来说,它没有文件描述符的数量限制
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
poll() performs a similar task to select(2): it waits for one of a set of file descriptors to become ready to perform I/O.
The set of file descriptors to be monitored is specified in the fds argument, which is an array of structures of the following form:
struct pollfd {
int fd; /* file descriptor */
short events; /* requested events */
short revents; /* returned events */
};
The caller should specify the number of items in the fds array in nfds.
The field fd contains a file descriptor for an open file. If this field is negative, then the corresponding events field is ignored and the revents field returns zero. (This provides an easy way of ignoring a file descriptor for a single poll() call: simply negate the fd field.)
The field events is an input parameter, a bit mask specifying the events the application is interested in for the file descriptor fd. If this field is specified as zero, then all events are ignored for fd and revents returns zero.
The field revents is an output parameter, filled by the kernel with the events that actually occurred. The bits returned in revents can include any of those specified in events, or one of the values POLLERR, POLLHUP, or POLLNVAL. (These three bits are meaningless in the events field, and will be set in the revents field whenever the corresponding condition is true.)
If none of the events requested (and no error) has occurred for any of the file descriptors, then poll() blocks until one of the events occurs.
The timeout argument specifies the minimum number of milliseconds that poll() will block. (This interval will be rounded up to the system clock granularity, and kernel scheduling delays mean that the blocking interval may overrun by a small amount.) Specifying a negative value in timeout means an infinite timeout. Specifying a timeout of zero causes poll() to return immediately, even if no file descriptors are ready.
不管是NIO,还是多路复用器的select、poll,都是需要遍历所有的IO,去询问状态。只不过NIO的遍历,是发起N次系统调用,每次询问一条IO的状态。而多路复用的select、poll是发起一次系统调用,并传入需要遍历的集合,让内核去遍历它们询问状态。
多路复用器select、poll的瓶颈:
-
每次都要重新、重复传递fds集合
-
相较于NIO来说,多路复用器select、poll减少了系统调用次数。但是在一次系统调用中,内核需要全量遍历传入的fds,而这些fds对应的IO可能并不处于ready状态,也就是说还是会有不必要的消耗。
3.2 epoll
我们先了解一下,IO是如何变成可读状态的:当有数据到达网卡时,会产生一个中断。这个中断会调用回调函数,将网卡的数据通过内核网络协议栈(2、3、4层协议),最终关联到文件描述符fd(指向某个IO)的buffer,并更改fd的状态。所以,某一时间,如果应用程序通过系统调用,询问内核是否存在某个或某些fd是可读可写时,会有状态返回。
那什么是epoll呢:
-
内核开辟一个epoll空间,构建一个红黑树,用于存放应用程序运行过程中用到的fds
-
应用程序将一些fd放入空间,注册感兴趣的事件(accept、read、write)
-
当有数据到达网卡时,会产生一个中断。这个中断会调用回调函数,将网卡的数据通过内核网络协议栈(2、3、4层协议),最终关联到文件描述符fd(指向某个IO)的buffer,并更改fd的状态。然后将有状态的fd复制到内核的一个链表中
-
应用程序发起系统调用,向内核询问IO状态时,内核直接返回整个链表,不再需要内核遍历这些fd,去询问状态。也就是说将内核的遍历,分散到事件的到达
下图是epoll与select的对比
然后我们来探索一下epoll中的函数
epoll_create开辟epoll空间,构建一个红黑树,返回一个文件描述符指向这个空间(在后文的strace示例中是7),用于存放应用程序运行过程中用到的fds
[root@localhost ~]# man 2 epoll_create
下面是对epoll_create函数说明的一些截取
int epoll_create(int size);
epoll_create() returns a file descriptor referring to the new epoll instance. This file descriptor is used for all the subsequent calls to the epoll interface. When no longer required, the file descriptor returned by epoll_create() should be closed by using close(2). When all file descriptors referring to an epoll instance have been closed, the kernel destroys the instance and releases the associated resources for reuse.
In the initial epoll_create() implementation, the size argument informed the kernel of the number of file descriptors that the caller expected to add to the epoll instance. The kernel used this information as a hint for the amount of space to initially allocate in internal data structures describing events. (If necessary, the kernel would allocate more space if the caller's usage exceeded the hint given in size.) Nowadays, this hint is no longer required (the kernel dynamically sizes the required data structures without needing the hint), but size must still be greater than zero, in order to ensure backward compatibility when new epoll applications are run on older kernels.
epoll_ctl 的作用是:将fd加入到epoll空间、更改epoll空间中fd感兴趣的事件、将fd从epoll空间移除
[root@localhost ~]# man 2 epoll_ctl
下面是对epoll_ctl函数说明的一些截取
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
This system call performs control operations on the epoll(7) instance referred to by the file descriptor epfd. It requests that the operation op be performed for the target file descriptor, fd.
Valid values for the op argument are:
EPOLL_CTL_ADD
Register the target file descriptor fd on the epoll instance referred to by the file descriptor epfd and associate the event event with the internal file linked to fd.
EPOLL_CTL_MOD
Change the event event associated with the target file descriptor fd.
EPOLL_CTL_DEL
Remove (deregister) the target file descriptor fd from the epoll instance referred to by epfd. The event is ignored and can be NULL (but see BUGS below).
epoll_wait 的作用是:向内核询问IO状态时,内核直接返回整个链表
[root@localhost ~]# man 2 epoll_wait
下面是对epoll_wait函数说明的一些截取
int epoll_wait(int epfd, struct epoll_event *events,
int maxevents, int timeout);
The epoll_wait() system call waits for events on the epoll(7) instance referred to by the file descriptor epfd. The memory area pointed to by events will contain the events that will be available for the caller. Up to maxevents are returned by epoll_wait(). The maxevents argument must be greater than zero.
The timeout argument specifies the minimum number of milliseconds that epoll_wait() will block. (This interval will be rounded up to the system clock granularity, and kernel scheduling delays mean that the blocking interval may overrun by a small amount.) Specifying a timeout of -1 causes epoll_wait() to block indefinitely, while specifying a timeout equal to zero causes epoll_wait() to return immediately, even if no events are available.
When successful, epoll_wait() returns the number of file descriptors ready for the requested I/O, or zero if no file descriptor became ready during the requested timeout milliseconds. When an error occurs, epoll_wait() returns -1 and errno is set appropriately.
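在Linux上,Java的Selector底层默认就是epoll:Selector.open大致对应epoll_create,register大致对应epoll_ctl,select大致对应epoll_wait。下面用Pipe做一个可运行的小示例(示意代码,类名为假设),演示fd就绪前后select返回值的变化。

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// 示意代码:用Pipe演示Selector(底层epoll)在数据到达前后的就绪数变化
public class SelectorDemo {
    public static int demo() throws Exception {
        Selector selector = Selector.open();                    // 大致对应epoll_create
        Pipe pipe = Pipe.open();
        pipe.source().configureBlocking(false);
        pipe.source().register(selector, SelectionKey.OP_READ); // 大致对应epoll_ctl(ADD)

        int before = selector.selectNow();                      // 还没有数据,就绪数为0

        pipe.sink().write(ByteBuffer.wrap("x".getBytes()));     // 数据到达,fd变为可读
        int after = selector.select(1000);                      // 大致对应epoll_wait,返回就绪fd个数

        selector.close();
        pipe.sink().close();
        pipe.source().close();
        return before * 10 + after; // 期望 0*10 + 1 = 1
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // 1
    }
}
```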
五、从多路复用器到reactor
1、单线程的多路复用器
先从一个单线程的多路复用器开始讲起
public class SocketMultiplexingSingleThreadv1 {
private ServerSocketChannel server = null;
private Selector selector = null;
int port = 9090;
public static void main(String[] args) {
SocketMultiplexingSingleThreadv1 service = new SocketMultiplexingSingleThreadv1();
service.start();
}
public void start() {
try {
//创建一个ServerSocket,即调用内核函数socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 4 其中4为返回的文件描述符
server = ServerSocketChannel.open();
//ServerSocket配置为非阻塞,即调用内核函数fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 其中0表示调用成功
server.configureBlocking(false);
//将这个文件描述符绑定到9090端口,即调用内核函数bind(4, {sa_family=AF_INET, sin_port=htons(9090)}),然后监听9090端口,即调用内核函数listen(4,50)
server.bind(new InetSocketAddress(port));
//创建selector。如果是poll模型的话,就是在jvm开辟一个内存空间,用于存放fds(从上面我们分析的可知,poll模型本身是不会去开辟内存空间存放fds的,所以是java代码封装了一下,帮它开辟了);如果是epoll模型的话,就是调用内核函数epoll_create(256) = 7,在内核中开辟一个内存空间,用于存放fds
selector = Selector.open();
//将ServerSocket的文件描述符放入空间,注册感兴趣的事件ACCEPT。如果是在poll模型,就是将文件描述符放入到jvm的内存空间。如果是在epoll模型,就是调用内核函数epoll_ctl(7, EPOLL_CTL_ADD, 4 其中文件描述符7指向epoll_create开辟的内存空间,文件描述符4指向ServerSocket。
server.register(selector, SelectionKey.OP_ACCEPT);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("服务器启动了。。。。。");
try {
while (true) {
Set<SelectionKey> keys = selector.keys();
System.out.println(keys.size()+" size");
//询问是否存在已就绪的fd,最大阻塞时间为500毫秒。如果是在poll模型,就是调用poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}], 2, -1) = 1 其中文件描述符5是个管道,不是我们关注的重点,忽略它。poll也就是轮询每个文件描述符,向内核询问它们是否就绪。如果是epoll模型,就是调用epoll_wait(7, {{EPOLLIN, {u32=4, u64=2216749036554158084}}}, 4096, -1) = 1
while (selector.select(500) > 0) {
Set<SelectionKey> selectionKeys = selector.selectedKeys();
Iterator<SelectionKey> iter = selectionKeys.iterator();
while (iter.hasNext()) {
SelectionKey key = iter.next();
iter.remove();
if (key.isAcceptable()) {
//如果accept已经就绪,那么进行accept处理
acceptHandler(key);
} else if (key.isReadable()) {//如果可读,那么进行read处理
readHandler(key);
}
}
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
public void acceptHandler(SelectionKey key) {
try {
ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
//调用ServerSocket的accept分配一个文件描述符指向Socket。即调用内核函数accept(4, = 8 其中4为ServerSocket的文件描述符,8为Socket的文件描述符
SocketChannel client = ssc.accept();
//Socket配置为非阻塞,即调用内核函数fcntl(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
client.configureBlocking(false);
ByteBuffer buffer = ByteBuffer.allocate(8192);
//将Socket的文件描述符放入空间,注册感兴趣的事件READ。如果是在poll模型,就是将文件描述符放入到jvm的内存空间。如果是在epoll模型,就是调用内核函数epoll_ctl(7, EPOLL_CTL_ADD, 8 其中文件描述符7指向epoll_create开辟的内存空间,文件描述符8指向Socket
client.register(selector, SelectionKey.OP_READ, buffer);
System.out.println("-------------------------------------------");
System.out.println("新客户端:" + client.getRemoteAddress());
System.out.println("-------------------------------------------");
} catch (IOException e) {
e.printStackTrace();
}
}
public void readHandler(SelectionKey key) {
SocketChannel client = (SocketChannel) key.channel();
ByteBuffer buffer = (ByteBuffer) key.attachment();
buffer.clear();
int read = 0;
try {
while (true) {
//读取内核缓冲区recv-Q的数据到buffer,即调用内核函数read
read = client.read(buffer);
if (read > 0) {
buffer.flip();
while (buffer.hasRemaining()) {
//将buffer中的数据,写入到内核缓冲区send-Q,即调用内核函数write。这段逻辑的意思是你给我发什么,我给你原样返回
client.write(buffer);
}
buffer.clear();
} else if (read == 0) {
break;
} else {
client.close();
break;
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
单线程多路复用器的缺陷在于,某个fd对于read的处理(即readHandler方法)一旦比较耗时,就会影响到后面的fd的处理,甚至下一轮的selector.select。
那怎么解决这个问题呢?自然而然的,我们就想到把readHandler丢到另一个线程里去。这就是多线程多路复用器了。
2、多线程的多路复用器
public class SocketMultiplexingThreadsv1 {
private ServerSocketChannel server = null;
private Selector selector = null;
int port = 9090;
public static void main(String[] args) {
SocketMultiplexingThreadsv1 service = new SocketMultiplexingThreadsv1();
service.start();
}
public void start() {
try {
//创建一个ServerSocket,即调用内核函数socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 4 其中4为返回的文件描述符
server = ServerSocketChannel.open();
//ServerSocket配置为非阻塞,即调用内核函数fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 其中0表示调用成功
server.configureBlocking(false);
//将这个文件描述符绑定到9090端口,即调用内核函数bind(4, {sa_family=AF_INET, sin_port=htons(9090)}),然后监听9090端口,即调用内核函数listen(4,50)
server.bind(new InetSocketAddress(port));
//创建selector。如果是poll模型的话,就是在jvm开辟一个内存空间,用于存放fds(从上面我们分析的可知,poll模型本身是不会去开辟内存空间存放fds的,所以是java代码封装了一下,帮它开辟了);如果是epoll模型的话,就是调用内核函数epoll_create(256) = 7,在内核中开辟一个内存空间,用于存放fds
selector = Selector.open();
//将ServerSocket的文件描述符放入空间,注册感兴趣的事件ACCEPT。如果是在poll模型,就是将文件描述符放入到jvm的内存空间。如果是在epoll模型,就是调用内核函数epoll_ctl(7, EPOLL_CTL_ADD, 4 其中文件描述符7指向epoll_create开辟的内存空间,文件描述符4指向ServerSocket。
server.register(selector, SelectionKey.OP_ACCEPT);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("服务器启动了。。。。。");
try {
while (true) {
Set<SelectionKey> keys = selector.keys();
System.out.println(keys.size()+" size");
//询问是否存在已就绪的fd,最大阻塞时间为500毫秒。如果是在poll模型,就是调用poll([{fd=5, events=POLLIN}, {fd=4, events=POLLIN}], 2, -1) = 1 其中文件描述符5是个管道,不是我们关注的重点,忽略它。poll也就是轮询每个文件描述符,向内核询问它们是否就绪。如果是epoll模型,就是调用epoll_wait(7, {{EPOLLIN, {u32=4, u64=2216749036554158084}}}, 4096, -1) = 1
while (selector.select(500) > 0) {
Set<SelectionKey> selectionKeys = selector.selectedKeys();
Iterator<SelectionKey> iter = selectionKeys.iterator();
while (iter.hasNext()) {
SelectionKey key = iter.next();
iter.remove();
if (key.isAcceptable()) {
//如果accept已经就绪,那么进行accept处理
acceptHandler(key);
} else if (key.isReadable()) {//如果可读,那么进行read处理
//从空间中移除掉fd,如果是在epoll模型,那么就是调用epoll_ctl(7, EPOLL_CTL_DEL, 8 。因为readHandler是在另一个线程中执行,此处的while循环可以立马调用selector.select,如果readHandler没有读完缓冲区的数据,那么这个key又会被挑选出来。被另一个线程重复读取数据
key.cancel();
readHandler(key);
}else if(key.isWritable()){//如果可写,那么进行write处理
//只要send-Q没满,key.isWritable就会返回true。如果这里不调用cancel的话,每次循环,必然会进这个if语句,也就必然执行writeHandler,也就是一直在new 线程
key.cancel();
writeHandler(key);
}
}
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
public void acceptHandler(SelectionKey key) {
try {
ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
//调用ServerSocket的accept分配一个文件描述符指向Socket。即调用内核函数accept(4, = 8 其中4为ServerSocket的文件描述符,8为Socket的文件描述符
SocketChannel client = ssc.accept();
//Socket配置为非阻塞,即调用内核函数fcntl(8, F_SETFL, O_RDWR|O_NONBLOCK) = 0
client.configureBlocking(false);
ByteBuffer buffer = ByteBuffer.allocate(8192);
//将Socket的文件描述符放入空间,注册感兴趣的事件READ。如果是在poll模型,就是将文件描述符放入到jvm的内存空间。如果是在epoll模型,就是调用内核函数epoll_ctl(7, EPOLL_CTL_ADD, 8 其中文件描述符7指向epoll_create开辟的内存空间,文件描述符8指向Socket
client.register(selector, SelectionKey.OP_READ, buffer);
System.out.println("-------------------------------------------");
System.out.println("新客户端:" + client.getRemoteAddress());
System.out.println("-------------------------------------------");
} catch (IOException e) {
e.printStackTrace();
}
}
public void readHandler(SelectionKey key) {
new Thread(() -> {
SocketChannel client = (SocketChannel) key.channel();
ByteBuffer buffer = (ByteBuffer) key.attachment();
buffer.clear();
int read = 0;
try {
while (true) {
//读取内核缓冲区recv-Q的数据到buffer,即调用内核函数read
read = client.read(buffer);
if (read > 0) {
//将Socket的文件描述符放入空间,注册感兴趣的事件WRITE
client.register(key.selector(), SelectionKey.OP_WRITE, buffer);
} else if (read == 0) {
break;
} else {
client.close();
break;
}
}
} catch (IOException e) {
e.printStackTrace();
}
}).start();
}
private void writeHandler(SelectionKey key) {
new Thread(() -> {
SocketChannel client =(SocketChannel) key.channel();
ByteBuffer buffer = (ByteBuffer) key.attachment();
buffer.flip();
while (buffer.hasRemaining()){
try {
//将buffer中的数据,写入到内核缓冲区send-Q,即调用内核函数write。
client.write(buffer);
} catch (IOException e) {
e.printStackTrace();
}
}
}).start();
}
}
多线程多路复用器的缺陷是:频繁地进行系统调用,调用epoll_ctl把fd加入空间(client.register())、调用epoll_ctl把fd从空间中移除(key.cancel())。
为了解决频繁进行epoll_ctl系统调用的问题,我们将文件描述符fds进行分组,每一组分配一个selector,同一组内的fds共享同一个线程:组内串行,组间并行。由于组内是串行的,所以不需要new一个线程来处理IO,也就不需要key.cancel,减少了系统调用。就算某个线程内的IO处理比较慢,也只影响这个线程负责的fds,不影响其他线程的fds。这就是单reactor多线程模型。单reactor是指:不管是关心accept事件的文件描述符,还是关心read事件的文件描述符,它们都有可能被分在同一个线程中。多线程是指:一个文件描述符分组(或者一个selector)的业务处理是多线程的,IO线程还是单线程。
进一步演进,我们把每一个关心accept事件的文件描述符,分到每一个单独的线程中。这样就是多reactor(主从reactor)多线程模型。多reactor是指,关心accept事件的fds在一个reactor(线程集合)中,其他的fds在另一个reactor(线程集合)中。多线程是指,一个文件描述符分组(或者一个selector)的业务线程是多线程,IO线程还是单线程。
3、单reactor多线程模型
单reactor多线程模型,如图所示。不要把reactor当成一个线程,应该把它当成一个线程集合(数组)。其中每一个线程,分配一个selector,一个selector又负责一组文件描述符,一组文件描述符fds中可以是关心accept事件的fd(图中的acceptor),也可以是关心read事件的fd(图中的read),也可以是关心write事件的fd(图中的send)。当一个IO线程中的IO事件处理完毕(比方说读取网络数据到内存中),就开启多线程(图中的线程池ThreadPool)处理业务请求
4、多reactor多线程模型
多reactor多线程模型,如图所示。它把每一个关心accept事件的文件描述符,分到每一个单独的线程中。这些线程集合就组成了图中的mainReactor。其他关心read事件、write事件的文件描述符依旧混在一起,分组到多个线程,这些线程集合就组成了图中的subReactor。当acceptor接收到新的连接时,它会生成一个关心read事件的fd,将它丢入到subReactor。
5、单reactor单线程模型
看完了演进版本,我们再来看看比较原始的reactor。单reactor单线程模型,它的最大不同是业务处理没有用线程池,而是与IO处理共用线程。可知它的效率是最低的。
6、netty reactor架构
netty的架构,就是多reactor多线程模型。其中Boss Group就是mainReactor,它是一个NioEventLoopGroup,里面有一组NioEventLoop。Boss Group下面的NioEventLoop就是acceptor,只关心accept事件。一旦有新的连接到达,会生成一个关心read事件的fd,将它丢入到Worker Group(对应图中的注册channel到Selector)。Worker Group也就是subReactor,里面也是一组NioEventLoop。Worker Group下面的NioEventLoop既是read又是write,毕竟它里面有一大堆关心read和write事件的文件描述符。它的step1 select,表示询问文件描述符是否有就绪的read/write事件;如果有,就在step2 processSelectedKeys进行IO处理;至于step3 runAllTasks,是处理任务队列和定时任务的,可以暂时忽略