MappedByteBuffer: Introduction and Detailed Walkthrough (Several Ways to Handle Large Files in Java)

傻鱼爱编程

Last modified: 2022-08-02 21:54:06

First published: 2021-10-23 13:24:02

Article link: https://blog.csdn.net/m0_57640408/article/details/120919806

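MappedByteBuffer is a subclass of ByteBuffer.
In the past we usually handled large files with buffered IO streams such as BufferedInputStream and BufferedOutputStream.
This article covers a Java NIO approach to large files based on MappedByteBuffer, which can give very high read and write performance.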

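Before getting started, a little background on memory:
Physical memory: the memory provided by the RAM modules installed in the machine.
Virtual memory: a memory-management technique of the operating system. It lets an application believe it owns one contiguous, complete address space, while in reality that space is usually split into many fragments of physical memory, with some parts temporarily stored on disk and swapped in when needed. Most operating systems use virtual memory today, for example "virtual memory" in the Windows family and "swap space" on Linux.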

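MappedByteBuffer is a direct buffer whose content is a memory-mapped region of a file; reading and writing files this way is called memory mapping. Access goes straight to the operating system's page cache, so there is no copying between the JVM heap and the operating system, which is why it is so efficient. It is mainly used for large files.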
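
To make the "direct buffer" point concrete, here is a minimal sketch that maps a small scratch file and asks the buffer whether it is direct. The class name `MappedBufferIsDirect`, the helper `isMappedBufferDirect`, and the temporary file are made up for illustration and are not part of the original article.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedBufferIsDirect {
    // Maps the whole file read-only and reports whether the resulting buffer is direct.
    static boolean isMappedBufferDirect(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            return mbb.isDirect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch file, used only for this demonstration
        Path tmp = Files.createTempFile("mbb-direct", ".txt");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello".getBytes(StandardCharsets.US_ASCII));
        System.out.println("direct = " + isMappedBufferDirect(tmp));
    }
}
```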

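Now for how to use it:
MappedByteBuffer has no public constructor (you cannot write new MappedByteBuffer()). Instead, we use the map method of FileChannel to map a file into a MappedByteBuffer: MappedByteBuffer map(FileChannel.MapMode mode, long position, long size). In effect, map projects the file's content onto a region of virtual memory, so we work on data in memory directly instead of issuing an IO call to the physical disk for every access, which is why it is efficient.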

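The mode parameter (of type FileChannel.MapMode) has three possible values:
1. MapMode.READ_ONLY (read-only)
2. MapMode.READ_WRITE (read/write)
3. MapMode.PRIVATE (private, copy-on-write: changes are visible only through this buffer and are not written back to the file)
long position and long size: the region of the file that starts at position and is size bytes long is mapped into memory.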
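
The parameters above can be sketched as follows. This is an illustrative example under assumptions, not the author's code: the class `MapModesDemo`, its two helpers, and the scratch file are hypothetical. The first helper maps only a sub-region of the file; the second modifies a PRIVATE mapping and shows that the file on disk is left unchanged.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MapModesDemo {
    // Maps [position, position + size) read-only and decodes it as ASCII text.
    static String readRegion(Path file, long position, int size) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_ONLY, position, size);
            byte[] out = new byte[size];
            mbb.get(out); // buffer index 0 corresponds to file offset `position`
            return new String(out, StandardCharsets.US_ASCII);
        }
    }

    // Changes a PRIVATE (copy-on-write) mapping, then returns what is actually in the file.
    static String writePrivately(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.PRIVATE, 0, fc.size());
            mbb.put(0, (byte) 'X'); // affects only this mapping's private copy
        }
        return new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mbb-modes", ".txt"); // hypothetical scratch file
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
        System.out.println(readRegion(tmp, 3, 4));  // prints 3456
        System.out.println(writePrivately(tmp));    // prints 0123456789
    }
}
```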

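MappedByteBuffer adds three methods on top of ByteBuffer:
1. force(): when the buffer was mapped in READ_WRITE mode, forces any changes made to the buffer's content to be written to the file
2. load(): loads the buffer's content into physical memory and returns a reference to the buffer
3. isLoaded(): returns true if the buffer's content is resident in physical memory, false otherwise (the result is only a hint)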
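
A short sketch of the three methods in use. The class `ForceLoadDemo`, the helper `writeAndForce`, and the scratch file are assumptions made for this example; the value printed by isLoaded() depends on the operating system, so no particular output is claimed for it.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ForceLoadDemo {
    // Writes `text` through a READ_WRITE mapping, forces it to the file,
    // and returns the file content as read back through ordinary file IO.
    static String writeAndForce(Path file, String text) throws IOException {
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Mapping past the current end of the file grows the file to fit
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
            mbb.load(); // ask the OS to bring the pages into physical memory
            System.out.println("isLoaded = " + mbb.isLoaded()); // a hint only
            mbb.put(data);
            mbb.force(); // flush the modified pages to the storage device
        }
        return new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mbb-force", ".txt"); // hypothetical scratch file
        tmp.toFile().deleteOnExit();
        System.out.println(writeAndForce(tmp, "abc"));
    }
}
```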

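If you only need to read, a channel obtained from FileInputStream is enough. To write to a mapped file, the channel must be open for both reading and writing, so use RandomAccessFile (in "rw" mode).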
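
The rule above can be checked with a small experiment. This is a sketch under assumptions: `ChannelModeDemo` and its helpers are hypothetical names. It shows that a FileInputStream channel accepts a READ_ONLY mapping but rejects READ_WRITE, while a RandomAccessFile opened with "rw" allows writing through the mapping.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;
import java.nio.file.Files;

class ChannelModeDemo {
    // Returns true if READ_WRITE mapping is rejected on a FileInputStream channel.
    static boolean readWriteRejectedOnInputStream(File file) throws IOException {
        try (FileInputStream fis = new FileInputStream(file);
             FileChannel fc = fis.getChannel()) {
            fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size()); // allowed: channel is readable
            try {
                fc.map(FileChannel.MapMode.READ_WRITE, 0, fc.size());
                return false;
            } catch (NonWritableChannelException e) {
                return true; // the channel was not opened for writing
            }
        }
    }

    // Writes one byte at offset 0 through a READ_WRITE mapping and reads it back.
    static byte writeFirstByte(File file, byte value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
             FileChannel fc = raf.getChannel()) {
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0, 1);
            mbb.put(0, value);
            return mbb.get(0);
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("mbb-channel", ".bin"); // hypothetical scratch file
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), new byte[]{1, 2, 3});
        System.out.println("rejected = " + readWriteRejectedOnInputStream(tmp));
        System.out.println("written = " + writeFirstByte(tmp, (byte) 42));
    }
}
```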

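The basic example below creates a 128 MB file. Reading it into memory in one go could cause an out-of-memory error, yet here access seems almost instantaneous. That is because only a small part of the file is actually brought into memory; the rest stays in the file on disk and is paged in on demand. This makes it easy to modify very large files (a single mapping can cover at most about 2 GB, that is, Integer.MAX_VALUE bytes; once a file exceeds roughly 1.5 GB it is worth considering chunked processing). Java relies on the operating system's file-mapping facility to get this performance. For small files, plain IO (FileInputStream, FileOutputStream) is good enough.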

Basic usage of MappedByteBuffer:

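import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedByteBuffer基本应用 {
    static int length = 0x8000000; // 128 MB: 0x8000000 is 134217728 in decimal, one byte per character written

    public static void main(String[] args) throws Exception {
        // RandomAccessFile in "rw" mode creates the file if needed and opens it for reading and writing
        try (RandomAccessFile raf = new RandomAccessFile("D:/TEST/test3.txt", "rw");
             FileChannel fc = raf.getChannel()) {
            // A channel is readable and writable only if the underlying file was opened that way
            MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0, length);
            // Write 128 MB of the character 'a'
            for (int i = 0; i < length; i++) {
                mbb.put((byte) 'a');
            }
            System.out.println("writing end");
            // Read 20 bytes from the middle of the file
            for (int i = length / 2; i < length / 2 + 20; i++) {
                System.out.print((char) mbb.get(i));
            }
        }
    }
}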

Performance comparison of MappedByteBuffer and IO:

package com.itheima.springboot_day01_2;

import java.io.*;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/**
 * Compares the performance of MappedByteBuffer with plain IO
 */
public class MappedByteBuffer与io效率对比 {
    static int length = 0x8000000; // 128 MB: 0x8000000 is 134217728 in decimal (bytes)

    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        // Plain stream IO
        testIO();
        // MappedByteBuffer
        testMappedByteBuffer();
        // FileChannel.transferTo
        testFileChannel();
        // FileChannel with a direct ByteBuffer
        testFileChannelByteBuffer();
        long end = System.currentTimeMillis();
        System.out.println("elapsed=" + (end - start) + "ms");
        /*
         * The timings below appear to come from running one method at a time;
         * with all four calls enabled, the printed value is the total for all of them.
         * testIO() printed: 6218ms
         * testMappedByteBuffer() printed: 2132ms
         * testFileChannel() printed: 703ms
         * testFileChannelByteBuffer() printed: 819ms
         */
    }

    /**
     * Copies a 0.9 GB file with plain stream IO.
     *
     * @throws IOException
     */
    private static void testIO() throws IOException {
        File sourceFile = new File("D:/TEST/testFile0.9G文件.zip");
        byte[] bytes = new byte[1024];  // performs about the same as the array created below
//        byte[] bytes = new byte[(int) sourceFile.length()];
        try (FileInputStream fis = new FileInputStream(sourceFile);
             FileOutputStream fos = new FileOutputStream("D:/TEST/0.9G文件.zip")) {
            int len = -1;
            while ((len = fis.read(bytes)) != -1) {
                fos.write(bytes, 0, len); // write the bytes just read
            }
        }
    }

    /**
     * Copies a 0.9 GB file by writing into a MappedByteBuffer.
     *
     * @throws IOException
     */
    private static void testMappedByteBuffer() throws IOException {
        File sourceFile = new File("D:/TEST/testFile0.9G文件.zip");
//        byte[] bytes = new byte[1024];  // performs about the same as the array created below
        // The (int) cast overflows for files larger than Integer.MAX_VALUE bytes (about 2 GB)
        byte[] bytes = new byte[(int) sourceFile.length()];
        try (RandomAccessFile ra_read = new RandomAccessFile(sourceFile, "r");
             RandomAccessFile ra_write = new RandomAccessFile("D:/TEST/0.9G文件.zip", "rw");
             FileChannel fc = ra_write.getChannel()) {
            MappedByteBuffer map = fc.map(FileChannel.MapMode.READ_WRITE, 0, sourceFile.length());
            int len = -1;
            while ((len = ra_read.read(bytes)) != -1) {
                map.put(bytes, 0, len); // write the data into the mapped region
            }
        }
    }

    /**
     * Copies a 0.9 GB file with FileChannel.transferTo.
     *
     * @throws IOException
     */
    private static void testFileChannel() throws IOException {
        File sourceFile = new File("D:/TEST/testFile0.9G文件.zip");
        try (FileInputStream fis = new FileInputStream(sourceFile);
             FileChannel fisChannel = fis.getChannel();
             FileOutputStream fos = new FileOutputStream("D:/TEST/0.9G文件.zip");
             FileChannel fosChannel = fos.getChannel()) {
            long size = fisChannel.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until everything is copied
            while (position < size) {
                position += fisChannel.transferTo(position, size - position, fosChannel);
            }
        }
    }

    /**
     * Copies a 0.9 GB file with FileChannel and a direct ByteBuffer.
     *
     * @throws IOException
     */
    private static void testFileChannelByteBuffer() throws IOException {
        try (FileChannel from = new FileInputStream("D:/TEST/testFile0.9G文件.zip").getChannel();
             FileChannel to = new FileOutputStream("D:/TEST/0.9G文件.zip").getChannel()) {
            ByteBuffer bb = ByteBuffer.allocateDirect(1024 * 1024);
            while (true) {
                int len = from.read(bb);
                if (len == -1) {
                    break;
                }
                bb.flip();  // limit = current position, position = 0: ready to be read from
                while (bb.hasRemaining()) {
                    to.write(bb); // a single write may not drain the whole buffer
                }
                bb.clear();  // position = 0, limit = capacity: ready to be filled again
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}

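Calling testIO() printed: elapsed=6218ms
Calling testMappedByteBuffer() printed: elapsed=2132ms
Summary: in this test, handling a large file through a MappedByteBuffer obtained from FileChannel was clearly faster than plain IO streams. If the file is too large, the code above fails with an error (Exception in thread "main" java.lang.NegativeArraySizeException); when that happens, the file has to be processed in slices or chunks.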
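
The NegativeArraySizeException comes from casting a long file length to int, as in new byte[(int) sourceFile.length()]. The arithmetic can be reproduced without any file at all; the class `NegativeArraySizeDemo` and its helpers below are hypothetical names used only for this illustration.

```java
class NegativeArraySizeDemo {
    // Mimics the cast used when sizing the byte array from a file length.
    static int truncatedLength(long fileLength) {
        return (int) fileLength;
    }

    // Returns true if allocating an array of the truncated length fails
    // with NegativeArraySizeException.
    static boolean allocationFails(long fileLength) {
        try {
            byte[] bytes = new byte[truncatedLength(fileLength)];
            return bytes == null; // never true; allocation succeeded
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024; // 3221225472 bytes, larger than Integer.MAX_VALUE
        System.out.println(truncatedLength(threeGiB)); // prints -1073741824
        System.out.println(allocationFails(threeGiB)); // prints true
    }
}
```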

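Note: split the work into chunks first, then use MappedByteBuffer to write each chunk. A MappedByteBuffer is simply a mapping of a region of a file; it does not split a large file into chunks for you.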
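
One possible way to do the chunking is to map one region at a time, using the position and size parameters of map. The sketch below is an assumption about how this could look, not the author's implementation: `ChunkedMappedCopy`, its `copy` helper, and the tiny chunk size in main are chosen only to keep the demonstration small. For real files a chunk size well below 2 GB, for example a few hundred megabytes, would be more typical.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChunkedMappedCopy {
    // Copies src to dst by mapping one chunk at a time.
    // chunkSize must be positive and no larger than Integer.MAX_VALUE.
    static void copy(Path src, Path dst, long chunkSize) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.CREATE,
                     StandardOpenOption.READ, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            for (long pos = 0; pos < size; pos += chunkSize) {
                long len = Math.min(chunkSize, size - pos); // last chunk may be shorter
                MappedByteBuffer from = in.map(FileChannel.MapMode.READ_ONLY, pos, len);
                MappedByteBuffer to = out.map(FileChannel.MapMode.READ_WRITE, pos, len);
                to.put(from); // copy this chunk from one mapping to the other
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("mbb-src", ".txt"); // hypothetical scratch files
        Path dst = Files.createTempFile("mbb-dst", ".txt");
        src.toFile().deleteOnExit();
        dst.toFile().deleteOnExit();
        Files.write(src, "0123456789".getBytes(StandardCharsets.US_ASCII));
        copy(src, dst, 4); // three chunks: 4 + 4 + 2 bytes
        System.out.println(new String(Files.readAllBytes(dst), StandardCharsets.US_ASCII));
    }
}
```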