Parsing more than 100 million characters with NIO

Article link: https://blog.csdn.net/zy1404/article/details/116305729

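This article describes a strategy for processing a single-line JSON array of more than 100 million characters. The file is read and decoded with NIO's ByteBuffer and CharBuffer, which avoids both running out of memory and exceeding the int limit on string length. With a buffer of an estimated size, the records are parsed one at a time and each is written out as its own JSON line, so even very large files remain processable.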


I ran into a JSON file that consists of a single line whose content is a JSON array, like this:

[{"CreateTime":"2021-04-27T10:02:38.243","UpdateTime":null,"SupplierCode":"xxx","SupplierName":"xxxx","HotelId":"xxxxx","RoomCode":"xxxx"},{"CreateTime":"2021-04-27T10:02:38.243","UpdateTime":null,"SupplierCode":"xxx","SupplierName":"xxxx","HotelId":"xxxxx","RoomCode":"xxxx"},{"CreateTime":"2021-04-27T10:02:38.243","UpdateTime":null,"SupplierCode":"xxx","SupplierName":"xxxx","HotelId":"xxxxx","RoomCode":"xxxx"}]

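Because the file has only one line, reading it line by line produces a character array of more than 100 million characters.

The file already holds more than 140 million characters, and in production there is no guarantee that it will not keep growing. If the string length exceeds the maximum value of int, the read will fail.

The read will also fail if a single line does not fit in memory.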
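To put these two limits into numbers, here is a rough back-of-the-envelope sketch. The class and method names are made up for illustration, and the 2-bytes-per-character figure assumes the line is held as a `char[]` (UTF-16); the real peak is usually higher, because a line reader also grows intermediate buffers while it accumulates the line.

```java
final class LineLimits {
    /** Bytes needed to hold the given number of chars in a char[] (2 bytes per char). */
    static long utf16Bytes(long chars) {
        return chars * 2;
    }

    /** How many times the line could grow before its length no longer fits in an int. */
    static long growthHeadroom(long chars) {
        return Integer.MAX_VALUE / chars;
    }

    public static void main(String[] args) {
        long chars = 140_000_000L; // roughly the size mentioned above
        System.out.println("char[] bytes: " + utf16Bytes(chars));
        System.out.println("growth headroom: " + growthHeadroom(chars) + "x");
    }
}
```

For 140 million characters this gives 280,000,000 bytes for the character array alone, and a growth headroom of only about 15 times before the int limit is reached.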

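Approach:

Allocate a buffer of an estimated size, read the file into it chunk by chunk, and parse the records by hand.

A single JSON object is currently about 500 characters long, so NIO can be used for the parsing. The code is as follows: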

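        import java.io.BufferedWriter;
        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.CharBuffer;
        import java.nio.channels.FileChannel;
        import java.nio.charset.CharsetDecoder;
        import java.nio.charset.StandardCharsets;
        import java.nio.file.Files;
        import java.nio.file.Paths;
        import java.nio.file.StandardOpenOption;

        public class JsonArrayToLines {
            public static void main(String[] args) throws IOException {
                String path = "/Users/wuss/Downloads/20210428/Rooms_2021-04-28.json";
                String wPath = "/Users/wuss/Downloads/20210428/Rooms_2021-04-28-new.json";

                int size = 2048;
                ByteBuffer byteBuffer = ByteBuffer.allocate(size);
                // Twice the byte buffer size: room for the unparsed tail of the
                // previous round plus a freshly decoded chunk. This assumes one
                // record (about 500 characters here) is shorter than `size`.
                CharBuffer charBuffer = CharBuffer.allocate(size << 1);

                // Decode explicitly as UTF-8 so that multi-byte characters
                // (Chinese, for example) are not garbled.
                CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();

                String string = "";
                boolean start = false; // true once the first record has been written

                try (FileChannel fileChannel = FileChannel.open(Paths.get(path), StandardOpenOption.READ);
                     BufferedWriter bw = Files.newBufferedWriter(Paths.get(wPath), StandardCharsets.UTF_8)) {

                    while (-1 != fileChannel.read(byteBuffer)) {
                        byteBuffer.flip();
                        // endOfInput stays false: a short read does not prove EOF, and an
                        // incomplete multi-byte sequence must remain in byteBuffer.
                        decoder.decode(byteBuffer, charBuffer, false);

                        charBuffer.flip();
                        string = charBuffer.toString();

                        // Records are separated by "},{"; this assumes that sequence
                        // never occurs inside a string value.
                        int index;
                        while ((index = string.indexOf("},{")) > 0) {
                            // The very first record is preceded by the array's '['.
                            bw.write(string.substring(start ? 0 : 1, index + 1));
                            start = true;
                            bw.newLine();
                            string = string.substring(index + 2);
                        }

                        // Keep the unparsed tail for the next round.
                        charBuffer.clear();
                        charBuffer.put(string);

                        // Keep any undecoded bytes at the front of the buffer.
                        byteBuffer.compact();
                    }

                    // Last record: drop the closing ']' (and the '[' if it is the only record).
                    int end = string.lastIndexOf(']');
                    if (end > 0) {
                        bw.write(string.substring(start ? 0 : 1, end));
                    }
                }
            }
        }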
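
The `byteBuffer.compact()` call in the loop matters because a fixed-size read can end in the middle of a multi-byte UTF-8 character. The following self-contained sketch illustrates this on an in-memory byte array instead of a file; the class name, method name, and sample text are illustrative only, and the chunk size is assumed to be at least 4 bytes (the longest UTF-8 sequence).

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

final class SplitCharDemo {
    /** Decodes data in chunks of chunkSize bytes, carrying incomplete sequences over. */
    static String decodeInChunks(byte[] data, int chunkSize) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer in = ByteBuffer.allocate(chunkSize);
        CharBuffer out = CharBuffer.allocate(chunkSize);
        StringBuilder sb = new StringBuilder();
        int offset = 0;
        while (offset < data.length) {
            // Fill whatever space is left after the carried-over bytes.
            int n = Math.min(in.remaining(), data.length - offset);
            in.put(data, offset, n);
            offset += n;

            in.flip();
            // Complete characters are decoded; a trailing partial sequence stays in `in`.
            decoder.decode(in, out, false);
            out.flip();
            sb.append(out.toString());
            out.clear();

            // Move the undecoded bytes to the front so the next chunk completes them.
            in.compact();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Four Chinese characters (3 bytes each in UTF-8) around an ASCII 'A'.
        String text = "\u9152\u5e97A\u623f\u95f4";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeInChunks(bytes, 4).equals(text));
    }
}
```

With a 4-byte chunk, the first chunk ends one byte into the second character. Without `compact()`, that byte would be lost or overwritten and the character could not be decoded.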

The same approach can be borrowed whenever a single line of data is extremely large.
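
One limitation of searching for the literal `},{` is that the same three characters could also appear inside a string value. A possible variant, sketched below under the assumption that every array element is a JSON object, tracks brace depth and string state character by character instead. The class and method names are hypothetical, and this is an alternative technique, not the code used above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

final class JsonArraySplitter {
    /** Copies each top-level object of a JSON array to out, one per line; returns the count. */
    static int split(Reader in, Writer out) throws IOException {
        char[] buf = new char[2048];
        int depth = 0;            // current nesting depth of { }
        boolean inString = false; // currently inside a "..." literal
        boolean escaped = false;  // previous char was a backslash inside a string
        int count = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                if (depth == 0 && c != '{') {
                    continue; // skip '[', ',', ']' and whitespace between objects
                }
                out.write(c);
                if (inString) {
                    if (escaped) {
                        escaped = false;
                    } else if (c == '\\') {
                        escaped = true;
                    } else if (c == '"') {
                        inString = false;
                    }
                } else if (c == '"') {
                    inString = true;
                } else if (c == '{') {
                    depth++;
                } else if (c == '}') {
                    depth--;
                    if (depth == 0) { // a top-level object just ended
                        out.write('\n');
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        // The first object contains "},{" inside a string value.
        int count = split(new StringReader("[{\"a\":\"x},{y\"},{\"b\":{\"c\":1}}]"), out);
        System.out.print(out);
        System.out.println(count + " objects");
    }
}
```

For a real file, the `Reader` and `Writer` could come from `Files.newBufferedReader` and `Files.newBufferedWriter` with `StandardCharsets.UTF_8`. Memory use then depends on the buffer size, not on the length of the line.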