Parsing over 100 million characters with NIO


I ran into a JSON file that consists of a single line. Its content is a JSON array of this form:

[{"CreateTime":"2021-04-27T10:02:38.243","UpdateTime":null,"SupplierCode":"xxx","SupplierName":"xxxx","HotelId":"xxxxx","RoomCode":"xxxx"},{"CreateTime":"2021-04-27T10:02:38.243","UpdateTime":null,"SupplierCode":"xxx","SupplierName":"xxxx","HotelId":"xxxxx","RoomCode":"xxxx"},{"CreateTime":"2021-04-27T10:02:38.243","UpdateTime":null,"SupplierCode":"xxx","SupplierName":"xxxx","HotelId":"xxxxx","RoomCode":"xxxx"}]

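Because the file has only one line, reading it line by line yields a character array of more than 100 million characters.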

 

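The line is over 140 million characters long, and in production there is no guarantee that it will not keep growing. If its length ever exceeds the maximum value of an int, reading it will fail. It will also fail if memory cannot hold the whole line at once.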
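To put rough numbers on the two limits mentioned above, here is a small back-of-the-envelope sketch. The class and method names are made up for illustration, and it assumes the line is held in a `char[]` (2 bytes per character, object header ignored). The usable array limit of a real JVM is slightly below `Integer.MAX_VALUE`, so treat the result as an upper bound rather than an exact figure.

```java
public class LineSizeEstimate {
    // Bytes needed by a char[] of `chars` UTF-16 code units (header ignored).
    static long charArrayBytes(long chars) {
        return chars * Character.BYTES;
    }

    // Whether a line of `chars` characters can be indexed by an int at all.
    // Assumption: Integer.MAX_VALUE is used as the (optimistic) upper bound.
    static boolean fitsInOneString(long chars) {
        return chars <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long current = 140_000_000L; // "over 140 million characters" from the text
        System.out.println(charArrayBytes(current) / (1024 * 1024) + " MiB for the char[] alone");
        System.out.println("fits in one String today: " + fitsInOneString(current));
        System.out.println("fits after growing 16x: " + fitsInOneString(current * 16));
    }
}
```

Even while the line still fits, the array alone takes hundreds of megabytes, which is why processing the file in small chunks is attractive.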

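Approach:

Allocate a buffer of an estimated size, read the file into it chunk by chunk, and parse the content by hand.

A single JSON object is currently about 500 characters long, so NIO can be used for the parsing. The code is as follows: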

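        import java.io.BufferedWriter;
        import java.io.IOException;
        import java.nio.ByteBuffer;
        import java.nio.CharBuffer;
        import java.nio.channels.FileChannel;
        import java.nio.charset.CharsetDecoder;
        import java.nio.charset.StandardCharsets;
        import java.nio.file.Files;
        import java.nio.file.Paths;
        import java.nio.file.StandardOpenOption;

        // The statements below belong inside a method that declares `throws IOException`.
        String path = "/Users/wuss/Downloads/20210428/Rooms_2021-04-28.json";
        String wPath = "/Users/wuss/Downloads/20210428/Rooms_2021-04-28-new.json";

        // Must be larger than the longest single JSON object (about 500 characters here);
        // otherwise the loop below can no longer make progress.
        int size = 2048;

        ByteBuffer byteBuffer = ByteBuffer.allocate(size);
        // Twice the byte buffer size, so it can hold the unparsed tail of the
        // previous round plus a full newly decoded chunk.
        CharBuffer charBuffer = CharBuffer.allocate(size << 1);

        // Decode explicitly as UTF-8 (ByteBuffer -> CharBuffer) so that Chinese text is not garbled.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();

        String string = "";
        boolean start = false;

        // try-with-resources closes both files even if an exception is thrown.
        try (FileChannel fileChannel = FileChannel.open(Paths.get(path), StandardOpenOption.READ);
             BufferedWriter bw = Files.newBufferedWriter(Paths.get(wPath), StandardCharsets.UTF_8)) {

            while (-1 != fileChannel.read(byteBuffer)) {
                byteBuffer.flip();
                // endOfInput = false: more bytes may follow. A multi-byte character cut off
                // at the end of the chunk stays in byteBuffer and is completed next round.
                decoder.decode(byteBuffer, charBuffer, false);

                charBuffer.flip();
                string = charBuffer.toString();
                int index;
                // Assumes "},{" only occurs between two top-level objects,
                // never inside a string value or a nested array.
                while ((index = string.indexOf("},{")) > 0) {
                    // The very first object is preceded by '[', which is skipped.
                    bw.write(string.substring(start ? 0 : 1, index + 1));
                    start = true;

                    bw.newLine();
                    string = string.substring(index + 2);
                }
                charBuffer.clear();
                // Keep the data that has not been parsed yet.
                charBuffer.put(string);

                // Keep any undecoded bytes for the next read.
                byteBuffer.compact();
            }
            // Whatever is left is the last object followed by the closing ']'.
            int end = string.lastIndexOf(']');
            if (end > 0) {
                bw.write(string.substring(start ? 0 : 1, end));
            }
        }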
        

 

Even when a single line is extremely large, the same approach can be used.
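The search for `},{` works as long as that sequence never occurs inside a string value or a nested array. As a hedged variation on the same chunk-by-chunk idea, the sketch below tracks brace depth and string state instead. It is not the code from this post: the class name `JsonArraySplitter` and the method `split` are invented for illustration, it reads through a plain `java.io.Reader` rather than a `FileChannel`, and it assumes the input is well-formed JSON.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class JsonArraySplitter {
    /**
     * Copies every top-level object of a JSON array from `in` to `out`,
     * one object per line, and returns how many objects were written.
     */
    static int split(Reader in, Writer out) throws IOException {
        char[] buf = new char[2048];   // fixed-size chunk, independent of line length
        int depth = 0;                 // current nesting depth of '{' ... '}'
        boolean inString = false;      // inside a JSON string literal?
        boolean escaped = false;       // previous char was a backslash inside a string?
        int count = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                if (depth > 0) {
                    out.write(c);      // inside an object: copy verbatim
                }
                if (inString) {
                    if (escaped) escaped = false;
                    else if (c == '\\') escaped = true;
                    else if (c == '"') inString = false;
                } else if (c == '"') {
                    inString = true;
                } else if (c == '{') {
                    if (depth++ == 0) out.write(c);   // opening brace of a top-level object
                } else if (c == '}') {
                    if (--depth == 0) {               // a top-level object just ended
                        out.write('\n');
                        count++;
                    }
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        int n = split(new StringReader("[{\"a\":\"x},{y\"},{\"b\":{\"c\":1}}]"), out);
        System.out.print(out);
        System.out.println(n + " objects");
    }
}
```

For a real file, the reader and writer could come from `Files.newBufferedReader` and `Files.newBufferedWriter` with `StandardCharsets.UTF_8`; the reader then takes care of multi-byte characters that straddle chunk boundaries.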

 
