Java: extracting fields at fixed byte positions from each line (String) when reading a file

Alex_Jwell

Category: java. Tags: byte, parsing file content

Article link: https://blog.csdn.net/weixin_43652442/article/details/103102420

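Requirement: while reading a file, get its content and extract the data at specified byte positions in each line. For example, the first 10 bytes of a line are the name, the next 8 bytes are the birthday, and the next 1 byte is the sex.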

Original plan: use FileReader and BufferedReader to read the file line by line, then use String.substring() to split each line into its fields according to the layout.

The code:

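    // imports: java.io.BufferedReader, java.io.File, java.io.FileReader
    File file = new File("D:/1.txt");
    // try-with-resources closes the reader even if parsing fails
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String tmpStr;
        while ((tmpStr = br.readLine()) != null) {
            // substring() takes char indexes, not byte offsets
            String name = tmpStr.substring(0, 10);
            String birthday = tmpStr.substring(10, 18);
            String sex = tmpStr.substring(18, 19);
            System.out.println(name + ";" + birthday + ";" + sex);
        }
    }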

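Problem: String.substring() does not work for this. Its indexes count chars, so a Chinese character counts as 1 position, whereas in the file's byte layout it takes more than one byte (2 bytes in GBK). As soon as a line contains Chinese characters, char positions and byte positions no longer line up.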
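To make the mismatch concrete, here is a minimal sketch that compares the char count of a string with its encoded byte count. The class name `CharVsByteDemo` and its helper methods are made up for illustration, and the sketch assumes the data is GBK-encoded, which is the charset the improved code uses.

```java
import java.nio.charset.Charset;

public class CharVsByteDemo {
    // Assumption: the file is GBK-encoded.
    static final Charset GBK = Charset.forName("GBK");

    /** Number of chars, which is the unit substring() indexes by. */
    static int charLength(String s) {
        return s.length();
    }

    /** Number of bytes the string occupies once encoded as GBK. */
    static int gbkByteLength(String s) {
        return s.getBytes(GBK).length;
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as Unicode escapes so the
        // example does not depend on the source file's encoding.
        String name = "\u5f20\u4e09";
        System.out.println("chars = " + charLength(name));    // chars = 2
        System.out.println("bytes = " + gbkByteLength(name)); // bytes = 4
    }
}
```

A 10-byte name field holding these two characters therefore spans only 8 chars once padding is included, so `substring(0, 10)` would run into the birthday field.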

Improved approach: convert each line to a byte array first, then extract the fields by byte position.

The code:

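    // imports: java.io.BufferedReader, java.io.File, java.io.FileInputStream,
    //          java.io.InputStreamReader, java.nio.charset.Charset
    File file = new File("D:/1.txt");
    Charset gbk = Charset.forName("GBK");
    // Decode the file as GBK explicitly (assumes the file is GBK-encoded);
    // FileReader would fall back to the platform default charset
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream(file), gbk))) {
        String tmpStr;
        while ((tmpStr = br.readLine()) != null) {
            // Re-encode the line so that offsets count bytes, not chars
            byte[] tmpByte = tmpStr.getBytes(gbk);
            String name = new String(tmpByte, 0, 10, gbk);
            String birthday = new String(tmpByte, 10, 8, gbk);
            String sex = new String(tmpByte, 18, 1, gbk);
            System.out.println(name + ";" + birthday + ";" + sex);
        }
    }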

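Method used: public String(byte[] bytes, int offset, int length, Charset charset). There is also an overload that takes a String charsetName instead of a Charset; it throws UnsupportedEncodingException if the name is not supported.

Parameters: bytes - the bytes to be decoded into characters; offset - the index of the first byte to decode; length - the number of bytes to decode; charset - the charset used to decode the bytes.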
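The byte-offset parsing can also be wrapped in a small helper and tried on an in-memory line, without needing a file. This is only a sketch under stated assumptions: the class `FixedWidthRecord` and its methods are hypothetical names, the layout is the 10/8/1-byte example from the requirement, the data is assumed to be GBK, and the trailing-space padding of the name field (removed with `trim()`) is an assumption about how the file is written.

```java
import java.nio.charset.Charset;

public class FixedWidthRecord {
    // Assumption: the data is GBK-encoded.
    static final Charset GBK = Charset.forName("GBK");

    /** Decodes length bytes starting at offset and strips padding spaces. */
    static String field(byte[] line, int offset, int length) {
        return new String(line, offset, length, GBK).trim();
    }

    /** Splits one line into {name, birthday, sex} by byte position. */
    static String[] parse(String line) {
        byte[] bytes = line.getBytes(GBK);
        if (bytes.length < 19) {
            throw new IllegalArgumentException(
                    "expected at least 19 bytes, got " + bytes.length);
        }
        return new String[] {
            field(bytes, 0, 10),  // name: bytes 0-9
            field(bytes, 10, 8),  // birthday: bytes 10-17
            field(bytes, 18, 1)   // sex: byte 18
        };
    }

    public static void main(String[] args) {
        // Name: 2 Chinese characters (4 GBK bytes) padded with 6 spaces
        // to fill the 10-byte field, then an 8-byte date and a 1-byte flag.
        String line = "\u5f20\u4e09      19900101M";
        String[] fields = parse(line);
        System.out.println(String.join(";", fields));
    }
}
```

The sample line is 19 bytes but only 17 chars long, so `substring(18, 19)` on it would throw StringIndexOutOfBoundsException, while the byte-based version returns all three fields. One limitation to keep in mind: if a field boundary falls in the middle of a multi-byte character, the decoded text for that field will contain replacement characters.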