Pitfalls when extracting CSV and Excel data in Java


qq_27985045


Tags: java excel csv

Article link: https://blog.csdn.net/qq_27985045/article/details/100607458


1. Environment: IntelliJ IDEA, JDK 1.8

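2. Do not parse CSV files with split(","). A CSV file does not necessarily use a comma as its delimiter, and fields extracted this way may still carry their surrounding double quotes (""). See the CSV format definition for details.

To parse CSV files, I recommend the opencsv library, which is available from the Maven repository. It worked in my own tests.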
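To make the split(",") problem concrete, here is a minimal sketch that uses only the JDK. The class name, the helper naiveSplit, and the sample line are made up for illustration. It shows a quoted field that contains a comma being cut in two, with the quote characters left in the output. A quote-aware parser such as opencsv would be expected to return three clean fields for the same line.

```java
import java.util.Arrays;

class SplitPitfallDemo {
    // Hypothetical helper: the naive approach the text warns against.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // Three logical fields; the second one is quoted and contains a comma.
        String line = "1,\"Smith, John\",42";
        String[] fields = naiveSplit(line);

        // The quoted field is split at its inner comma, so four pieces come back,
        // and the pieces still contain the double-quote characters.
        System.out.println(fields.length);
        System.out.println(Arrays.toString(fields));
    }
}
```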

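3. Excel content can be read with the Apache POI library, though I hit a few pitfalls along the way. When reading a row, an empty cell may be skipped, so its key is lost: if a row has 10 columns and one in the middle is empty, the values after it shift over and end up misaligned with their columns. To read empty cells as blanks, use

Cell cell = row.getCell(colIdx, Row.MissingCellPolicy.CREATE_NULL_AS_BLANK);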
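The misalignment can be reproduced without POI. The sketch below is only a model of the situation, not POI code: a row is represented as a map from column index to value, and the class and method names are invented for the example. Reading only the cells that exist shifts later values to the left, while reading by column index and substituting a blank keeps every value in its column, which is the effect the CREATE_NULL_AS_BLANK policy gives with POI.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SparseRowDemo {
    // Reads only the cells that exist: a missing cell shifts later values left.
    static List<String> readDefinedCellsOnly(Map<Integer, String> row) {
        return new ArrayList<>(row.values());
    }

    // Reads by column index and substitutes a blank for every missing cell.
    static List<String> readByColumnIndex(Map<Integer, String> row, int columnCount) {
        List<String> values = new ArrayList<>();
        for (int col = 0; col < columnCount; col++) {
            values.add(row.getOrDefault(col, ""));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<Integer, String> row = new TreeMap<>();
        row.put(0, "id-1");
        row.put(2, "Shanghai"); // column 1 was never filled in

        System.out.println(readDefinedCellsOnly(row));  // two values, second one misaligned
        System.out.println(readByColumnIndex(row, 3));  // three values, blank in the middle
    }
}
```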

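POI returns dates as numbers, so each numeric cell has to be checked, as follows:

if (isCellDateFormatted(cell)) {
    rowData.add(fmt.format(cell.getDateCellValue()));
} else {
    // For plain numbers, use the method below, which fixes the precision loss when reading Excel floating-point values
    rowData.add(getRealStringValueOfDouble(cell.getNumericCellValue()));
}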

// The code below is adapted from POI, with added handling for date formats that contain 年/月/日 (year/month/day) and would otherwise not be recognized as dates

public static boolean isCellDateFormatted(Cell cell) {
    if (cell == null) {
        return false;
    }
    double d = cell.getNumericCellValue();
    if (!DateUtil.isValidExcelDate(d)) {
        return false;
    }
    CellStyle style = cell.getCellStyle();
    if (style == null) {
        return false;
    }
    int i = style.getDataFormat();
    // Format indexes 28 and 31 are treated as dates outright
    if (i == 28 || i == 31) {
        return true;
    }
    String f = style.getDataFormatString();
    if (f == null) {
        return false;
    }
    // Strip quotes and the Chinese date/time unit characters so that POI's own check can recognize the pattern
    f = f.replaceAll("[\"']", "").replaceAll("[年月日时分秒毫微]", "");
    return DateUtil.isADateFormat(i, f);
}

// The method below works around the precision loss in the numbers returned when reading numeric cells
// Requires: import java.util.Locale;

private static String getRealStringValueOfDouble(Double d) {
    String doubleStr = d.toString();
    int indexOfE = doubleStr.indexOf('E');
    if (indexOfE >= 0) {
        // Scientific notation such as "1.2345E10": expand it to plain decimal notation
        int indexOfPoint = doubleStr.indexOf('.');
        // Number of digits after the decimal point in the mantissa
        int fractionDigits = indexOfE - indexOfPoint - 1;
        int pow = Integer.parseInt(doubleStr.substring(indexOfE + 1));
        // Decimal places that remain once the exponent has been applied
        int scale = Math.max(fractionDigits - pow, 0);
        doubleStr = String.format(Locale.ROOT, "%." + scale + "f", d);
    } else if (doubleStr.endsWith(".0")) {
        // Whole number such as "12.0": drop the trailing ".0"
        doubleStr = doubleStr.substring(0, doubleStr.length() - 2);
    }
    return doubleStr;
}
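As an alternative to manipulating the string by hand, the same kind of conversion can be sketched with java.math.BigDecimal. This is not the approach used in this post, only a possible substitute; the class and method names are invented for the example. BigDecimal.valueOf goes through Double.toString, so the shortest decimal representation of the double is kept, and toPlainString never produces an exponent.

```java
import java.math.BigDecimal;

class PlainNumberDemo {
    // Converts a double to plain decimal text without an exponent or a trailing ".0".
    // Note: NaN and infinite values are not supported and cause a NumberFormatException.
    static String toPlainString(double d) {
        return BigDecimal.valueOf(d).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(toPlainString(1.0E10));          // large whole number
        System.out.println(toPlainString(12.0));            // whole number stored as a double
        System.out.println(toPlainString(1.2345678912E8));  // fraction kept after expansion
    }
}
```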

4. Our production scenario requires the LinkedHashMap data structure. When data is read into it, entries with the same key overwrite each other:

map.put("key","value1");
map.put("key","value2");

value2 replaces value1.
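The overwrite is easy to miss because put does not fail. The sketch below shows that put returns the value it replaced, which can be used to detect the collision, and adds one possible way to keep both values. The helper putUnique and its "_2", "_3" suffix scheme are assumptions made for this example, not something taken from the production code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DuplicateKeyDemo {
    // Hypothetical helper: keeps both values by suffixing a repeated key with _2, _3, ...
    static void putUnique(Map<String, String> map, String key, String value) {
        String candidate = key;
        int n = 2;
        while (map.containsKey(candidate)) {
            candidate = key + "_" + n++;
        }
        map.put(candidate, value);
    }

    public static void main(String[] args) {
        Map<String, String> plain = new LinkedHashMap<>();
        plain.put("key", "value1");
        // put returns the previous value, so a non-null result signals an overwrite.
        String previous = plain.put("key", "value2");
        System.out.println("replaced: " + previous + ", map now: " + plain);

        Map<String, String> unique = new LinkedHashMap<>();
        putUnique(unique, "key", "value1");
        putUnique(unique, "key", "value2");
        System.out.println(unique); // both values survive, in insertion order
    }
}
```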

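5. In our scenario the data is persisted to MongoDB, which has a pitfall of its own: in older versions, a key cannot contain a dot (.), so such keys need special handling.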
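One possible form of that special handling is to rewrite each key before it is stored. In the sketch below, the helper names and the choice of the full-width dot (U+FF0E) as the replacement character are assumptions for illustration; any character that never occurs in the real keys would serve. The mapping can be reversed on read as long as the original keys do not already contain the replacement character.

```java
class MongoKeyDemo {
    // Hypothetical replacement character: the full-width dot, U+FF0E.
    private static final char SAFE_DOT = '\uFF0E';

    // Replaces every '.' so the key is acceptable as a field name.
    static String sanitizeKey(String key) {
        return key.replace('.', SAFE_DOT);
    }

    // Restores the original key when reading the document back.
    static String restoreKey(String key) {
        return key.replace(SAFE_DOT, '.');
    }

    public static void main(String[] args) {
        String original = "price.usd";
        String stored = sanitizeKey(original);
        System.out.println(stored.indexOf('.') < 0);              // no ASCII dot remains
        System.out.println(restoreKey(stored).equals(original));  // round trip succeeds
    }
}
```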

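6. The data is consumed by a Node.js service, which uses Promise.mapSeries instead of Promise.map so that records are inserted one at a time, preventing the data corruption caused by concurrent inserts.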
