mysql与excel数据互转小需求

背景:将一个excel文档中几万条业务数据填充完整,所缺失的数据来自线上mysql,而我们恰好有线上备库相关表的查询权限。这并不是业务需求,只是帮业务人员一个小忙。


方法很简单,先从excel中读数据,再从mysql中读数据,再写入excel文档,没有技术难点。带着急速完成显摆一下的目的,搞清excel文档所缺失的数据来自哪张表哪个字段之后,立马打开intellij。


关键是完成的速度嘛,不管美观。读写xls,以前用过jxl,于是首先拿过来jxl用。


<dependency>
    <groupId>net.sourceforge.jexcelapi</groupId>
    <artifactId>jxl</artifactId>
    <version>2.6.12</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.27</version>
    <scope>compile</scope>
</dependency>

try {
    Workbook book = Workbook.getWorkbook(FetchChannelData2XLSXDocument.class.getResourceAsStream(XLS));
    Sheet sheet = book.getSheet(0);
    for (int i = 1; i < sheet.getRows(); i++) {
        Cell cell = sheet.getCell(0, i);
        String contents = cell.getContents();
        if (contents != null && contents.length() > 0) {
            keys.add(contents);
        }
    }
    book.close();
    System.out.println("读取xls成功:数据行数:" + (sheet.getRows() - 1) + ";map size:" + result.size());
    System.out.println("第一条:" + sheet.getCell(0, 1).getContents());
    System.out.println("第末条:" + sheet.getCell(0, sheet.getRows() - 1));

} catch (Exception e) {
    throw new RuntimeException("读取xls失败", e);
}



private static Connection getConnection() {
    try {
        Connection connection = DriverManager.getConnection(URL, USER, PASSWORD);
        log("***************************获取连接成功***********");
        return connection;
    } catch (SQLException e) {
        log("***************************获取连接失败:", e.getMessage());
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e1) {
        }
        return getConnection();
    }
}

private static String query(Connection connection, String param) throws SQLException {
    PreparedStatement preparedStatement = connection.prepareStatement(SQL);
    preparedStatement.setString(1, param);
    ResultSet resultSet = preparedStatement.executeQuery();
    if (resultSet.next()) {
        return resultSet.getString(1);
    } else {
        return "";
    }
}

Connection connection = getConnection();
int nullCount = 0;
int emptyCount = 0;
for (int i = 0; i < keys.size(); i++) {
    String key = keys.get(i);
    String value = "null";
    try {
        value = query(connection, key);
    } catch (SQLException e) {
        log("查询失败", key, e.getMessage());
        connection = getConnection();
    }
    if ("null".equals(value)) {
        nullCount++;
    }
    if ("".equals(value)) {
        emptyCount++;
    }
    result.put(key, value);
    int count = QUERY_COUNT.incrementAndGet();
    if ((count % 1000) == 0) {
        log("正常加载过程", count);
    }
}

就这样写了起来,写完一运行,有问题。

抛错

Caused by: jxl.read.biff.BiffException: Unable to recognize OLE stream
at jxl.read.biff.CompoundFile.<init>(CompoundFile.java:116)
at jxl.read.biff.File.<init>(File.java:127)
at jxl.Workbook.getWorkbook(Workbook.java:268)
at jxl.Workbook.getWorkbook(Workbook.java:253)

原来是jxl这个版本支持xls 2003,一看没有与时俱进果断放弃了,网上找了apache poi,原来这个是专业的,不仅xls,ms-office的很多文档都支持,后边文档介绍下这个。


首先依赖:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.10-FINAL</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.10-FINAL</version>
</dependency>
读xls

try {
    XSSFWorkbook workbook2007 = new XSSFWorkbook(FetchChannelData2XLSXDocument.class.getResourceAsStream(XLS));
    XSSFSheet sheet = workbook2007.getSheetAt(0);// 取出第一个工作表,索引是0
    // 开始循环遍历行,表头不处理,从1开始
    for (int i = 1; i <= sheet.getLastRowNum(); i++) {
        XSSFCell cell = sheet.getRow(i).getCell(0);// 获取单元格对象
        String contents = cell.getStringCellValue();
        if (contents != null && contents.length() > 0) {
            keys.add(contents);
        }
    }
    log("读取xls成功:数据行数:" + (sheet.getLastRowNum()) + ";map size:" + result.size());
    log("第一条:" + sheet.getRow(1).getCell(0).getStringCellValue());
    log("第末条:" + sheet.getRow(sheet.getLastRowNum()).getCell(0).getStringCellValue());
} catch (Exception e) {
    throw new RuntimeException("读取xls失败", e);
}
写xls

try {
    XSSFSheet sheet = workbook2007.getSheetAt(0);// 取出第一个工作表,索引是0
    int count = 0;
    int writeCount = 0;
    // 开始循环遍历行,表头不处理,从1开始
    for (int i = 1; i <= sheet.getLastRowNum(); i++) {
        XSSFCell write2cell = sheet.getRow(i).createCell(17);// 获取单元格对象
        String key = sheet.getRow(i).getCell(0).getStringCellValue();
        if (key != null && key.length() > 0) {
            count++;
            String value = result.get(key);
            if (value == null || "null".equals(value) || "".equals(value)) {
                continue;
            }
            if (CHANNEL_NAMES.containsKey(value)) {
                value = CHANNEL_NAMES.get(value);
            }
            write2cell.setCellValue(value);
            writeCount++;
        }
    }
    log("写入xls" + "总条数" + count + "有值写入" + writeCount);
    
    File file = new File(NEW_XLS_FILE_NAME);
    workbook2007.write(new FileOutputStream(file));
    log("保存成功:文件路径:", file.getAbsolutePath());

} catch (Exception e) {
    throw new RuntimeException("写入xls失败", e);
}

ok了,跑起来了。另一个问题,10万条左右数据,一条一条查询也太慢了,等得受不了,加个多线程吧。

int times = keys.size() / 10000 + 1;
final CountDownLatch latch = new CountDownLatch(times);
for (int i = 0; i < times; i++) {
    final int finalI = i;
    executor.execute(new Runnable() {
        @Override
        public void run() {
            Connection connection = getConnection();
            long startTime = System.currentTimeMillis();
            loadDataFromDB(connection, finalI * 10000, 10000);
            log(" cost", (System.currentTimeMillis() - startTime));
            latch.countDown();
        }
    });
}
try {
    latch.await();
} catch (InterruptedException e) {
    e.printStackTrace();
}
这样快多了,同事的需求完成了,前后也花了两三个小时,关键是apache poi的入手使用碰到了问题。

1.依赖jar包,不仅仅是poi还有,poi-ooxml。

poi-version-yyyymmdd.jar
用于操作.xls文件;依赖于commons-logging, commons-codec, log4j;


poi-scratchpad-version-yyyymmdd.jar
用于操作.ppt、.doc、.vsd、.pub、.msg文件;依赖于poi;
 
poi-ooxml-version-yyyymmdd.jar、poi-ooxml-schemas-version-yyyymmdd.jar
用于操作.xlsx、.pptx、docx文件;依赖于poi, dom4j,xmlbeans, stax-api-1.0.1;操作Excel主要是指ss包、xssf包;


2.改完xls的内容,要写回文档,才能有效。

workbook2007.write(new FileOutputStream(file));

apache poi是什么?怎么用?有现成的文档,贴出来,自己就不写了。

1. http://blog.sina.com.cn/s/blog_667ac0360102eaf8.html

2. http://poi.apache.org/

3. http://www.yiibai.com/apache_poi/



评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值