Exporting Large Query Results as Text

Latest recommended article published 2021-06-21 11:53:21

zhongweij


import java.io.BufferedOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;
import java.util.zip.CheckedOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;
import javax.servlet.ServletOutputStream;
import javax.servlet.http.HttpServletResponse;

// The fragment below belongs inside a request-handling method. request, export,
// paraMap, startIndex, endIndex and doExport(...) come from the surrounding application.
ZipOutputStream zipout = null;
ServletOutputStream outStream = null;

HttpServletResponse response = request.getResponse();
try {
    // The body is a zip archive for both export types, so declare it as one.
    response.setContentType("application/zip");
    response.setHeader("Content-Disposition", "attachment; filename=Query.zip");
    outStream = response.getOutputStream();

    // Compress straight into the response so that no intermediate file is created.
    CheckedOutputStream ch = new CheckedOutputStream(outStream, new CRC32());
    zipout = new ZipOutputStream(new BufferedOutputStream(ch));

    // Name of the single file inside the archive.
    if (export.equals("txt")) {
        zipout.putNextEntry(new ZipEntry("Query.txt"));
    } else {
        zipout.putNextEntry(new ZipEntry("Query.xls"));
    }

    int pageSize = Math.min(endIndex, 100000);
    if (endIndex == 1) {
        endIndex = 2;
    }
    // For large exports, query in batches of at most 100,000 records.
    for (int i = startIndex; i < endIndex; i += pageSize) {
        paraMap.put("startIndex", i);
        paraMap.put("endIndex", i + pageSize - 1);
        doExport(paraMap, zipout); // writes the rows queried for this batch straight to zipout
        zipout.flush();
        outStream.flush(); // after each batch, push the response buffer to the client
    }
    zipout.closeEntry();
} catch (Exception e) {
    // Reached when a query fails or the client has gone away; the loop stops here.
    e.printStackTrace();
} finally {
    // Closing the zip stream also closes the response stream it wraps.
    if (zipout != null) {
        try {
            zipout.close();
        } catch (IOException ignored) {
            // Nothing more can be done once the connection is gone.
        }
    }
}
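
The paging arithmetic in the loop above can be looked at in isolation. The following is only a sketch: BatchRanges and its split method are hypothetical names, and it assumes that both bounds are inclusive row numbers and that the last batch should be clamped to the final row, which the original loop does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits an inclusive row range into fixed-size batches. */
final class BatchRanges {

    /** One batch; both bounds are inclusive. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return start + ".." + end;
        }
    }

    static List<Range> split(int firstRow, int lastRow, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Range> ranges = new ArrayList<>();
        // long arithmetic avoids int overflow when lastRow is close to Integer.MAX_VALUE
        for (long start = firstRow; start <= lastRow; start += batchSize) {
            long end = Math.min(start + batchSize - 1, lastRow); // clamp the last batch
            ranges.add(new Range((int) start, (int) end));
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Prints 1..100000, 100001..200000, 200001..250000, one per line
        for (Range r : split(1, 250_000, 100_000)) {
            System.out.println(r);
        }
    }
}
```

Each range would then be passed to the query as startIndex and endIndex, so only one batch of rows is held in memory at a time.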

Exporting a large amount of data relies mainly on the following techniques:

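1. Query the database in batches. When the result set runs to millions of rows, reading it all at once puts heavy memory pressure on the server and seriously degrades its performance. Exporting and processing the data batch by batch keeps memory use steady instead of taking a large amount of memory at once.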

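2. Compress the output. The exported data is compressed and written straight into the response. Plain text generally compresses very well, so the compressed file is much smaller. There is also no need to write the data to a temporary file first and compress it afterwards; compressing directly into the response is enough.

doExport(paraMap, zipout); // this function writes the rows queried from the database straight to the zipout stream. Data has to be written to a ZipOutputStream as bytes, for example zipout.write("hello".getBytes()); (passing an explicit charset to getBytes avoids depending on the platform default).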
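
As a minimal sketch of compressing directly into an output stream, the example below writes text lines into a ZipOutputStream and reads them back. ZipStreamingSketch, writeZip and readFirstEntry are hypothetical names, a ByteArrayOutputStream stands in for the servlet response stream, and UTF-8 is an assumed encoding rather than the one the original application uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipStreamingSketch {

    /** Compresses the lines straight into target as a single entry; no temporary file. */
    static void writeZip(OutputStream target, String entryName, List<String> lines) {
        try (ZipOutputStream zip = new ZipOutputStream(target)) {
            zip.putNextEntry(new ZipEntry(entryName));
            for (String line : lines) {
                // ZipOutputStream accepts bytes only, so encode each line explicitly.
                zip.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            }
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the first entry back as text, to check the round trip. */
    static String readFirstEntry(byte[] zipped) {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(zipped))) {
            if (in.getNextEntry() == null) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for response.getOutputStream()
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        writeZip(response, "Query.txt", Arrays.asList("id,name", "1,alice"));
        System.out.print(readFirstEntry(response.toByteArray()));
    }
}
```

In a servlet the target would be the response output stream, and the lines would come from each database batch rather than from a list held in memory.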

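3. Flush after every batch.

zipout.flush();
outStream.flush(); // every 100,000 records, push the response buffer to the client to relieve memory pressure on the server

When batched query results are written to the response, call flush so that the buffered data is sent to the client. This has two benefits. First, the data is pushed to the client, which lowers memory pressure on the server and keeps the export from degrading server performance. Second, it lets the server notice that the client has cancelled the request. Suppose a request has to fetch several million rows and the client cancels it partway through. Without flush, the server keeps querying the database for all of those rows, which seriously hurts server performance for a result that nobody will receive. A flush writes to the connection, so if the client is no longer there it will generally throw an exception. Catching that exception lets the server abort the request promptly, stop querying the database, and do no further work for it.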
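
The early-abort behaviour can be illustrated without a servlet container. This is a sketch under stated assumptions: FlushAbortSketch, DisconnectingStream and export are hypothetical names, the stream simulates a client that disconnects after a given number of flushes, and a counter stands in for the database query. A real container decides for itself when a flush on a dead connection fails.

```java
import java.io.IOException;
import java.io.OutputStream;

final class FlushAbortSketch {

    /** Stand-in for a response stream whose client disconnects after some flushes. */
    static final class DisconnectingStream extends OutputStream {
        private final int flushesBeforeDisconnect;
        private int flushes;

        DisconnectingStream(int flushesBeforeDisconnect) {
            this.flushesBeforeDisconnect = flushesBeforeDisconnect;
        }

        @Override
        public void write(int b) {
            // Data is discarded; only the flush behaviour matters here.
        }

        @Override
        public void flush() throws IOException {
            flushes++;
            if (flushes > flushesBeforeDisconnect) {
                throw new IOException("client closed the connection");
            }
        }
    }

    /** Returns how many batches were queried before the export finished or was aborted. */
    static int export(OutputStream out, int totalBatches) {
        int queried = 0;
        try {
            for (int i = 0; i < totalBatches; i++) {
                queried++;       // stands in for one expensive database query
                out.write('x');  // stands in for writing the batch
                out.flush();     // fails once the client is gone
            }
        } catch (IOException e) {
            // The client has gone away: stop here instead of querying the remaining batches.
        }
        return queried;
    }

    public static void main(String[] args) {
        // The client disconnects after two successful flushes, so only 3 of 10 batches run.
        System.out.println(export(new DisconnectingStream(2), 10)); // prints 3
    }
}
```

Because the flush sits inside the same try block as the query loop, the exception ends the loop, and the remaining batches are never requested from the database.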