使用kettle spoon etl 工具将oracle中大表抽取到excel中
因为数据库表中有clob字段,整个表数据量500万左右,表大小有5.3G,
根据业务需求将数据准备到另外新的表中,create table * as select ***。这样kettle查询速度快点,避免一堆关联查询。
然后采用本地kettle 编辑好脚本,测试发现本地抽取到excel效率
只有40条/秒
果断放到服务器上面跑kettle的脚本。喜由天降,每分钟4w条的速度还是可以接受的。160w数据轻松导出!
但是有遇到一些问题,数据导出超过65536条就报下面错误:
[root@hz_gg57 dmb]# sh run_nielsen.sh
run_nielsen.sh: line 1: #!/bin/sh: No such file or directory
2016/07/08 12:54:54 - Kitchen - Start of run.
2016/07/08 12:54:54 - RepositoriesMeta - Reading repositories XML file: /home/data-integration/.kettle/repositories.xml
2016/07/08 12:54:57 - JOB_export_nielsen_info - Start of job execution
2016/07/08 12:54:57 - JOB_export_nielsen_info - Starting entry [export_nielsen_info]
2016/07/08 12:54:58 - Trans_export_nielsen_info - Dispatching started for transformation [Trans_export_nielsen_info]
2016/07/08 12:55:38 - 获取nielsen_info数据.0 - linenr 50000
2016/07/08 12:55:38 - Excel输出.0 - linenr 50000
2016/07/08 12:55:50 - Excel输出.0 - ERROR (version 5.1.0.0, build 1 from 2014-06-19_19-02-57 by buildguy) : Error writing field (0,65536) : jxl.write.biff.RowsExceededException: The maximum number of rows permitted on a worksheet been exceeded
2016/07/08 12:55:50 - Excel输出.0 - ERROR (version 5.1.0.0, build 1 from 2014-06-19_19-02-57 by buildguy) : jxl.write.biff.RowsExceededException: The maximum number of rows permitted on a worksheet been exceeded
2016/07/08 12:55:50 - Excel输出.0 - at jxl.write.biff.WritableSheetImpl.getRowRecord(WritableSheetImpl.java:1214)
2016/07/08 12:55:50 - Excel输出.0 - at jxl.write.biff.WritableSheetImpl.addCell(WritableSheetImpl.java:1151)