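Sqoop is a tool for importing data from relational databases into HDFS. When pulling data out of a relational database, we can filter it first and select only the rows we need, which greatly reduces the volume of data to move. In practice this means adding a WHERE condition to the SQL statement.
The following sqoop script was tested and runs: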
sqoop import --connect jdbc:oracle:thin:@IP:PORT:SCHEMA --username username --password password --table XXX --columns "columns" --where "c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss')" -m 8 --split-by ID --fields-terminated-by '^' --target-dir /importdata/XXX/
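The date bounds in the --where clause are the part most likely to change between runs, so one option is to build the clause in a small shell helper. The sketch below is only an illustration: build_where, WHERE_CLAUSE and SQOOP are made-up names, not part of sqoop, and by default the command is echoed rather than executed.

```shell
#!/bin/sh
# Sketch only: build_where, WHERE_CLAUSE and SQOOP are hypothetical names.

# Print an Oracle date-range predicate on column c1 for two day bounds
# given as yyyy-mm-dd.
build_where() {
  printf "c1>=to_date('%s 00:00:00','%s') and c1<=to_date('%s 00:00:00','%s')" \
    "$1" "yyyy-mm-dd hh24:mi:ss" "$2" "yyyy-mm-dd hh24:mi:ss"
}

WHERE_CLAUSE="$(build_where 2015-01-01 2015-02-01)"

# Dry run by default: the command is echoed, not executed.
# Set SQOOP=sqoop in the environment to run it for real.
SQOOP="${SQOOP:-echo sqoop}"

$SQOOP import \
  --connect "jdbc:oracle:thin:@IP:PORT:SCHEMA" \
  --username username \
  --password password \
  --table XXX \
  --columns "columns" \
  --where "$WHERE_CLAUSE" \
  -m 8 \
  --split-by ID \
  --fields-terminated-by '^' \
  --target-dir /importdata/XXX/
```

Quoting "$WHERE_CLAUSE" keeps the whole predicate as a single argument, which matters because it contains spaces and single quotes.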
The original test script was:
sqoop import --connect jdbc:oracle:thin:@IP:PORT:SCHEMA --username username --password password --query 'select columns from tablename where 1=1 and $CONDITIONS' --where " c1>=to_date('
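For comparison, when a free-form --query is used, the Sqoop user guide requires the query to contain the literal token $CONDITIONS, along with --target-dir, and --split-by when more than one mapper is used. As far as we know, --where is meant for --table imports, so with --query the filter is normally written straight into the query text. A minimal sketch under those assumptions follows. The date range is borrowed from the working script above, since the original command is cut off; QUERY and SQOOP are hypothetical variable names.

```shell
#!/bin/sh
# Sketch only: QUERY and SQOOP are hypothetical variable names.

# Inside double quotes the token is written \$CONDITIONS so that the shell
# passes the literal text $CONDITIONS through to sqoop.
QUERY="select columns from tablename where c1>=to_date('2015-01-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and c1<=to_date('2015-02-01 00:00:00','yyyy-mm-dd hh24:mi:ss') and \$CONDITIONS"

# Dry run by default: the command is echoed, not executed.
# Set SQOOP=sqoop in the environment to run it for real.
SQOOP="${SQOOP:-echo sqoop}"

$SQOOP import \
  --connect "jdbc:oracle:thin:@IP:PORT:SCHEMA" \
  --username username \
  --password password \
  --query "$QUERY" \
  -m 8 \
  --split-by ID \
  --fields-terminated-by '^' \
  --target-dir /importdata/XXX/
```

The --table and --columns options are dropped here because the query itself names the table and the columns.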