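Big Data T+1 Execution Flow
Prerequisite: a working big data environment (Hadoop, Hive, HBase, Sqoop, ZooKeeper, Oozie, Hue)
1. Consolidate the data from all databases into Hive (there are three data sources here: Oracle, MySQL, and SQL Server)
Full extraction examples:
ORACLE (note: the table name must be uppercase!):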
sqoop import --connect jdbc:oracle:thin:@//10.11.22.33:1521/LPDR.china.com.hh --username root --password 1234 \
--table DATABASENAME.TABLENAME --hive-overwrite --hive-import --hive-database bgda_hw --hive-table lp_tablename \
--target-dir /user/hadouser_hw/tmp/lp_tablename --delete-target-dir \
--null-non-string '\\N' --null-string '\\N' \
--hive-drop-import-delims --verbose -m 1
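
The uppercase requirement for the Oracle --table value is easy to miss when table names come from a list or a config file. Below is a minimal sketch of a helper that normalizes the value before it is passed to sqoop. The function name oracle_table_arg is hypothetical and not part of Sqoop; it only relies on the fact that Oracle stores unquoted identifiers in upper case.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Sqoop): build the value for --table
# in upper case from a schema name and a table name.
oracle_table_arg() {
  local schema="$1" table="$2"
  # tr upper-cases letters and leaves the dot separator untouched.
  printf '%s.%s\n' "$schema" "$table" | tr '[:lower:]' '[:upper:]'
}

# Example: prints DATABASENAME.TABLENAME
oracle_table_arg databasename tablename
```

The result can then be used as --table "$(oracle_table_arg databasename tablename)" in the Oracle command above.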
MYSQL:
sqoop import --connect jdbc:mysql://10.33.44.55:3306/DATABASEBANE --username ROOT --password 1234 \
--query 'select * from DEMO t where t.DATE1 < current_date and $CONDITIONS' \
--hive-overwrite --hive-import --hive-database bgda_hw --hive-table DEMO \
--target-dir /user/hadouser_hw/tmp/DEMO --delete-target-dir \
--null-non-string '\\N' --null-string '\\N' \
--hive-drop-import-delims --verbose -m 1
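
The Oracle and MySQL commands above end with the same Hive-side options and differ only in the Hive table name. As a rough sketch, those shared options could be generated by one small function so that each import only states its connection and its source. The names HIVE_DB, TMP_ROOT and hive_import_args are assumptions made for illustration; the option values are the ones used in this document, and -m is the short form of --num-mappers.

```shell
#!/usr/bin/env bash
# Values taken from the commands in this document.
HIVE_DB="bgda_hw"
TMP_ROOT="/user/hadouser_hw/tmp"

# Hypothetical helper: print the shared Hive import options, one per line,
# for the Hive table name given as the first argument.
hive_import_args() {
  local hive_table="$1"
  printf '%s\n' \
    --hive-import --hive-overwrite \
    --hive-database "$HIVE_DB" --hive-table "$hive_table" \
    --target-dir "$TMP_ROOT/$hive_table" --delete-target-dir \
    --null-string '\\N' --null-non-string '\\N' \
    --hive-drop-import-delims --verbose -m 1
}

# Dry run: list the options for the DEMO table without calling sqoop.
hive_import_args DEMO
```

In bash 4 or later the lines can be read into an array with mapfile -t args < <(hive_import_args DEMO) and appended to a sqoop import call as "${args[@]}". Printing the options first makes it possible to check the staging path before anything is overwritten in Hive.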
SQLSERVER:
sqoop import --connect 'jdbc:sqlserver://10.55.66.15:1433;username=ROOT;password=ROOT;database=db_DD' \
--query 'select * from TABLE t where t.tasktime < convert(varchar(10),getdate(),120) and $CONDITIONS' \
--hive-overwrite --hive-import --hive-database bgda_hw --hive-table TABLENAME \
--target-dir /user/hadouser_hw/tmp/TABLENAME --delete-target-dir \
--null-non-string '\\N' --null-string '\\N' \
--hive-drop-import