Sqoop import templates for various databases (reusable)

Latest recommended article published on 2020-12-22 14:20:33

一切都是浮云啦

Category: sqoop. Tags: hive, sqoop

Article link: https://blog.csdn.net/weixin_46153092/article/details/110521946

1. Test the database connection

 sqoop list-databases \
 --connect jdbc:oracle:thin:@ip:port:database \
 --username aml \
 --password aml
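
The `ip:port:database` placeholder above follows the SID form of the Oracle thin-driver URL. As a minimal sketch, the two helpers below assemble that form and the service-name form. The function names, host, port and SID are illustrative assumptions, not values from the original post. The commented call uses Sqoop's `-P` option, which prompts for the password instead of putting it on the command line.

```shell
#!/bin/sh
# Sketch only: helper names, host, port and SID below are placeholders.

# Build an Oracle thin-driver URL in SID form: jdbc:oracle:thin:@host:port:SID
oracle_url_sid() {
  printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$1" "$2" "$3"
}

# Build the service-name form: jdbc:oracle:thin:@//host:port/service
oracle_url_service() {
  printf 'jdbc:oracle:thin:@//%s:%s/%s\n' "$1" "$2" "$3"
}

# Example call (commented out so the snippet can be sourced without Sqoop):
# sqoop list-databases \
#   --connect "$(oracle_url_sid db.example.com 1521 ORCL)" \
#   --username aml \
#   -P
```

Which of the two forms applies depends on how the Oracle instance is registered with its listener, so both are shown.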

2. Sample code

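#!/bin/bash

# Import every table listed in table_list.txt (one table name per line)
# from Oracle into the Hive database "aml".
while IFS= read -r line
do
  # Remove leftovers from a previous run; -f suppresses the error if the path is absent
  hdfs dfs -rm -r -f "/user/hive/warehouse/aml.db/$line"
  echo "********** Starting import of table $line ************"
  # --query without --split-by requires a single mapper, hence -m 1
  sqoop import \
  --connect jdbc:oracle:thin:@ip:port:database \
  --username username --password password \
  --query "select * from $line where 1=1 and \$CONDITIONS" \
  --hive-import --delete-target-dir --hive-overwrite \
  --hive-database aml --hive-table "$line" \
  --target-dir "/user/hive/warehouse/aml.db/$line" \
  --null-string '\\N' --null-non-string '\\N' \
  -m 1
done < table_list.txt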
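
The loop above trusts every line of `table_list.txt`. A hedged sketch of how the list could be cleaned and validated before anything is spliced into SQL or an HDFS path is shown below. All function names are hypothetical, and the table-name rule (letters, digits, underscores, optional schema prefix) is an assumption that may be too strict for some schemas.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, and the table-name rule
# (letters, digits, underscores, optional schema prefix) is an assumption.

# Print the usable entries of a table list: drops carriage returns,
# "#" comments, surrounding whitespace and blank lines.
clean_table_list() {
  tr -d '\r' < "$1" |
    sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Succeed only for names that are safe to splice into SQL and HDFS paths.
is_valid_table_name() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z][A-Za-z0-9_]*(\.[A-Za-z][A-Za-z0-9_]*)?$'
}

# Run "$2 <table>" for every valid entry in file $1.
# The exit status is the number of skipped or failed tables (capped at 255).
for_each_table() (
  failures=0
  while IFS= read -r table; do
    [ -n "$table" ] || continue
    if ! is_valid_table_name "$table"; then
      echo "skipping invalid table name: $table" >&2
      failures=$((failures + 1))
    elif ! "$2" "$table" < /dev/null; then
      # stdin is redirected so the command cannot consume the table list
      echo "import failed for table: $table" >&2
      failures=$((failures + 1))
    fi
  done <<EOF
$(clean_table_list "$1")
EOF
  [ "$failures" -le 255 ] || failures=255
  exit "$failures"
)
```

A possible use is to wrap the `hdfs dfs -rm` and `sqoop import` calls in a function of your own (say `import_table`) and run `for_each_table table_list.txt import_table`, so that one bad table does not go unnoticed in a long run.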




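# Oracle ingestion configuration
# The arguments are held in a bash array so that each one can carry a comment;
# a comment placed after a trailing backslash would break the line continuation.
sqoop_args=(
  import                                                # import mode: relational database -> HDFS (Hive)
  --driver oracle.jdbc.driver.OracleDriver              # JDBC driver class
  --connect jdbc:oracle:thin:@ip:port:database          # source database URL
  --username datacenter                                 # account
  --password sjzx_ljp0411                               # password
  --query "select * from $source_tab_name where \$CONDITIONS and $where_str "   # data set to ingest
  --target-dir "/tmp/sqoop/${source_ID}/t_${tab_name}_tmp"   # temporary HDFS directory; created automatically
  --delete-target-dir                                   # delete the target directory, if it exists, before importing
  --hive-import                                         # load into Hive
  --hive-overwrite                                      # overwrite existing data in the Hive table
  --null-string '\\N'                                   # write NULLs in string columns as \N
  --null-non-string '\\N'                               # write NULLs in non-string columns as \N
  --fields-terminated-by "\t"                           # field delimiter of the Hive table
  --hive-drop-import-delims                             # strip \n, \r and \01 from string fields
  --hive-table "ods_${source_ID}.t_${tab_name}_tmp"     # Hive table name
  --hive-partition-key bus_date                         # partition column when loading into a partitioned table
  --hive-partition-value "${bus_date}"                  # partition value
  -m 1                                                  # number of map tasks = degree of parallelism
)
sqoop "${sqoop_args[@]}"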
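
The `--query` string in this configuration becomes invalid SQL when `$where_str` is empty, because it would then end in a dangling `and`. Sqoop also needs to receive the literal token `$CONDITIONS`, which it replaces with its own predicate at run time. A minimal sketch of a helper that handles both points follows; `build_query` is a hypothetical name and not part of Sqoop.

```shell
#!/bin/sh
# Sketch only: build_query is a hypothetical helper, not part of Sqoop.

# Print a free-form query for --query. The literal token $CONDITIONS must
# reach Sqoop unexpanded, so it is written inside single quotes here.
# $1 = source table, $2 = optional extra filter (may be empty).
build_query() {
  if [ -n "${2:-}" ]; then
    printf 'select * from %s where $CONDITIONS and (%s)\n' "$1" "$2"
  else
    printf 'select * from %s where $CONDITIONS\n' "$1"
  fi
}

# Example call (commented out so the snippet can be sourced without Sqoop):
# sqoop import ... --query "$(build_query "$source_tab_name" "$where_str")" ...
```

Wrapping the extra filter in parentheses keeps an `or` inside it from changing how it combines with `$CONDITIONS`.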
