Sqoop: a tool for exchanging and synchronizing data between the Hadoop platform (HDFS/Hive) and relational databases
Official description: Apache Sqoop(TM) is a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.
Sqoop version: 1.4.6
import:
${SQOOP_HOME}/bin/sqoop import \
  --connect "jdbc:mysql://%s:3306/%s?characterEncoding=UTF-8&tinyInt1isBit=false" \
  --username %s \
  --password '%s' \
  --hive-database %s \
  --hive-table %s \
  --table %s \
  --lines-terminated-by '\n' \
  --fields-terminated-by '\001' \
  --hive-drop-import-delims \
  --hive-import \
  --null-string '\\N' \
  --null-non-string '\\N' \
  --delete-target-dir \
  --hive-overwrite \
  -m 1
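Notes:
- '\001' is Hive's default field delimiter
- The Hive table is created automatically if it does not already exist
- This command runs in overwrite mode (--hive-overwrite replaces the existing data in the Hive table)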
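To make the template above concrete, here is a minimal sketch that wraps it in a shell function and fills the %s placeholders from arguments. The function names (run, sqoop_import), the DRY_RUN switch, and all values in the usage line are hypothetical and not part of Sqoop; the Sqoop flags themselves are the ones from the template.

```shell
#!/bin/sh
# Sketch only: wraps the import template above in a function.
# run, sqoop_import and DRY_RUN are illustrative names, not Sqoop features.

# With DRY_RUN set, print one argument per line instead of executing.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

# Arguments: MySQL host, MySQL database, user, password, Hive database, table.
# The same name is used for the MySQL table and the Hive table.
sqoop_import() (
  host=$1 db=$2 user=$3 password=$4 hive_db=$5 table=$6
  run "${SQOOP_HOME}/bin/sqoop" import \
    --connect "jdbc:mysql://${host}:3306/${db}?characterEncoding=UTF-8&tinyInt1isBit=false" \
    --username "$user" \
    --password "$password" \
    --hive-database "$hive_db" \
    --hive-table "$table" \
    --table "$table" \
    --lines-terminated-by '\n' \
    --fields-terminated-by '\001' \
    --hive-drop-import-delims \
    --hive-import \
    --null-string '\\N' \
    --null-non-string '\\N' \
    --delete-target-dir \
    --hive-overwrite \
    -m 1
)
```

A dry run such as `DRY_RUN=1; sqoop_import db.example.com shop etl 'secret' ods orders` (all values made up) lists the arguments that would be passed, which is a cheap way to check quoting before touching the cluster.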
export:
# The %s passed to --export-dir is the HDFS directory of the Hive table
${SQOOP_HOME}/bin/sqoop export \
  --connect 'jdbc:mysql://%s:3306/%s?useUnicode=true&characterEncoding=utf-8' \
  --username %s \
  --password '%s' \
  --table %s \
  --input-fields-terminated-by '\t' \
  --export-dir %s \
  --input-null-string '\\N' \
  --input-null-non-string '\\N'
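Notes:
- The MySQL table must be created before the export
- This command runs in append mode (rows are inserted into the existing MySQL table)
- The value of --input-fields-terminated-by must match the delimiter actually used by the Hive table's files; a table created by the import command above is delimited by '\001', not '\t'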
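As a companion to the export notes, the sketch below exports a table that was written with the '\001' delimiter and optionally switches from append to update mode. The function names, the DRY_RUN switch, and the example path are hypothetical. --update-key and --update-mode are Sqoop export arguments; whether allowinsert behaves as an upsert depends on the target database and connector, so treat that part as an assumption to verify against your setup.

```shell
#!/bin/sh
# Sketch only: export a Hive table directory to an existing MySQL table.
# run, sqoop_export and DRY_RUN are illustrative names, not Sqoop features.

# With DRY_RUN set, print one argument per line instead of executing.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

# Arguments: host, database, user, password, MySQL table, HDFS export dir,
# and an optional key column. Without the key column, rows are appended.
sqoop_export() (
  host=$1 db=$2 user=$3 password=$4 table=$5 export_dir=$6 update_key=${7:-}
  set -- "${SQOOP_HOME}/bin/sqoop" export \
    --connect "jdbc:mysql://${host}:3306/${db}?useUnicode=true&characterEncoding=utf-8" \
    --username "$user" \
    --password "$password" \
    --table "$table" \
    --input-fields-terminated-by '\001' \
    --export-dir "$export_dir" \
    --input-null-string '\\N' \
    --input-null-non-string '\\N'
  if [ -n "$update_key" ]; then
    # Assumption: the target connector supports allowinsert (update or insert).
    set -- "$@" --update-key "$update_key" --update-mode allowinsert
  fi
  run "$@"
)
```

For example, `DRY_RUN=1; sqoop_export db.example.com shop etl 'secret' orders /user/hive/warehouse/ods.db/orders id` prints the argument list for an update-mode export; the warehouse path assumes Hive's default directory layout and may differ on your cluster.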