Common Sqoop Commands and Parameter Reference

白开水v5

Reposted from: https://www.aliyun.com/jiaocheng/1106363.html

List the databases on the MySQL server

sqoop list-databases --connect jdbc:mysql://localhost:3306/test --username test --password test

Connect to MySQL and list the tables in the database

sqoop list-tables --connect jdbc:mysql://localhost:3306/test --username test --password test

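Copy a relational table's structure into Hive

sqoop create-hive-table --connect jdbc:mysql://localhost:3306/test --table users --username test \
--password test --hive-table users --fields-terminated-by "\0001" --lines-terminated-by "\n"

Parameter notes:

--fields-terminated-by "\0001" sets the delimiter between columns. "\0001" is the character with ASCII code 1 (Ctrl-A), which is also Hive's default field delimiter; Sqoop's default field delimiter is ",".

--lines-terminated-by "\n" sets the delimiter between rows, here a newline, which is also the default.

Note: only the table structure is copied; the rows themselves are not.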
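
To make the Ctrl-A delimiter concrete, here is a minimal shell sketch, independent of Sqoop, that builds one row the way a Hive text file would store it and pulls out the second field. The row contents (1, alice, 30) are made-up sample values.

```shell
# One row as stored in a Hive text file: fields separated by Ctrl-A (octal 001).
# The values 1, alice and 30 are made-up sample data.
row=$(printf '1\001alice\00130')

# cut needs the literal delimiter byte, so generate it with printf.
delim=$(printf '\001')
name=$(printf '%s\n' "$row" | cut -d "$delim" -f 2)

printf '%s\n' "$name"
```

Because the delimiter is a non-printing byte, it rarely collides with real column values, which is presumably why Hive uses it by default.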

Import data from a relational database into a Hive table

sqoop import --connect jdbc:mysql://localhost:3306/test --username test --password test \
--table users --hive-import --hive-table users -m 2 --fields-terminated-by "\0001"

Parameter notes:

-m 2 runs the import with two map tasks.

--fields-terminated-by "\0001" must match the delimiter used when the Hive table was created.

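Export Hive table data to a MySQL table

sqoop export --connect jdbc:mysql://192.168.66.66:3306/test --username test --password test \
--table users --export-dir /user/hive/warehouse/users/part-m-00000 --input-fields-terminated-by '\0001'

Notes:

a. The users table must already exist in MySQL before the export runs.

b. Replacing the IP address in jdbc:mysql://192.168.66.66:3306/test with localhost raises an exception, most likely because the map tasks run on cluster nodes, where localhost does not refer to the MySQL host.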

Import data from a relational database into the Hive warehouse directory, using --query

sqoop import --append --connect jdbc:mysql://192.168.66.66/test --username test --password test --query "select id,age,name from userinfo where \$CONDITIONS" -m 1 --target-dir /user/hive/warehouse/userinfos2 --fields-terminated-by ","
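
A --query statement must carry the literal token $CONDITIONS, which Sqoop replaces with each map task's split condition. Inside double quotes the shell expands $CONDITIONS as a variable unless the dollar sign is escaped with a backslash; single quotes need no escape. The sketch below only inspects the resulting strings and does not call Sqoop.

```shell
# Give the shell variable an empty value so the demonstration is deterministic.
CONDITIONS=""

# Double quotes: the dollar sign must be escaped to survive.
q_escaped="select id,age,name from userinfo where \$CONDITIONS"
# Single quotes: nothing is expanded, so no escape is needed.
q_single='select id,age,name from userinfo where $CONDITIONS'
# Double quotes without the escape: the shell substitutes the empty value.
q_broken="select id,age,name from userinfo where $CONDITIONS"

printf '%s\n' "$q_escaped" "$q_single" "$q_broken"
```

The first two strings are identical and still contain the token; the third has lost it, so Sqoop would have nothing to substitute.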

Import data from a relational database into the Hive warehouse directory, using --columns and --where

sqoop import --append --connect jdbc:mysql://192.168.66.66:3306/test --username test --password test --table userinfo --columns "id,age,name" --where "id > 3 and (age = 88 or age = 80)" -m 1 --target-dir /user/hive/warehouse/userinfos2 --fields-terminated-by ","

Note: --target-dir /user/hive/warehouse/userinfos can be replaced with --hive-import --hive-table userinfos.

Import data from MySQL into HDFS

sqoop import --connect jdbc:mysql://192.168.66.66:3306/test --username root --password 123456 --table test_user --target-dir /usr/test -m 2 --fields-terminated-by "\t" --columns "id,name" --where 'id>2 and id<=6'

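Main import parameters

--connect <jdbc-uri>                  JDBC connection string
--connection-manager <class-name>     Connection manager class to use
--driver <class-name>                 JDBC driver class
--hadoop-mapred-home <dir>            Overrides $HADOOP_MAPRED_HOME
--help                                Prints usage help
-P                                    Reads the password from the console
--password <password>                 Password
--username <username>                 User name
--verbose                             Prints more information while running
--connection-param-file <filename>    Optional properties file with connection parameters

--append                              Appends to an existing dataset in HDFS
--as-avrodatafile                     Imports data as Avro data files
--as-sequencefile                     Imports data as SequenceFiles
--as-textfile                         Imports data as plain text (default)
--boundary-query <statement>          Boundary query used to create splits
--columns <col,col,col…>              Columns to import
--direct                              Uses the direct fast path
--direct-split-size <n>               Splits the input every n bytes in direct mode
--fetch-size <n>                      Number of rows read from the database at a time
--inline-lob-limit <n>                Maximum size of an inline LOB
-m,--num-mappers <n>                  Number of map tasks to run in parallel; the default is 4, and some databases perform well with 8 or 16
-e,--query <statement>                Imports the results of a query
--split-by <column-name>              Column used to create splits; defaults to the primary key
--table <table-name>                  Table to import
--target-dir <dir>                    HDFS destination directory
--warehouse-dir <dir>                 HDFS parent directory for the table destination
--where <where clause>                WHERE clause applied during import
-z,--compress                         Enables compression
--compression-codec <c>               Compression codec; the default is gzip
--null-string <null-string>           String written for a null value in string columns
--null-non-string <null-string>       String written for a null value in non-string columns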
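
To illustrate how -m and --split-by interact, the sketch below divides a numeric key range evenly among map tasks. It is a simplified illustration rather than Sqoop's actual splitter; the key range 0 to 1000 is an assumed example, and each resulting pair would become a condition of the form id >= lo AND id < hi.

```shell
# Assumed example values: smallest and largest key, and the -m setting.
min=0
max=1000
mappers=4

step=$(( (max - min) / mappers ))
ranges=""
i=0
while [ "$i" -lt "$mappers" ]; do
  lo=$(( min + i * step ))
  if [ "$i" -eq $(( mappers - 1 )) ]; then
    hi=$(( max + 1 ))      # the last split must include the maximum key
  else
    hi=$(( lo + step ))
  fi
  ranges="$ranges($lo,$hi) "
  i=$(( i + 1 ))
done

printf '%s\n' "$ranges"
```

If the key values are unevenly distributed, evenly sized ranges like these give the map tasks unequal amounts of work, which is one reason to choose the --split-by column deliberately.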

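Main export parameters

--direct                                  Uses the direct fast path
--export-dir <dir>                        HDFS source directory for the export
-m,--num-mappers <n>                      Number of map tasks to run in parallel
--table <table-name>                      Table to export into
--call <stored-proc-name>                 Stored procedure to call
--update-key <col-name>                   Column used to match rows for updates
--update-mode <mode>                      Update mode; the default is update only, and it can be set to allowinsert
--input-null-string <null-string>         String interpreted as null for string columns
--input-null-non-string <null-string>     String interpreted as null for non-string columns
--staging-table <staging-table-name>      Staging table
--clear-staging-table                     Clears the staging table
--batch                                   Batch mode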
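
As a sketch of how --update-key and --update-mode combine, the following assembles an upsert-style export and prints it rather than running it. The key column id is an assumption about the users table, the connection values are reused from the earlier examples, and whether allowinsert is honored depends on the database connector.

```shell
# Assemble the export arguments as positional parameters.
# The update key "id" is an assumed column name, not taken from the source.
set -- export \
  --connect jdbc:mysql://192.168.66.66:3306/test \
  --username test -P \
  --table users \
  --export-dir /user/hive/warehouse/users \
  --input-fields-terminated-by '\0001' \
  --update-key id \
  --update-mode allowinsert

# Dry run: print the command line instead of executing it.
export_cmd="sqoop $*"
printf '%s\n' "$export_cmd"
```

Once the printed line looks right, the same arguments can be passed to the real command with sqoop "$@".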

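Delimiter and escape character parameters

Argument                            Description
--enclosed-by <char>                Sets a required field-enclosing character
--escaped-by <char>                 Sets the escape character
--fields-terminated-by <char>       Sets the field delimiter
--lines-terminated-by <char>        Sets the row delimiter
--mysql-delimiters                  Uses MySQL's default delimiters: fields: , lines: \n escaped-by: \ optionally-enclosed-by: '
--optionally-enclosed-by <char>     Sets an optional field-enclosing character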

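Hive import parameters

--hive-home <dir>               Overrides $HIVE_HOME
--hive-import                   Imports data into Hive, using Hive's default delimiters
--hive-overwrite                Overwrites existing data in the Hive table
--create-hive-table             Creates the table; the job fails if the table already exists
--hive-table <table-name>       Table name to use in Hive
--hive-drop-import-delims       Drops \n, \r, and \01 from string fields when importing into Hive
--hive-delims-replacement       Replaces \n, \r, and \01 in string fields with a user-defined string when importing into Hive
--hive-partition-key            Hive partition key
--hive-partition-value <v>      Hive partition value
--map-column-hive <map>         Type mapping from SQL types to Hive types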
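
The reason for --hive-drop-import-delims is that a newline or Ctrl-A inside a string column would be read by Hive as a row or field boundary. The sketch below emulates the effect with tr on a made-up value; it is not Sqoop's own code.

```shell
# A made-up string column value that contains an embedded newline.
field=$(printf 'line one\nline two')

# Written as-is after an id column, the row spans two physical lines.
raw_lines=$(( $(printf '7\001%s\n' "$field" | wc -l) ))

# Strip \n, \r and Ctrl-A (octal 001), emulating --hive-drop-import-delims.
clean=$(printf '%s' "$field" | tr -d '\n\r\001')
clean_lines=$(( $(printf '7\001%s\n' "$clean" | wc -l) ))

printf 'raw=%s clean=%s value=%s\n' "$raw_lines" "$clean_lines" "$clean"
```

Dropping the characters joins the two halves of the value with nothing in between; --hive-delims-replacement is the alternative when a visible separator should be kept.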
