Sqoop Exercises:

1. Install Sqoop Clients on the master and slave nodes. After the installation finishes, check the Sqoop version information on the master node.
[root@master ~]#sqoop-version
Warning: /usr/hdp/2.6.1.0-129/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/09 12:08:48 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.1.0-129
Sqoop 1.4.6.2.6.1.0-129
git commit id 99af1205a99646445a9c7254ad2770706e1cc6a4
Compiled by jenkins on Wed May 31 03:22:43 UTC 2017

2. Use the Sqoop tool to list all the databases in MySQL on the master node.
[root@master ~]#sqoop list-databases --connect jdbc:mysql://localhost --username root --password bigdata
Warning: /usr/hdp/2.6.1.0-129/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/09 12:13:51 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.1.0-129
19/05/09 12:13:51 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/05/09 12:13:51 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
ambari
hive
mysql
performance_schema
test
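The WARN line in the transcript points out that putting the password on the command line is insecure. A sketch of the alternative it suggests, prompting interactively with -P (requires a running cluster, so shown here only as a command fragment):

```shell
# Same listing as above, but Sqoop prompts for the MySQL password
# instead of exposing it in the shell history and process list.
sqoop list-databases --connect jdbc:mysql://localhost --username root -P
```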
3. Use the Sqoop tool to list all the tables in the ambari database in MySQL on the master node.
[root@master ~]#sqoop list-tables --connect jdbc:mysql://localhost/ambari --username root --password bigdata
Warning: /usr/hdp/2.6.1.0-129/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
19/05/09 12:18:26 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.1.0-129
19/05/09 12:18:26 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/05/09 12:18:26 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
ClusterHostMapping
QRTZ_BLOB_TRIGGERS
QRTZ_CALENDARS
QRTZ_CRON_TRIGGERS
QRTZ_FIRED_TRIGGERS
QRTZ_JOB_DETAILS
QRTZ_LOCKS
QRTZ_PAUSED_TRIGGER_GRPS
QRTZ_SCHEDULER_STATE
QRTZ_SIMPLE_TRIGGERS
QRTZ_SIMPROP_TRIGGERS
QRTZ_TRIGGERS
adminpermission
adminprincipal
adminprincipaltype
adminprivilege
adminresource
adminresourcetype
alert_current
alert_definition
alert_group
alert_group_target
alert_grouping
alert_history
alert_notice
alert_target
alert_target_states
ambari_operation_history
ambari_sequences
artifact
blueprint
blueprint_configuration
blueprint_setting
clusterconfig
clusters
clusterservices
clusterstate
confgroupclusterconfigmapping
configgroup
configgrouphostmapping
execution_command
extension
extensionlink
groups
host_role_command
host_version
hostcomponentdesiredstate
hostcomponentstate
hostconfigmapping
hostgroup
hostgroup_component
hostgroup_configuration
hosts
hoststate
kerberos_descriptor
kerberos_principal
kerberos_principal_host
key_value_store
members
metainfo
permission_roleauthorization
remoteambaricluster
remoteambariclusterservice
repo_version
request
requestoperationlevel
requestresourcefilter
requestschedule
requestschedulebatchrequest
role_success_criteria
roleauthorization
servicecomponent_version
servicecomponentdesiredstate
serviceconfig
serviceconfighosts
serviceconfigmapping
servicedesiredstate
setting
stack
stage
topology_host_info
topology_host_request
topology_host_task
topology_hostgroup
topology_logical_request
topology_logical_task
topology_request
upgrade
upgrade_group
upgrade_history
upgrade_item
users
viewentity
viewinstance
viewinstancedata
viewinstanceproperty
viewmain
viewparameter
viewresource
viewurl
widget
widget_layout
widget_layout_user_widget
4. In MySQL, create a database named xiandian, and inside it create the table xd_phy_course with the structure shown in Table 1. Use Hive to create a table xd_phy_course with the structure shown in Table 2, and load phy_course_xd.txt into it. Then use the Sqoop tool to export the xd_phy_course table from the Hive data warehouse to the xd_phy_course table in the xiandian database in MySQL on the master node.
Table 1 (MySQL):
stname VARCHAR(20) | stID INT(1) | class VARCHAR(20) | opt_cour VARCHAR(20)
Table 2 (Hive):
stname (string) | stID (int) | class (string) | opt_cour (string)
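The Hive table below is declared with tab-separated fields and newline-terminated rows, so phy_course_xd.txt must follow that layout. A minimal sketch for checking it, using a hypothetical one-row sample file under /tmp (the real file lives at /opt/phy_course_xd.txt):

```shell
# Write one sample record in the expected layout: four tab-separated
# fields (stname, stID, class, opt_cour), one record per line.
printf 'student409\t10120408\tSoftware_1403\tvolleyball\n' > /tmp/phy_sample.txt

# Verify every line has exactly 4 tab-separated fields; prints OK if so.
awk -F'\t' 'NF != 4 { bad = 1 } END { print (bad ? "BAD" : "OK") }' /tmp/phy_sample.txt
# prints OK
```

Running the same awk check against the real /opt/phy_course_xd.txt before the Hive load catches malformed rows early, since LOAD DATA does not validate them.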
[root@master ~]#mysql -uroot -pbigdata
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 317
Server version: 5.5.44-MariaDB MariaDB Server
Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database xiandian;
Query OK, 1 row affected (0.00 sec)
MariaDB [(none)]> use xiandian;
Database changed
MariaDB [xiandian]> create table xd_phy_course(stname varchar(20),stID int(1),class varchar(20),opt_cour varchar(20));
Query OK, 0 rows affected (0.13 sec)
[root@master ~]# hive
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.
Logging initialized using configuration in file:/etc/hive/2.6.1.0-129/0/hive-log4j.properties
hive> create table xd_phy_course(stname string,stID int,class string,opt_cour string) row format delimited fields terminated by '\t' lines terminated by '\n';
OK
Time taken: 0.389 seconds
hive> load data local inpath '/opt/phy_course_xd.txt' into table xd_phy_course;
Loading data to table default.xd_phy_course
Table default.xd_phy_course stats: [numFiles=1, numRows=0, totalSize=450, rawDataSize=0]
OK
Time taken: 1.339 seconds
[root@master ~]# sqoop export --connect jdbc:mysql://localhost:3306/xiandian --username root --password bigdata --table xd_phy_course --hcatalog-database default --hcatalog-table xd_phy_course
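To confirm the export landed, the row count on the MySQL side can be checked. A sketch, assuming the same root/bigdata credentials used above (requires the running MySQL service, so shown only as a command fragment):

```shell
# Count the rows Sqoop exported into the MySQL target table;
# the result should match the Hive table's row count.
mysql -uroot -pbigdata -e 'SELECT COUNT(*) FROM xiandian.xd_phy_course;'
```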
5. Create the xd_phy_course table in Hive with the structure shown in the table below. Use the Sqoop tool to import the xd_phy_course table from the xiandian database in MySQL into the xd_phy_course table in the Hive data warehouse.
stname (string) | stID (int) | class (string) | opt_cour (string)
[root@master ~]# hive
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.
Logging initialized using configuration in file:/etc/hive/2.6.1.0-129/0/hive-log4j.properties
hive> create table xd_phy_course(stname string,stID int,class string,opt_cour string) row format delimited fields terminated by '\t' lines terminated by '\n';
OK
Time taken: 0.337 seconds
[root@master ~]# sqoop import --connect jdbc:mysql://localhost:3306/xiandian --username root --password bigdata --table xd_phy_course --hive-import --hive-overwrite --hive-table xd_phy_course -m 1 --fields-terminated-by '\t' --lines-terminated-by '\n'
[root@master ~]# hive
log4j:WARN No such property [maxFileSize] in org.apache.log4j.DailyRollingFileAppender.
Logging initialized using configuration in file:/etc/hive/2.6.1.0-129/0/hive-log4j.properties
hive> select * from xd_phy_course;
OK
student409 10120408 Software_1403 volleyball
student411 10120410 Software_1403 volleyball
student413 10120412 Software_1403 volleyball
student419 10120418 Software_1403 volleyball
student421 10120420 Software_1403 volleyball
student422 10120421 Software_1403 volleyball
student424 10120423 Software_1403 volleyball
student432 10120431 Software_1403 volleyball
student438 10120437 Software_1403 volleyball
student447 10120446 Software_1403 volleyball
Time taken: 2.206 seconds, Fetched: 10 row(s)