Big Data Platform Operations: Sqoop

Big data series: operations and maintenance (on a self-built big data platform)

(6) Sqoop Operations
  1. Check the Sqoop version on the master node.
[root@master ~]# sqoop version
20/03/31 16:15:45 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017

  2. Use Sqoop to list all databases in MySQL on the master node. (In my setup, MySQL is actually installed on the slave2 node.)

A note here: on the XianDian big data distribution, the statements below sometimes fail. The likely cause is that the root user lacks the needed database privileges, and they can fail even under su - sqoop. In that case, go into the database and grant the relevant privileges to the relevant user; root is used as the example here.

MariaDB [(none)]> grant all privileges on *.* to 'root'@'localhost' identified by '000000';
MariaDB [(none)]> grant all privileges on *.* to 'root'@'%' identified by '000000';
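To confirm the grants are in place, you can list them for the account (an optional sanity check):

MariaDB [(none)]> show grants for 'root'@'%';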

[root@master ~]# sqoop list-databases --connect jdbc:mysql://slave2:3306 --username root --password 000000
20/03/31 16:18:37 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
hive
mysql
performance_schema
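Note the WARN line in the next command's output: putting --password on the command line is insecure. Sqoop's -P flag reads the password from the console instead; an equivalent form of the command above:

[root@master ~]# sqoop list-databases --connect jdbc:mysql://slave2:3306 --username root -P
Enter password: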

  3. Use Sqoop to list all tables in the mysql database of MySQL.
[root@master ~]# sqoop list-tables --connect jdbc:mysql://slave2:3306/mysql --username root --password 000000
Warning: /usr/sqoop/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/sqoop/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
20/03/31 16:27:55 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
20/03/31 16:27:55 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
20/03/31 16:27:56 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
columns_priv
db
event
func
general_log
help_category
help_keyword
help_relation
help_topic
host
ndb_binlog_index
plugin
proc
procs_priv
proxies_priv
servers
slow_log
tables_priv
time_zone
time_zone_leap_second
time_zone_name
time_zone_transition
time_zone_transition_type
user

  4. In MySQL, create a database named xiandian, and inside it create the xd_phy_course table with the structure shown in Table 1. Using Hive, create a table xd_phy_course with the structure shown in Table 2 and load phy_course_xd.txt into it. Then use Sqoop to export the xd_phy_course data from the Hive warehouse into the xd_phy_course table of the xiandian database in MySQL on the master node.

(Tables 1 and 2 give the MySQL and Hive schemas for xd_phy_course; both match the CREATE statements below.)

MariaDB [xiandian]> create table xd_phy_course(
    -> stname VARCHAR(20),
    -> stID INT(1),
    -> class VARCHAR(20),
    -> opt_cour VARCHAR(20)
    -> );
Query OK, 0 rows affected (0.00 sec)
hive> create table xd_phy_course (stname string,stID int,class string,opt_cour string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
OK
Time taken: 3.45 seconds
hive> load data local inpath '/root/tiku/Hive/phy_course_xd.txt' into table xd_phy_course;
Loading data to table default.xd_phy_course
OK
Time taken: 1.442 seconds

[root@master ~]# sqoop export --connect jdbc:mysql://slave2:3306/xiandian --username root --password 000000 --table xd_phy_course --hcatalog-database default --hcatalog-table xd_phy_course
20/03/31 17:19:19 INFO mapred.FileInputFormat: Total input paths to process : 1
20/03/31 17:19:19 INFO mapreduce.JobSubmitter: number of splits:1
20/03/31 17:19:19 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
20/03/31 17:19:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1585642380845_0001
20/03/31 17:19:22 INFO impl.YarnClientImpl: Submitted application application_1585642380845_0001
20/03/31 17:19:22 INFO mapreduce.Job: The url to track the job: http://master:18088/proxy/application_1585642380845_0001/
20/03/31 17:19:22 INFO mapreduce.Job: Running job: job_1585642380845_0001
20/03/31 17:19:57 INFO mapreduce.Job: Job job_1585642380845_0001 running in uber mode : false
20/03/31 17:19:57 INFO mapreduce.Job:  map 0% reduce 0%
20/03/31 17:20:16 INFO mapreduce.Job:  map 100% reduce 0%
20/03/31 17:20:18 INFO mapreduce.Job: Job job_1585642380845_0001 completed successfully
20/03/31 17:20:18 INFO mapreduce.Job: Counters: 30
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=307992
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=92951
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=2
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Launched map tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=13249
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=13249
		Total vcore-milliseconds taken by all map tasks=13249
		Total megabyte-milliseconds taken by all map tasks=13566976
	Map-Reduce Framework
		Map input records=2050
		Map output records=2050
		Input split bytes=1457
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=1223
		CPU time spent (ms)=3290
		Physical memory (bytes) snapshot=195649536
		Virtual memory (bytes) snapshot=2117246976
		Total committed heap usage (bytes)=96468992
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=0
20/03/31 17:20:18 INFO mapreduce.ExportJobBase: Transferred 90.7725 KB in 78.4333 seconds (1.1573 KB/sec)
20/03/31 17:20:18 INFO mapreduce.ExportJobBase: Exported 2050 records.
20/03/31 17:20:18 INFO mapreduce.ExportJobBase: Publishing HCatalog export job data to Listeners

Verify on the database side:

MariaDB [xiandian]> select * from xd_phy_course;
+-------------+----------+---------------+--------------+
| stname      | stID     | class         | opt_cour     |
+-------------+----------+---------------+--------------+
...
| student2036 | 10122035 | ClOfih_1407   | football     |
| student2037 | 10122036 | ClOfih_1407   | football     |
| student2038 | 10122037 | ClOfih_1407   | table tennis |
| student2039 | 10122038 | ClOfih_1407   | badminton    |
| student2040 | 10122039 | ClOfih_1407   | volleyball   |
| student2041 | 10122040 | ClOfih_1407   | tennis       |
| student2042 | 10122041 | ClOfih_1407   | volleyball   |
| student2043 | 10122042 | ClOfih_1407   | volleyball   |
| student2044 | 10122043 | ClOfih_1407   | tennis       |
| student2045 | 10122044 | ClOfih_1407   | volleyball   |
| student2046 | 10122045 | ClOfih_1407   | basketball   |
| student2047 | 10122046 | ClOfih_1407   | taekwondo    |
| student2048 | 10122047 | ClOfih_1407   | taekwondo    |
| student2049 | 10122048 | ClOfih_1407   | table tennis |
+-------------+----------+---------------+--------------+
2050 rows in set (0.01 sec)

If you hit this error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hive/hcatalog/mapreduce/HCatOutputFormat

Cause: the HCAT_HOME path is not set.
Fix:
In the Hive installation directory, check for an hcatalog directory (it sits at the same level as bin).
Add its path to /etc/profile:

export HCAT_HOME=/usr/hive/apache-hive-2.1.1-bin/hcatalog

Then source the file. Every host that runs Sqoop needs this setting.
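A minimal sketch of that configuration step, assuming the Hive install path above:

[root@master ~]# echo 'export HCAT_HOME=/usr/hive/apache-hive-2.1.1-bin/hcatalog' >> /etc/profile
[root@master ~]# source /etc/profile
[root@master ~]# echo $HCAT_HOME
/usr/hive/apache-hive-2.1.1-bin/hcatalog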

  5. Create the xd_phy_course table in Hive with the structure shown in the table below. Then use Sqoop to import the xd_phy_course table from the xiandian database in MySQL into the xd_phy_course table in the Hive warehouse.

(Table: the Hive xd_phy_course schema, matching the CREATE statement below.)

hive> create table xd_phy_course (stname string,stID int,class string,opt_cour string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
OK
Time taken: 0.663 seconds
[root@master ~]# sqoop import --connect jdbc:mysql://slave2:3306/xiandian --username root --password 000000 --table xd_phy_course --hive-import --hive-database default --hive-table xd_phy_course --fields-terminated-by '\t' --lines-terminated-by '\n' -m 1;
20/03/31 17:50:58 INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory.
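To spot-check the import, count the rows in Hive; if the MySQL table still holds the 2050 records exported earlier, the same number should come back (an optional check, not part of the original task):

hive> select count(*) from xd_phy_course;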

If you hit this error:

2020-03-31 17:50:47,328 main ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")

Cause: a Java security policy restriction.
Fix:
In the jre/lib/security directory under the Java installation there is a java.policy file (mine is at /usr/java/jdk1.8.0_171/jre/lib/security).
Add the following line to the grant section of java.policy:

permission javax.management.MBeanTrustPermission "register";
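For context, the entry goes inside the default grant block of java.policy, roughly like this (other entries elided):

grant {
        // added for the Sqoop/Hive import: allow MBean registration
        permission javax.management.MBeanTrustPermission "register";
        ...
};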

Sqoop error: Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/Logical
Fix: copy avro-1.8.1.jar into Sqoop's lib directory.

Sqoop error: Could not find or load main class org.apache.sqoop.Sqoop
Fix: copy sqoop-1.4.7.jar into Sqoop's lib directory.

Sqoop error: ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: Could not load db driver class: com.mysql.jdbc.Driver
Fix: copy the MySQL JDBC driver jar into Sqoop's lib directory.
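All three fixes follow the same pattern: copy the missing jar into Sqoop's lib directory. A sketch, using the Sqoop install path from the warnings shown earlier; the source paths are illustrative and depend on where the jars live on your system (sqoop-1.4.7.jar ships at the top of the Sqoop tarball):

[root@master ~]# cp /path/to/avro-1.8.1.jar /usr/sqoop/sqoop-1.4.7/lib/
[root@master ~]# cp /usr/sqoop/sqoop-1.4.7/sqoop-1.4.7.jar /usr/sqoop/sqoop-1.4.7/lib/
[root@master ~]# cp /path/to/mysql-connector-java-5.1.x.jar /usr/sqoop/sqoop-1.4.7/lib/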

Thanks to XianDian Cloud for providing the question bank.
Thanks to Apache for the open-source software this relies on.
Thanks to the bloggers 抛物线, mn525520, and 菜鸟一枚2019 for their related posts.
