Setting Up Sqoop 1.4.7 on CentOS 7 (Hadoop 3.0.3 + HBase + Hive + ZooKeeper)

1. Set up a Hadoop cluster on CentOS 7

               Reference: https://blog.csdn.net/zhouzhiwengang/article/details/94549964

2. Set up HBase on CentOS 7

               Reference: https://blog.csdn.net/zhouzhiwengang/article/details/94616635

3. Set up Hive on CentOS 7

               Reference: https://blog.csdn.net/zhouzhiwengang/article/details/94576029

4. Set up Sqoop on CentOS 7

Step 1: Find the Sqoop release. Download address: http://mirrors.hust.edu.cn/apache/sqoop/1.4.7/

Step 2: Download Sqoop from that address; I chose version 1.4.7 here.

Step 3: Upload the package with an FTP tool (WinSCP, FileZilla, etc.) to the target directory /usr/local/hadoop.

Step 4: Extract the archive into the current directory and rename the resulting folder:

tar zxvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz  # extract
mv sqoop-1.4.7.bin__hadoop-2.6.0 sqoop-1.4.7   # rename the folder

Step 5: Add global environment variables for Sqoop. Edit /etc/profile (vi /etc/profile) and append:

export SQOOP_HOME=/usr/local/hadoop/sqoop-1.4.7
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$HADOOP_HOME/bin:$SQOOP_HOME/bin:$ZOOKEEPER_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin

Run source /etc/profile to make the new environment variables take effect.
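To confirm the setup, a quick sanity check (the version printed should match the release you installed; warnings about optional variables such as HBASE_HOME being unset are harmless at this point):

echo $SQOOP_HOME
sqoop version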

Step 6: Go to the Sqoop configuration directory (/usr/local/hadoop/sqoop-1.4.7/conf):

cd /usr/local/hadoop/sqoop-1.4.7/conf

Make a copy of sqoop-env-template.sh named sqoop-env.sh, i.e. run:

cp sqoop-env-template.sh sqoop-env.sh

Edit the Sqoop configuration file sqoop-env.sh; I have annotated the relevant entries:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/usr/local/hadoop/hadoop-3.0.3  # Hadoop installation directory

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/usr/local/hadoop/hadoop-3.0.3  # Hadoop installation directory

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
export HIVE_HOME=/usr/local/hadoop/hive-2.3.5  # Hive installation directory

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=
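Since this stack also includes HBase and ZooKeeper, the two commented-out entries can be filled in as well. The paths below are assumptions that follow the /usr/local/hadoop layout used above; substitute your actual install directories:

export HBASE_HOME=/usr/local/hadoop/hbase-2.x.x           # assumed HBase install path
export ZOOCFGDIR=/usr/local/hadoop/zookeeper-3.x.x/conf   # assumed ZooKeeper conf dir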

Step 7: Copy the MySQL 8 driver jar into the lib folder under the Sqoop installation directory (/usr/local/hadoop/sqoop-1.4.7/lib).
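For example (the connector jar name and version are an assumption; use whichever MySQL Connector/J jar you actually downloaded):

cp mysql-connector-java-8.0.16.jar /usr/local/hadoop/sqoop-1.4.7/lib/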

Tip: Sqoop is a client-side tool. After installation, commands that do not touch Hive or Hadoop (such as sqoop help) can be run without starting either of them; to use the Hive-related commands, Hive and Hadoop must be started first, as sketched below.
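A minimal startup sketch, assuming the standard Hadoop sbin scripts and the Hive setup from the linked article (skip the metastore line if yours runs embedded):

start-dfs.sh                  # HDFS NameNode/DataNodes
start-yarn.sh                 # YARN ResourceManager/NodeManagers
hive --service metastore &    # standalone Hive metastore, if configured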

A quick tutorial for using Sqoop follows:

1. The help command: sqoop help

[root@master /]# sqoop help
2019-07-09 16:07:22,624 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.

2. List the tables in a given MySQL 8 database

The command is: sqoop list-tables --username admini --password '123456' --connect jdbc:mysql://192.168.60.206:3306/test?characterEncoding=UTF-8

[root@master /]# sqoop   list-tables  --username  admini --password  '123456'     --connect     jdbc:mysql://192.168.60.206:3306/test?characterEncoding=UTF-8
2019-07-09 16:09:14,346 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2019-07-09 16:09:14,424 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2019-07-09 16:09:14,539 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
user_test
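As the WARN line above notes, putting the password on the command line is insecure. Sqoop's -P option prompts for the password interactively instead (there is also --password-file for scripted runs):

sqoop list-tables -P --username admini --connect jdbc:mysql://192.168.60.206:3306/test?characterEncoding=UTF-8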

3. Import a table from a MySQL 8 database into Hive

The command is: sqoop import --connect jdbc:mysql://192.168.60.206:3306/test?characterEncoding=UTF-8 --username admini --password 123456 --table user_test -m 1 --hive-import --create-hive-table --hive-table emp_mysql

Here -m 1 runs the import with a single map task; --hive-import --create-hive-table --hive-table emp_mysql creates a new Hive table named emp_mysql and loads the imported data into it.

[root@master bin]# sqoop import  --connect  jdbc:mysql://192.168.60.206:3306/test?characterEncoding=UTF-8  --username admini  --password 123456  --table user_test  -m 1  --hive-import  --create-hive-table   --hive-table  emp_mysql
2019-07-09 15:02:36,323 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
2019-07-09 15:02:36,398 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
2019-07-09 15:02:36,399 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
2019-07-09 15:02:36,399 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
2019-07-09 15:02:36,522 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
2019-07-09 15:02:36,531 INFO tool.CodeGenTool: Beginning code generation
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
2019-07-09 15:02:37,124 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user_test` AS t LIMIT 1
2019-07-09 15:02:37,165 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user_test` AS t LIMIT 1
2019-07-09 15:02:37,175 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/local/hadoop/hadoop-3.0.3
Note: /tmp/sqoop-root/compile/68589ffcf4797f9934b62d0918a31c53/user_test.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2019-07-09 15:02:40,008 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/68589ffcf4797f9934b62d0918a31c53/user_test.jar
2019-07-09 15:02:40,023 WARN manager.MySQLManager: It looks like you are importing from mysql.
2019-07-09 15:02:40,023 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
2019-07-09 15:02:40,023 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
2019-07-09 15:02:40,023 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
2019-07-09 15:02:40,032 INFO mapreduce.ImportJobBase: Beginning import of user_test
2019-07-09 15:02:40,033 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
2019-07-09 15:02:40,174 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
2019-07-09 15:02:41,056 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2019-07-09 15:02:41,192 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.60.204:8032
2019-07-09 15:02:42,312 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1562653081455_0001
2019-07-09 15:02:50,559 INFO db.DBInputFormat: Using read commited transaction isolation
2019-07-09 15:02:51,051 INFO mapreduce.JobSubmitter: number of splits:1
2019-07-09 15:02:51,128 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
2019-07-09 15:02:51,639 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1562653081455_0001
2019-07-09 15:02:51,645 INFO mapreduce.JobSubmitter: Executing with tokens: []
2019-07-09 15:02:51,905 INFO conf.Configuration: resource-types.xml not found
2019-07-09 15:02:51,905 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
2019-07-09 15:02:52,608 INFO impl.YarnClientImpl: Submitted application application_1562653081455_0001
2019-07-09 15:02:52,682 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1562653081455_0001/
2019-07-09 15:02:52,682 INFO mapreduce.Job: Running job: job_1562653081455_0001
2019-07-09 15:03:03,997 INFO mapreduce.Job: Job job_1562653081455_0001 running in uber mode : false
2019-07-09 15:03:04,003 INFO mapreduce.Job:  map 0% reduce 0%
2019-07-09 15:03:12,250 INFO mapreduce.Job:  map 100% reduce 0%
2019-07-09 15:03:13,276 INFO mapreduce.Job: Job job_1562653081455_0001 completed successfully
2019-07-09 15:03:13,465 INFO mapreduce.Job: Counters: 32
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=211175
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=87
		HDFS: Number of bytes written=17
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters 
		Launched map tasks=1
		Other local map tasks=1
		Total time spent by all maps in occupied slots (ms)=4996
		Total time spent by all reduces in occupied slots (ms)=0
		Total time spent by all map tasks (ms)=4996
		Total vcore-milliseconds taken by all map tasks=4996
		Total megabyte-milliseconds taken by all map tasks=5115904
	Map-Reduce Framework
		Map input records=2
		Map output records=2
		Input split bytes=87
		Spilled Records=0
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=533
		CPU time spent (ms)=2520
		Physical memory (bytes) snapshot=225423360
		Virtual memory (bytes) snapshot=2804236288
		Total committed heap usage (bytes)=159383552
		Peak Map Physical memory (bytes)=225423360
		Peak Map Virtual memory (bytes)=2804236288
	File Input Format Counters 
		Bytes Read=0
	File Output Format Counters 
		Bytes Written=17
2019-07-09 15:03:13,488 INFO mapreduce.ImportJobBase: Transferred 17 bytes in 32.414 seconds (0.5245 bytes/sec)
2019-07-09 15:03:13,494 INFO mapreduce.ImportJobBase: Retrieved 2 records.
2019-07-09 15:03:13,494 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table user_test
2019-07-09 15:03:13,552 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `user_test` AS t LIMIT 1
2019-07-09 15:03:13,570 INFO hive.HiveImport: Loading uploaded data into Hive
2019-07-09 15:03:13,582 INFO conf.HiveConf: Found configuration file null
2019-07-09 15:03:20,586 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.
2019-07-09 15:03:20,587 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/hadoop/hive-2.3.5/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2019-07-09 15:03:20,587 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-3.0.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
2019-07-09 15:03:20,587 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2019-07-09 15:03:20,617 INFO hive.HiveImport: SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
2019-07-09 15:03:26,335 INFO hive.HiveImport: 
2019-07-09 15:03:26,335 INFO hive.HiveImport: Logging initialized using configuration in jar:file:/usr/local/hadoop/hive-2.3.5/lib/hive-common-2.3.5.jar!/hive-log4j2.properties Async: true
2019-07-09 15:03:36,139 INFO hive.HiveImport: OK
2019-07-09 15:03:36,147 INFO hive.HiveImport: Time taken: 8.225 seconds
2019-07-09 15:03:36,740 INFO hive.HiveImport: Loading data to table default.emp_mysql
2019-07-09 15:03:38,031 INFO hive.HiveImport: OK
2019-07-09 15:03:38,031 INFO hive.HiveImport: Time taken: 1.876 seconds
2019-07-09 15:03:38,610 INFO hive.HiveImport: Hive import complete.
2019-07-09 15:03:38,629 INFO hive.HiveImport: Export directory is contains the _SUCCESS file only, removing the directory.
Verify the import in the Hive shell:

[root@master bin]# pwd
/usr/local/hadoop/hive-2.3.5/bin
[root@master bin]# hive shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hive-2.3.5/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop-3.0.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/usr/local/hadoop/hive-2.3.5/lib/hive-common-2.3.5.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> use databases;
FAILED: SemanticException [Error 10072]: Database does not exist: databases
hive> show databases;
OK
db_hive_edu
default
user_test
Time taken: 0.294 seconds, Fetched: 3 row(s)
hive> use default;
OK
Time taken: 0.078 seconds
hive> show tables;
OK
emp_mysql
pokes
Time taken: 0.083 seconds, Fetched: 2 row(s)
hive> select * from emp_mysql;
OK
28	1	zzg
28	2	wz
Time taken: 1.579 seconds, Fetched: 2 row(s)
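The reverse direction works the same way with sqoop export (listed in the sqoop help output above). A minimal sketch, assuming the table's files sit under the default Hive warehouse path and that a target table, here the hypothetical user_test_copy, already exists in MySQL; note the import above used Hive's default ^A (\001) field delimiter (see the "Hive-specific delimiters" log lines):

sqoop export --connect jdbc:mysql://192.168.60.206:3306/test?characterEncoding=UTF-8 \
  --username admini -P \
  --table user_test_copy \
  --export-dir /user/hive/warehouse/emp_mysql \
  --input-fields-terminated-by '\001' \
  -m 1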

 
