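Use Sqoop to set up a working connection between MySQL and Hadoop (Hive) so that data can be converted and transferred in both directions.
Follow 真正了解sqoop的一切 - 简书 (jianshu.com) to download and install Sqoop. We download, extract, and install Sqoop on the Client1 machine.
The official Sqoop project is no longer updated, and downloads have moved to Apache Sqoop - Apache Attic.
Download Sqoop 1; the latest version is 1.4.7.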
tar xfvz sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz
mv sqoop-1.4.7.bin__hadoop-2.6.0 /root/hadoop
Edit the configuration file.
cd hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/conf
cp sqoop-env-template.sh sqoop-env.sh
vim sqoop-env.sh
export HADOOP_COMMON_HOME=/root/hadoop/hadoop-2.7.7
export HADOOP_MAPRED_HOME=/root/hadoop/hadoop-2.7.7
export HIVE_HOME=/root/hadoop/apache-hive-2.3.4-bin
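A mistyped path in sqoop-env.sh tends to surface only later as a confusing runtime failure, so it can help to check the directories right after editing the file. The following is a minimal sketch; the helper name check_env_dirs is hypothetical and not part of Sqoop.

```shell
# check_env_dirs: hypothetical helper, not part of Sqoop.
# Takes directory paths as arguments, prints each one that does not
# exist, and returns non-zero if any of them is missing.
check_env_dirs() {
    local missing=0 dir
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            missing=1
        fi
    done
    return "$missing"
}

# Possible usage, run from the Sqoop conf directory:
#   . ./sqoop-env.sh
#   check_env_dirs "$HADOOP_COMMON_HOME" "$HADOOP_MAPRED_HOME" "$HIVE_HOME"
```

If every directory exists, the function prints nothing and returns 0.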
Edit the environment variables in .bashrc.
export SQOOP_HOME=/root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0
export PATH=$PATH:$SQOOP_HOME/bin
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HADOOP_CLASSPATH=$HADOOP_HOME/lib
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*
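After reloading .bashrc (for example with "source ~/.bashrc"), it may be worth confirming that the Sqoop bin directory really ended up on PATH. A small sketch, assuming a POSIX-style shell; path_contains is a hypothetical helper name.

```shell
# path_contains: hypothetical helper.
# $1: directory to look for
# $2: colon-separated list to search (defaults to $PATH)
# Returns 0 only if $1 is an exact entry in the list.
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible usage:
#   path_contains "$SQOOP_HOME/bin" && echo "sqoop bin is on PATH"
```

Wrapping the list in colons makes the match exact, so a directory that is only a prefix of another entry does not count.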
Copy hive-site.xml into the Sqoop conf directory.
cp hadoop/apache-hive-2.3.4-bin/conf/hive-site.xml hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/conf
Modify the Java security policy.
cd /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/security
vi java.policy, then add the following line inside the grant block of java.policy:
permission javax.management.MBeanTrustPermission "register";
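To confirm that the edit landed, the policy file can be searched for the exact permission line. This is a minimal sketch; has_mbean_permission is a hypothetical helper name, and the grant block in the comment only illustrates where the line sits.

```shell
# The permission belongs inside the existing grant block, roughly:
#   grant {
#       ...existing permissions...
#       permission javax.management.MBeanTrustPermission "register";
#   };

# has_mbean_permission: hypothetical helper.
# $1: path to a java.policy file
# Returns 0 if the file contains the MBeanTrustPermission line.
has_mbean_permission() {
    grep -Fq 'permission javax.management.MBeanTrustPermission "register";' "$1"
}

# Possible usage:
#   has_mbean_permission /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/security/java.policy \
#       && echo "permission present"
```

The check uses a fixed-string match (grep -F), so the quotes and dots in the permission line are not treated as regular-expression syntax.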
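The jackson jars bundled with Sqoop turn out to conflict with Hive's. Back up all jackson jars in the Sqoop lib directory, then copy the jackson jars from the Hive lib directory into the Sqoop lib directory.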
cd hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/lib
mkdir jackson-backup
mv jackson*.jar jackson-backup
cp /root/hadoop/apache-hive-2.3.4-bin/lib/jackson*.jar .
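The same backup-and-copy steps can be wrapped in a function with a couple of guards, so that a rerun does not move Hive's jars into the backup on top of the originals. This is a sketch under that assumption; swap_jackson_jars is a hypothetical helper name and takes both lib directories as arguments.

```shell
# swap_jackson_jars: hypothetical helper.
# $1: Sqoop lib directory, $2: Hive lib directory
# Moves Sqoop's jackson jars into $1/jackson-backup and copies Hive's
# jackson jars into $1. Refuses to run if the backup already exists
# or if Hive has no jackson jars to copy.
swap_jackson_jars() {
    local sqoop_lib="$1" hive_lib="$2"
    local backup="$sqoop_lib/jackson-backup"

    if [ -d "$backup" ]; then
        echo "backup already exists: $backup" >&2
        return 1
    fi
    if ! ls "$hive_lib"/jackson*.jar >/dev/null 2>&1; then
        echo "no jackson jars found in $hive_lib" >&2
        return 1
    fi

    mkdir "$backup"
    # Move Sqoop's own jackson jars aside, if there are any.
    if ls "$sqoop_lib"/jackson*.jar >/dev/null 2>&1; then
        mv "$sqoop_lib"/jackson*.jar "$backup"/
    fi
    cp "$hive_lib"/jackson*.jar "$sqoop_lib"/
}

# Possible usage with the paths from this walkthrough:
#   swap_jackson_jars /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/lib \
#                     /root/hadoop/apache-hive-2.3.4-bin/lib
```

To roll back, delete the copied jackson jars from the Sqoop lib directory and move the files in jackson-backup back.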
For usage, see 大数据开发之Sqoop详细介绍 - 知乎 (zhihu.com).
Test Sqoop.
root@client1:~# sqoop help
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/hadoop/apache-zookeeper-2.6.3-bin does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
22/05/23 11:20:46 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
usage: sqoop COMMAND [ARGS]
Available commands:
codegen Generate code to interact with database records
create-hive-table Import a table definition into Hive
eval Evaluate a SQL statement and display the results
export Export an HDFS directory to a database table
help List available commands
import Import a table from a database to HDFS
import-all-tables Import tables from a database to HDFS
import-mainframe Import datasets from a mainframe server to HDFS
job Work with saved jobs
list-databases List available databases on a server
list-tables List available tables in a database
merge Merge results of incremental imports
metastore Run a standalone Sqoop metastore
version Display version information
See 'sqoop help COMMAND' for information on a specific command.
root@client1:~#
root@client1:~# sqoop list-databases --connect jdbc:mysql://mysqls:3306/ --username root -P
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /root/hadoop/apache-zookeeper-2.6.3-bin does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
22/05/23 11:22:24 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Enter password:
22/05/23 11:22:27 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/root/hadoop/hadoop-2.7.7/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/hadoop/apache-hive-2.3.4-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Mon May 23 11:22:27 CST 2022 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
information_schema
hive
mysql
performance_schema
shtd_store
sys
root@client1:~#
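The SSL warning in the output above names its own remedy: set useSSL explicitly. On a trusted internal network, one option is to append useSSL=false to the JDBC URL. The sketch below only builds such a URL; mysql_jdbc_url is a hypothetical helper name, and whether disabling SSL is acceptable depends on the environment.

```shell
# mysql_jdbc_url: hypothetical helper.
# $1: host, $2: port, $3: optional database name
# Prints a MySQL JDBC URL with SSL explicitly disabled.
mysql_jdbc_url() {
    local host="$1" port="$2" db="${3:-}"
    printf 'jdbc:mysql://%s:%s/%s?useSSL=false\n' "$host" "$port" "$db"
}

# Possible usage (keep the quotes: "?" is a glob character in the shell):
#   sqoop list-databases --connect "$(mysql_jdbc_url mysqls 3306)" --username root -P
```

For the host and port used here, the helper prints jdbc:mysql://mysqls:3306/?useSSL=false.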
Import a MySQL table into Hive.
root@client1:~# sqoop import --hive-import --hive-overwrite --connect jdbc:mysql://mysqls:3306/shtd_store --username hive -P --hive-database shtd_store --table abc --hive-table abc -m 1
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /root/hadoop/sqoop-1.4.7.bin__hadoop-2.6.0/../accumulo does not e