Sqoop 是一个开源工具,主要用于在Hadoop和传统的数据库(Mysql, Oracle,等)进行数据传递,可以将一个关系型数据库中的数据导入Hadoop 的HDFS中,也可以将HDFS中的数据导出到关系型数据库中。之前总结了sqoop1的部署安装,接下来总结一下sqoop2的部署和填坑
一、下载sqoop包
https://downloads.apache.org/sqoop/1.99.7/sqoop-1.99.7-bin-hadoop200.tar.gz
二、解压sqoop包并重命名
tar -zxvf sqoop-1.99.7-bin-hadoop200.tar.gz
mv sqoop-1.99.7-bin-hadoop200 sqoop2
三、配置环境变量:vim /etc/profile
sqoop需要hadoop的环境变量配置:$HADOOP_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_MAPRED_HOME,HADOOP_YARN_HOME
export HADOOP_HOME=/usr/local/hadoop-3.1.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME/share/hadoop/common
export HADOOP_HDFS_HOME=$HADOOP_HOME/share/hadoop/hdfs
export HADOOP_MAPRED_HOME=$HADOOP_HOME/share/hadoop/mapreduce
export HADOOP_YARN_HOME=$HADOOP_HOME/share/hadoop/yarn
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HIVE_HOME=/usr/local/hive-3.1.2
export PIG_HOME=/usr/local/pig-0.17.0
export PIG_CLASSPATH=$HADOOP_HOME/conf
export SQOOP_HOME=/usr/local/sqoop2
export SQOOP_SERVER_EXTRA_LIB=/usr/local/sqoop2/extlib/
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$PIG_HOME/bin:$SQOOP_HOME/bin:
备注:也可以直接把变量定义在sqoop.sh文件中
HADOOP_HOME=/usr/local/hadoop-3.1.2/
HADOOP_COMMON_HOME=$HADOOP_HOME/share/hadoop/common
HADOOP_HDFS_HOME=$HADOOP_HOME/share/hadoop/hdfs
HADOOP_MAPRED_HOME=$HADOOP_HOME/share/hadoop/mapreduce
HADOOP_YARN_HOME=$HADOOP_HOME/share/hadoop/yarn
四、hadoop的core-site.xml和container-executor.cfg配置
<!-- core-site.xml配置中追加代理用户 -->
<property>
<name>hadoop.proxyuser.sqoop2.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.sqoop2.groups</name>
<value>*</value>
</property>
<!-- container-executor.cfg 配置中 -->
allowed.system.users=sqoop2 #如果有多用 sqoop2,hadoop,root
五、修改sqoop配置文件 sqoop目录下conf/sqoop.properties
1、@LOGDIR@ 修改为 /usr/local/sqoop2/logs,并在sqoop2目录下创建logs文件夹
2、@BASEDIR@修改为/usr/local/sqoop2
六、添加sqoop2外部引入的jar包
新建目录mkdir /usr/local/sqoop2/extlib/ 把mysql驱动包和derby.jar包存放于该目录下;
开始挖坑了,按照官方文件 添加环境变量 export SQOOP_SERVER_EXTRA_LIB=/usr/local/sqoop2/extlib/
然后source /etc/profile 让文件失效;以上官方部署完成了,然后sqoop2-tool upgrade,直接异常。
第一没有配置org.apache.sqoop.submission.engine.mapreduce.configuration.directory配置文件目录
修改配置org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop-3.1.2/etc/hadoop/
第二次异常找不derby.jar到驱动包
2020-04-13 10:53:44,825 ERROR [org.apache.sqoop.tools.tool.UpgradeTool.runToolWithConfiguration(UpgradeTool.java:55)] Can't finish upgrading RepositoryManager, Driver and ConnectionManager:
org.apache.sqoop.common.SqoopException: JDBCREPO_0007:Unable to lease link
at org.apache.sqoop.repository.JdbcRepositoryTransaction.begin(JdbcRepositoryTransaction.java:63)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:86)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:62)
at org.apache.sqoop.repository.JdbcRepository.createOrUpgradeRepository(JdbcRepository.java:128)
at org.apache.sqoop.repository.RepositoryManager.initialize(RepositoryManager.java:128)
at org.apache.sqoop.tools.tool.UpgradeTool.runToolWithConfiguration(UpgradeTool.java:39)
at org.apache.sqoop.tools.ConfiguredTool.runTool(ConfiguredTool.java:35)
at org.apache.sqoop.tools.ToolRunner.main(ToolRunner.java:72)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:derby:/usr/local/sqoop2/repository/db;create=true
at java.sql.DriverManager.getConnection(DriverManager.java:689)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at org.apache.commons.dbcp.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:78)
at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1148)
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
at org.apache.sqoop.repository.JdbcRepositoryTransaction.begin(JdbcRepositoryTransaction.java:61)
修改sqoop配置文件,强行指定derby.jar包
修改sqoop.properties属性 指定加载jar目录 org.apache.sqoop.connector.external.loadpath=/usr/local/sqoop2/extlib/
具体配置如下
org.apache.sqoop.log4j.debu