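1. What is Sqoop
Sqoop is an open-source tool for transferring data between Hadoop and traditional relational databases such as MySQL and Oracle. It can import data from a relational database into Hadoop's HDFS, and it can export data from HDFS back into a relational database. Connectors are also available for some NoSQL databases.
In short, Sqoop is a tool for moving data between relational databases and HDFS.
Sqoop official site: http://sqoop.apache.org/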
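To make the two directions concrete, here is a minimal sketch of one import and one export. The host dbhost, the database shop, the user etl, the tables orders and orders_copy, and the HDFS path are all hypothetical placeholders, not values from this article. The script only prints the commands so they can be reviewed before being run against a real cluster.

```shell
#!/bin/sh
# Hypothetical connection details; replace with your own.
JDBC_URL="jdbc:mysql://dbhost:3306/shop"
DB_USER="etl"
HDFS_DIR="/user/hadoop/orders"

# Import: relational table -> HDFS directory (-P prompts for the password).
IMPORT_CMD="sqoop import --connect $JDBC_URL --username $DB_USER -P --table orders --target-dir $HDFS_DIR"

# Export: HDFS directory -> an existing relational table.
EXPORT_CMD="sqoop export --connect $JDBC_URL --username $DB_USER -P --table orders_copy --export-dir $HDFS_DIR"

# Print the commands for review instead of running them.
echo "$IMPORT_CMD"
echo "$EXPORT_CMD"
```

Note that the MySQL JDBC driver jar typically has to be placed in Sqoop's lib directory before either command can connect.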
2. Install Sqoop
Download sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[root@single java]# tar zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz   # extract the archive
[root@single java]# mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha sqoop   # rename the extracted directory
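Before moving on, it can help to confirm that the renamed directory really holds a Sqoop tree. The sketch below assumes the directory is /usr/java/sqoop (inferred from the prompt and the SQOOP_HOME value used in this article) and that a valid tree contains bin/sqoop and conf/. The function name check_sqoop_layout is made up for illustration.

```shell
#!/bin/sh
# Return 0 if the given directory looks like an unpacked Sqoop 1.4.x tree.
check_sqoop_layout() {
    dir="$1"
    [ -f "$dir/bin/sqoop" ] && [ -d "$dir/conf" ]
}

# Assumed location; override with SQOOP_DIR=... if you unpacked elsewhere.
SQOOP_DIR="${SQOOP_DIR:-/usr/java/sqoop}"
if check_sqoop_layout "$SQOOP_DIR"; then
    echo "Sqoop tree found in $SQOOP_DIR"
else
    echo "bin/sqoop or conf/ missing under $SQOOP_DIR" >&2
fi
```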
3. Configure Sqoop
Set SQOOP_HOME
[root@single java]# vi /etc/profile
export SQOOP_HOME=/usr/java/sqoop   # add this variable
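The article sets only SQOOP_HOME. A common companion step, assumed here rather than taken from the article, is to append $SQOOP_HOME/bin to PATH so that sqoop can be called without its full path, then reload the profile and check that the launcher is found:

```shell
# Lines as they might appear at the end of /etc/profile
export SQOOP_HOME=/usr/java/sqoop
export PATH="$PATH:$SQOOP_HOME/bin"

# After `source /etc/profile`, confirm the shell can find the launcher.
if command -v sqoop >/dev/null 2>&1; then
    sqoop version
else
    echo "sqoop not found on PATH; check SQOOP_HOME=$SQOOP_HOME" >&2
fi
```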
Go to the sqoop/conf directory and copy sqoop-env-template.sh to sqoop-env.sh
[root@single conf]# cp sqoop-env-template.sh sqoop-env.sh
[root@single conf]# vi sqoop-env.sh   # edit sqoop-env.sh
# Set Hadoop-specific environment variables here.
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/usr/java/hadoop-2.6.2
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/usr/java/hadoop-2.6.2
#set the path to where bin/hbase is available
#export HBASE_HOME=
#Set the path to where bin/hive is available
#export HIVE_HOME=
#Set the path for where zookeper config dir is
#export ZOOCFGDIR=
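If you would rather script this step than edit the file in vi, the following is one possible sketch. The function name write_sqoop_env is hypothetical. It copies the template only when sqoop-env.sh does not exist yet and appends the two Hadoop exports instead of uncommenting the template lines.

```shell
#!/bin/sh
# Create sqoop-env.sh from the template (if missing) and append the Hadoop paths.
# $1 = Sqoop conf directory, $2 = Hadoop installation directory.
write_sqoop_env() {
    conf_dir="$1"
    hadoop_home="$2"
    [ -f "$conf_dir/sqoop-env.sh" ] ||
        cp "$conf_dir/sqoop-env-template.sh" "$conf_dir/sqoop-env.sh" ||
        return 1
    {
        echo "export HADOOP_COMMON_HOME=$hadoop_home"
        echo "export HADOOP_MAPRED_HOME=$hadoop_home"
    } >> "$conf_dir/sqoop-env.sh"
}

# Example call with the paths used in this article:
#   write_sqoop_env /usr/java/sqoop/conf /usr/java/hadoop-2.6.2
```

Calling the function twice appends the exports twice, which is harmless but untidy, so run it once per installation.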
The conf directory contains two files, sqoop-site.xml and sqoop-site-template.xml, whose contents are identical. The duplication can be ignored; only sqoop-site.xml matters here.
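In bin/configure-sqoop, comment out the checks for HBase, ZooKeeper, and similar optional components, because none of these Hadoop add-ons are in use yet. For example, the HBase, HCatalog, and Accumulo checks look like this once commented out: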
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HBASE_HOME}" ]; then
# echo "Warning: $HBASE_HOME does not exist! HBase imports will fail."
# echo 'Please set $HBASE_HOME to the root of your HBase installation.'
#fi
## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
# echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
# echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
# echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
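The same edit can be sketched with sed instead of commenting the lines by hand. This assumes each check has the form shown above: it starts at column 0 with `if [ ! -d "${VAR}" ]; then` and ends with `fi` at column 0. The function name comment_out_check is made up for illustration. It writes the result to standard output, so the original file stays untouched until you have reviewed the result.

```shell
#!/bin/sh
# Print FILE with the `if [ ! -d "${VAR}" ] ... fi` block commented out.
# $1 = variable name (for example HBASE_HOME), $2 = file to read.
comment_out_check() {
    var="$1"
    file="$2"
    # Prefix every line from the matching `if` through the next `fi` with '#'.
    sed "/^if \[ ! -d .*${var}/,/^fi/ s/^/#/" "$file"
}

# Example, run from the Sqoop bin directory:
#   comment_out_check HBASE_HOME configure-sqoop > configure-sqoop.new
```

Other blocks that mention the same variable, such as ones testing it with -z, are left alone because the pattern requires the `! -d` test.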
Reference: http://www.open-open.com/lib/view/open1401346410480.html#_label0