- Edit build.xml under ${SQOOP_HOME}/ so that it matches your Hadoop version. Locate the following block:
<elseif>
<equals arg1="${hadoopversion}" arg2="200" />
<then>
<!-- <property name="hadoop.version" value="2.0.4-alpha" />-->
<!--<property name="hbase.version" value="0.94.2" />-->
<!-- <property name="zookeeper.version" value="3.4.2" />-->
<!--<property name="hadoop.version.full" value="2.0.4-alpha" />-->
<!--<property name="hcatalog.version" value="0.11.0" />-->
<property name="hadoop.version" value="2.7.1" />
<property name="hbase.version" value="0.98.17" />
<property name="zookeeper.version" value="3.4.7" />
<property name="hadoop.version.full" value="2.7.1" />
<property name="hcatalog.version" value="0.11.0" />
</then>
</elseif>
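To double-check the edit, the active values can be read back out of build.xml. The following is a minimal sketch, assuming each `<property>` element sits on a single line as in the listing above. `ant_property_value` is a hypothetical helper name, not part of Sqoop or Ant. build.xml defines these properties once per `hadoopversion` branch, so the helper prints one line for every uncommented occurrence it finds.

```shell
# Print the value of every uncommented <property name="NAME" value="..."/>
# found in an Ant build file.
# Usage: ant_property_value FILE NAME
# Assumption: one <property> element per line; any line containing "<!--"
# is treated as commented out and skipped.
ant_property_value() {
  grep -v '<!--' "$1" \
    | sed -n "s/.*<property name=\"$2\" value=\"\([^\"]*\)\".*/\1/p"
}

# Example (not executed here):
#   ant_property_value "${SQOOP_HOME}/build.xml" hadoop.version
```

If the output still lists the old version for the branch you edited, the commented-out lines and the new lines were probably swapped.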
- Configure ${SQOOP_HOME}/conf/sqoop-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put Sqoop-specific properties in this file. -->
<configuration>
<!--
Set the value of this property to explicitly enable third-party ManagerFactory plugins.
If this is not used, you can alternately specify a set of ManagerFactories in the $SQOOP_CONF_DIR/managers.d/ subdirectory.
Each file should contain one or more lines like: manager.class.name[=/path/to/containing.jar]
Files will be consulted in lexicographical order only if this property is unset.
-->
<property>
<name>sqoop.connection.factories</name>
<value>com.cloudera.sqoop.manager.DefaultManagerFactory</value>
<description>A comma-delimited list of ManagerFactory implementations which are
consulted, in order, to instantiate ConnManager instances used to drive connections to databases.
</description>
</property>
<!--
Set the value of this property to enable third-party tools.
If this is not used, you can alternately specify a set of ToolPlugins in the $SQOOP_CONF_DIR/tools.d/ subdirectory.
Each file should contain one or more lines like: plugin.class.name[=/path/to/containing.jar]
Files will be consulted in lexicographical order only if this property is unset.
-->
<property>
<name>sqoop.tool.plugins</name>
<value></value>
<description>A comma-delimited list of ToolPlugin implementations which are consulted, in order, to register SqoopTool instances which allow third-party tools to be used.
</description>
</property>
<!--
By default, the Sqoop metastore will auto-connect to a local embedded database stored in ~/.sqoop/.
To disable metastore auto-connect, set the next property to false.
-->
<property>
<name>sqoop.metastore.client.enable.autoconnect</name>
<!--<value>false</value>-->
<!-- When set to true, job operations no longer need the meta-connect argument. -->
<value>true</value>
<description>If true, Sqoop will connect to a local metastore for job management when no other metastore arguments are provided.
</description>
</property>
<!--
The auto-connect metastore is stored in ~/.sqoop/.
Uncomment these next arguments to control the auto-connect process with greater precision.
-->
<property>
<name>sqoop.metastore.client.autoconnect.url</name>
<value>jdbc:hsqldb:hsql://hadoop03:16000/sqoop</value>
<description>The connect string to use when connecting to a job-management metastore. If unspecified, uses ~/.sqoop/.
You can specify a different path here.
</description>
</property>
<property>
<name>sqoop.metastore.client.autoconnect.username</name>
<!--<value>SA</value>-->
<value>SA</value>
<description>The username to bind to the metastore.
</description>
</property>
<property>
<name>sqoop.metastore.client.autoconnect.password</name>
<value></value>
<description>The password to bind to the metastore.
</description>
</property>
<!--
For security reasons, by default your database password will not be stored in the Sqoop metastore.
When executing a saved job, you will need to reenter the database password.
Uncomment this setting to enable saved password storage. (INSECURE!)
-->
<property>
<name>sqoop.metastore.client.record.password</name>
<value>true</value>
<description>If true, allow saved passwords in the metastore.
</description>
</property>
<!--
SERVER CONFIGURATION: If you plan to run a Sqoop metastore on this machine, you should uncomment and set these parameters appropriately.
You should then configure clients with: sqoop.metastore.client.autoconnect.url = jdbc:hsqldb:hsql://<server-name>:<port>/sqoop
-->
<property>
<name>sqoop.metastore.server.location</name>
<value>/home/software/sqoop/sqoop-metastore/shared.db</value>
<description>Path to the shared metastore database files.
If this is not set, it will be placed in ~/.sqoop/.
</description>
</property>
<property>
<name>sqoop.metastore.server.port</name>
<value>16000</value>
<description>Port that this metastore should listen on.
</description>
</property>
</configuration>
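Because the client URL and the server port in this file have to agree, it can help to read individual settings back after editing. The following is a rough sketch, assuming `<name>` and `<value>` each sit on their own line as in the listing above. `site_property` is a hypothetical helper name and the parsing is line-based, not a real XML parser.

```shell
# Print the <value> of one named property from a Hadoop-style *-site.xml.
# Usage: site_property FILE PROPERTY_NAME
# Assumptions: <name> and <value> are each on their own line, and
# commented-out values are single-line comments.
site_property() {
  awk -v want="$2" '
    # Skip single-line comments such as <!--<value>false</value>-->
    /<!--.*-->/ { next }
    # Remember the most recent property name.
    /<name>/  { gsub(/.*<name>|<\/name>.*/, ""); name = $0 }
    # Print the first value that follows the wanted name, then stop.
    /<value>/ {
      if (name == want) { gsub(/.*<value>|<\/value>.*/, ""); print; exit }
    }
  ' "$1"
}

# Example (not executed here):
#   site_property "${SQOOP_HOME}/conf/sqoop-site.xml" sqoop.metastore.server.port
```

With the configuration above, the port printed for `sqoop.metastore.server.port` should match the port inside `sqoop.metastore.client.autoconnect.url`.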
- Configure ${SQOOP_HOME}/conf/sqoop-env.sh:
# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*
# Set Hadoop-specific environment variables here.
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop-2.7.1
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop-2.7.1
#set the path to where bin/hbase is available
export HBASE_HOME=/home/software/hbase-0.98.17-hadoop2
#Set the path to where bin/hive is available
export HIVE_HOME=/home/software/apache-hive-1.2.0-bin
#Set the path for where zookeeper config dir is
export ZOOCFGDIR=/home/software/zookeeper-3.4.7
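A mistyped path in sqoop-env.sh usually only shows up later as a confusing runtime error, so a quick existence check can save time. The following is a minimal sketch; `check_dirs` is a hypothetical helper name, and it only verifies that each variable points at an existing directory, not that the directory holds the right software.

```shell
# Report whether each named environment variable points at an existing directory.
# Usage: check_dirs VAR_NAME...
# Returns 0 if every directory exists, 1 otherwise.
check_dirs() {
  missing=0
  for var in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $var.
    eval "dir=\${$var:-}"
    if [ -d "$dir" ]; then
      echo "ok $var=$dir"
    else
      echo "MISSING $var=$dir"
      missing=1
    fi
  done
  return "$missing"
}

# Example (not executed here):
#   . "${SQOOP_HOME}/conf/sqoop-env.sh"
#   check_dirs HADOOP_COMMON_HOME HADOOP_MAPRED_HOME HBASE_HOME HIVE_HOME ZOOCFGDIR
```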
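- Distribute the configuration to the other machines in the cluster. hadoop03 is the node that stores job data, so the other clients can share it.
- Configure the environment variables.
- Run sqoop version to check the version information.
- Start the metastore on hadoop03: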
nohup sqoop-metastore &
Once it has started successfully, the sqoop process is visible.
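Since `nohup ... &` returns immediately, a script that goes on to use the metastore may want to wait until the server actually answers. The following is a small sketch of a retry loop; `wait_until` is a hypothetical helper name, and the `nc -z` probe in the usage comment assumes a netcat build that supports the `-z` flag.

```shell
# Run a probe command up to N times, one second apart, until it succeeds.
# Usage: wait_until TRIES COMMAND [ARGS...]
# Returns 0 as soon as the probe succeeds, 1 if every attempt fails.
wait_until() {
  tries="$1"
  shift
  while [ "$tries" -gt 0 ]; do
    if "$@"; then
      return 0
    fi
    tries=$((tries - 1))
    # Pause only if another attempt is still coming.
    if [ "$tries" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}

# Example (not executed here), probing the port set in sqoop-site.xml:
#   wait_until 10 nc -z hadoop03 16000 && echo "metastore is listening"
```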
- List the saved jobs:
sqoop job \
--meta-connect jdbc:hsqldb:hsql://hadoop03:16000/sqoop \
--list
Because sqoop-site.xml configures Metastore auto-connect, job operations no longer need the --meta-connect argument.
Running sqoop job --list is enough to view the list.
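On a client where auto-connect is not configured, the connect string still has to be passed explicitly. The following sketch builds it from a host and a port, using the URL format given in the sqoop-site.xml comment; `metastore_url` is a hypothetical helper name.

```shell
# Build the metastore connect string from a host name and a port.
# Format, as given in sqoop-site.xml:
#   jdbc:hsqldb:hsql://<server-name>:<port>/sqoop
# Usage: metastore_url HOST PORT
metastore_url() {
  printf 'jdbc:hsqldb:hsql://%s:%s/sqoop\n' "$1" "$2"
}

# Example (not executed here), equivalent to the explicit command above:
#   sqoop job --meta-connect "$(metastore_url hadoop03 16000)" --list
```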