Sqoop 1.X Configuration

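  • Edit build.xml in the ${SQOOP_HOME}/ directory to match your own Hadoop version. Find the following block (the original defaults are commented out and replaced with the versions in use):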
    <elseif>
      <equals arg1="${hadoopversion}" arg2="200" />
      <then>
        <!-- <property name="hadoop.version" value="2.0.4-alpha" /> -->
        <!-- <property name="hbase.version" value="0.94.2" /> -->
        <!-- <property name="zookeeper.version" value="3.4.2" /> -->
        <!-- <property name="hadoop.version.full" value="2.0.4-alpha" /> -->
        <!-- <property name="hcatalog.version" value="0.11.0" /> -->
        
        <property name="hadoop.version" value="2.7.1" />
        <property name="hbase.version" value="0.98.17" />
        <property name="zookeeper.version" value="3.4.7" />
        <property name="hadoop.version.full" value="2.7.1" />
        <property name="hcatalog.version" value="0.11.0" />
      </then>
    </elseif>
  • Configure ${SQOOP_HOME}/conf/sqoop-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put Sqoop-specific properties in this file. -->

<configuration>
  <!--
    Set the value of this property to explicitly enable third-party
    ManagerFactory plugins.

    If this is not used, you can alternately specify a set of ManagerFactories
    in the $SQOOP_CONF_DIR/managers.d/ subdirectory. Each file should contain
    one or more lines like:
      manager.class.name[=/path/to/containing.jar]

    Files will be consulted in lexicographical order only if this property
    is unset.
  -->
  <property>
    <name>sqoop.connection.factories</name>
    <value>com.cloudera.sqoop.manager.DefaultManagerFactory</value>
    <description>A comma-delimited list of ManagerFactory implementations
      which are consulted, in order, to instantiate ConnManager instances
      used to drive connections to databases.
    </description>
  </property>

  <!--
    Set the value of this property to enable third-party tools.

    If this is not used, you can alternately specify a set of ToolPlugins
    in the $SQOOP_CONF_DIR/tools.d/ subdirectory. Each file should contain
    one or more lines like:
      plugin.class.name[=/path/to/containing.jar]

    Files will be consulted in lexicographical order only if this property
    is unset.
  -->
  <property>
    <name>sqoop.tool.plugins</name>
    <value></value>
    <description>A comma-delimited list of ToolPlugin implementations
      which are consulted, in order, to register SqoopTool instances which
      allow third-party tools to be used.
    </description>
  </property>

  <!--
    By default, the Sqoop metastore will auto-connect to a local embedded
    database stored in ~/.sqoop/. To disable metastore auto-connect, set
    this next property to false.
  -->
  <property>
    <name>sqoop.metastore.client.enable.autoconnect</name>
    <!--<value>false</value>-->
    <!-- With this set to true, job operations no longer need the meta-connect argument. -->
    <value>true</value>
    <description>If true, Sqoop will connect to a local metastore
      for job management when no other metastore arguments are provided.
    </description>
  </property>

  <!--
    The auto-connect metastore is stored in ~/.sqoop/. The next properties
    control the auto-connect process with greater precision.
  -->
  <property>
    <name>sqoop.metastore.client.autoconnect.url</name>
    <value>jdbc:hsqldb:hsql://hadoop03:16000/sqoop</value>
    <description>The connect string to use when connecting to a
      job-management metastore. If unspecified, uses ~/.sqoop/.
      You can specify a different path here.
    </description>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.username</name>
    <!--<value>SA</value>-->
    <value>SA</value>
    <description>The username to bind to the metastore.
    </description>
  </property>
  <property>
    <name>sqoop.metastore.client.autoconnect.password</name>
    <value></value>
    <description>The password to bind to the metastore.
    </description>
  </property>

  <!--
    For security reasons, by default your database password will not be
    stored in the Sqoop metastore. When executing a saved job, you will need
    to reenter the database password. The setting below enables saved
    password storage. (INSECURE!)
  -->
  <property>
    <name>sqoop.metastore.client.record.password</name>
    <value>true</value>
    <description>If true, allow saved passwords in the metastore.
    </description>
  </property>


  <!--
    SERVER CONFIGURATION: If you plan to run a Sqoop metastore on this
    machine, you should set these parameters appropriately.

    You should then configure clients with:
      sqoop.metastore.client.autoconnect.url =
      jdbc:hsqldb:hsql://&lt;server-name&gt;:&lt;port&gt;/sqoop
  -->
  <property>
    <name>sqoop.metastore.server.location</name>
    <value>/home/software/sqoop/sqoop-metastore/shared.db</value>
    <description>Path to the shared metastore database files.
      If this is not set, it will be placed in ~/.sqoop/.
    </description>
  </property>

  <property>
    <name>sqoop.metastore.server.port</name>
    <value>16000</value>
    <description>Port that this metastore should listen on.
    </description>
  </property>

</configuration>
  • Configure ${SQOOP_HOME}/conf/sqoop-env.sh:
# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop-2.7.1

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop-2.7.1

#set the path to where bin/hbase is available
export HBASE_HOME=/home/software/hbase-0.98.17-hadoop2

#Set the path to where bin/hive is available
export HIVE_HOME=/home/software/apache-hive-1.2.0-bin

#Set the path to where the ZooKeeper config dir is
export ZOOCFGDIR=/home/software/zookeeper-3.4.7
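  • Distribute the configuration to the other machines in the cluster. hadoop03 serves as the node that stores job data, so that the other clients can share it.
  • Configure the environment variables.
  • Run sqoop version to check the version information.
  • Start the metastore on hadoop03: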

nohup  sqoop-metastore  &

Once it has started successfully, a Sqoop process is visible.
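
Besides looking for the process, one way to confirm the start is to probe the metastore port. The following is a minimal sketch and not part of Sqoop: port_open is a hypothetical helper name, it assumes bash (it relies on bash's /dev/tcp redirection), and the host hadoop03 and port 16000 are the values set in the sqoop-site.xml above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Sqoop).
# Returns 0 if a TCP connection to host:port succeeds, non-zero otherwise.
# The connection is opened in a subshell, so the descriptor is closed
# again as soon as the check finishes.
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example, using the host and port from sqoop-site.xml:
#   port_open hadoop03 16000 && echo "metastore is listening"
```

A successful probe only shows that something is listening on the port; listing the saved jobs, as in the next step, is the more meaningful check.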

  • List the saved jobs:

sqoop job \
--meta-connect jdbc:hsqldb:hsql://hadoop03:16000/sqoop \
--list

Because metastore auto-connect is configured in sqoop-site.xml, the --meta-connect argument is no longer needed when working with jobs.

Running sqoop job --list is enough to view them.
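
On a client where auto-connect has not been configured, the full metastore URL still has to be passed on every call. A small wrapper can take care of that. This is only a sketch under stated assumptions: metastore_url, remote_job and the SQOOP_BIN override are hypothetical names introduced here for illustration, while sqoop job, --meta-connect and --list are the same command and options used above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Sqoop).

# Print the JDBC URL of a shared metastore.
# The port defaults to 16000, the value of sqoop.metastore.server.port above.
metastore_url() {
  local host="$1" port="${2:-16000}"
  printf 'jdbc:hsqldb:hsql://%s:%s/sqoop\n' "$host" "$port"
}

# Run "sqoop job" against the metastore on host:port.
# Any further arguments are passed through unchanged.
# SQOOP_BIN is an assumed override, useful for a dry run (SQOOP_BIN=echo).
remote_job() {
  local host="$1" port="$2"
  shift 2
  "${SQOOP_BIN:-sqoop}" job --meta-connect "$(metastore_url "$host" "$port")" "$@"
}

# Example: the same listing as above, with the URL assembled by the helper.
#   remote_job hadoop03 16000 --list
```

Setting SQOOP_BIN=echo prints the assembled arguments instead of running them, which makes it easy to check the URL before touching the metastore.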

 

 
