Let's walk through how to run a test case with the Herriot framework. Herriot works perfectly well against a real distributed HDFS cluster, but for convenience this example uses a pseudo-distributed HDFS cluster, i.e. a single machine running one namenode process and one datanode process, on which we run TestHL040, a test case shipped with Herriot (under the src/test/system/test directory of the hdfs project).
1. Download hadoop-0.21.0.tar.gz from the Hadoop community site and unpack it on a Linux machine, e.g. to /opt/hadoop/hadoop-0.21.0.
2. Change into /opt/hadoop/hadoop-0.21.0/hdfs and create a symbolic link to the lib directory:
ln -s ../lib lib
3. Edit /opt/hadoop/hadoop-0.21.0/hdfs/ivy/libraries.properties:
hadoop-common.version=0.21.0-SNAPSHOT
hadoop-hdfs.version=0.21.0-SNAPSHOT
4. Edit /opt/hadoop/hadoop-0.21.0/hdfs/build.xml:
file="${system-test-build-dir}/ivy/lib/${ant.project.name}/system/hadoop-common-${herriot.suffix}-${hadoop-common.version}.jar"
5. Run the following from /opt/hadoop/hadoop-0.21.0/hdfs/; a successful run produces the build-fi directory:
ant binary-system
6. Set the $JAVA_HOME environment variable and make sure it takes effect:
echo "export JAVA_HOME=/etc/alternatives/java_sdk" >> ~/.bashrc
source ~/.bashrc
7. Set the $HADOOP_HOME environment variable and make sure it takes effect:
export HADOOP_HOME=/opt/hadoop/hadoop-0.21.0/hdfs/build-fi/system/hadoop-hdfs-0.21.1-SNAPSHOT
8. Set the $HADOOP_CONF_DIR environment variable and make sure it takes effect:
export HADOOP_CONF_DIR=/opt/hadoop/hadoop-0.21.0/hdfs/build/test/conf
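Taken together, steps 6-8 amount to the following shell setup; the paths are the example locations used throughout this walkthrough, so adjust them to your own layout:

```shell
# Environment for the Herriot run; paths assume the /opt/hadoop/hadoop-0.21.0
# layout used in this example.
export JAVA_HOME=/etc/alternatives/java_sdk
export HADOOP_HOME=/opt/hadoop/hadoop-0.21.0/hdfs/build-fi/system/hadoop-hdfs-0.21.1-SNAPSHOT
export HADOOP_CONF_DIR=/opt/hadoop/hadoop-0.21.0/hdfs/build/test/conf

# Print the three variables so a typo is caught before the ant run.
env | grep -E '^(JAVA_HOME|HADOOP_HOME|HADOOP_CONF_DIR)=' | sort
```

Listing the variables back via `env` confirms they are exported (visible to child processes such as the ant build), not merely set in the current shell.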
9. Place an hdfs-site.xml file in $HADOOP_CONF_DIR; for a pseudo-distributed cluster it needs the following configuration:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9981</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:9982</value>
  </property>
</configuration>
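One convenient way to lay the file down is a heredoc. The sketch below writes the same three properties into a throwaway conf.demo directory (a stand-in for $HADOOP_CONF_DIR so the sketch is self-contained) and counts the properties as a quick sanity check:

```shell
# Write the pseudo-distributed hdfs-site.xml via a heredoc.
# conf.demo is a demo directory standing in for $HADOOP_CONF_DIR.
mkdir -p conf.demo
cat > conf.demo/hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9981</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.datanode.ipc.address</name>
    <value>0.0.0.0:9982</value>
  </property>
</configuration>
EOF
grep -c '<property>' conf.demo/hdfs-site.xml   # one line per property
```

The quoted 'EOF' delimiter keeps the shell from expanding anything inside the document, so the XML is written byte-for-byte as shown.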
10. Place a masters file in $HADOOP_CONF_DIR (leave it empty).
11. Place a slaves file in $HADOOP_CONF_DIR by running the following commands:
echo "localhost" > slaves
cp slaves slaves.copy
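Steps 10 and 11 can be done in one go, as sketched below; note that slaves.copy is the host file the later system-test.xml points at via test.system.hdrc.dn.hostfile. A demo directory stands in for $HADOOP_CONF_DIR so the sketch is self-contained:

```shell
# Create the host files Herriot expects; run the same commands inside
# $HADOOP_CONF_DIR on the real setup (conf.demo is a stand-in here).
mkdir -p conf.demo
: > conf.demo/masters                       # masters stays empty for the pseudo-cluster
echo "localhost" > conf.demo/slaves         # the single datanode host
cp conf.demo/slaves conf.demo/slaves.copy   # matched by test.system.hdrc.dn.hostfile
cat conf.demo/slaves.copy
```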
12. Place a system-test.xml file in $HADOOP_CONF_DIR with the following configuration:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- Mandatory properties that are to be set and uncommented before running the tests -->
  <property>
    <name>test.system.hdrc.hadoophome</name>
    <value>/opt/hadoop/hadoop-0.21.0/hdfs/build-fi/system/hadoop-hdfs-0.21.1-SNAPSHOT</value>
    <description>This is the path to the home directory of the hadoop deployment.
    </description>
  </property>
  <property>
    <name>test.system.hdrc.hadoopconfdir</name>
    <value>/opt/hadoop/hadoop-0.21.0/hdfs/build/test/conf</value>
    <description>This is the path to the configuration directory of the hadoop
    cluster that is deployed.
    </description>
  </property>
  <property>
    <name>test.system.hdrc.dn.hostfile</name>
    <value>slaves.copy</value>
    <description>File name containing the hostnames where the DataNodes are running.
    </description>
  </property>
  <property>
    <name>test.system.hdfs.clusterprocess.impl.class</name>
    <value>org.apache.hadoop.hdfs.test.system.HDFSCluster$HDFSProcessManager</value>
    <description>
      Cluster process manager for the Hdfs subsystem of the cluster. The value
      org.apache.hadoop.hdfs.test.system.HDFSCluster$MultiUserHDFSProcessManager can
      be used to enable multi user support.
    </description>
  </property>
  <property>
    <name>test.system.hdrc.deployed.scripts.dir</name>
    <value>./src/test/system/scripts</value>
    <description>
      This directory hosts the scripts in the deployed location where
      the system test client runs.
    </description>
  </property>
  <property>
    <name>test.system.hdrc.hadoopnewconfdir</name>
    <value>/opt/hadoop/hadoop-0.21.0/hdfs/build/test/newconf</value>
    <description>
      The directory to which the new config files are copied on all
      the clusters.
    </description>
  </property>
  <property>
    <name>test.system.hdrc.suspend.cmd</name>
    <value>kill -SIGSTOP</value>
    <description>
      Command for suspending the given process.
    </description>
  </property>
  <property>
    <name>test.system.hdrc.resume.cmd</name>
    <value>kill -SIGCONT</value>
    <description>
      Command for resuming the given suspended process.
    </description>
  </property>
</configuration>
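A mismatch between the test.system.hdrc.hadoophome value and $HADOOP_HOME is an easy mistake to make and leads to confusing failures, so a quick cross-check is worthwhile. The sketch below runs against a minimal stand-in file so it is self-contained; point FILE at "$HADOOP_CONF_DIR/system-test.xml" for the real check:

```shell
# Cross-check the hadoophome property against $HADOOP_HOME.
export HADOOP_HOME=/opt/hadoop/hadoop-0.21.0/hdfs/build-fi/system/hadoop-hdfs-0.21.1-SNAPSHOT
FILE=system-test.demo.xml   # stand-in; use "$HADOOP_CONF_DIR/system-test.xml" for real
cat > "$FILE" <<'EOF'
<configuration>
  <property>
    <name>test.system.hdrc.hadoophome</name>
    <value>/opt/hadoop/hadoop-0.21.0/hdfs/build-fi/system/hadoop-hdfs-0.21.1-SNAPSHOT</value>
  </property>
</configuration>
EOF
# On the line after the hadoophome <name>, field 3 (split on < and >)
# is the text inside the <value> element.
got=$(awk -F'[<>]' '/hadoophome/ {getline; print $3}' "$FILE")
[ "$got" = "$HADOOP_HOME" ] && echo "hadoophome matches HADOOP_HOME"
```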
13. Change into $HADOOP_HOME and start the pseudo-distributed HDFS cluster:
chmod +x bin/*
./bin/start-dfs.sh
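Before moving on, it is worth confirming that both daemons actually came up: `jps` should list a NameNode and a DataNode JVM. The check below runs against canned output so it is self-contained; on the live machine, set JPS_OUTPUT=$(jps) instead:

```shell
# Sanity check after start-dfs.sh: both HDFS daemons should appear in jps.
# Canned output stands in for a real `jps` run; on the live machine use:
#   JPS_OUTPUT=$(jps)
JPS_OUTPUT='12345 NameNode
12346 DataNode
12347 Jps'
if echo "$JPS_OUTPUT" | grep -q 'NameNode' && echo "$JPS_OUTPUT" | grep -q 'DataNode'; then
  echo "pseudo-distributed HDFS is up"
fi
```

If either daemon is missing, check the logs under $HADOOP_HOME before running the test suite.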
14. Return to /opt/hadoop/hadoop-0.21.0/hdfs and run the test case:
ant test-system -Dhadoop.conf.dir.deployed=$HADOOP_CONF_DIR