Following "Hadoop: The Definitive Guide", I wrote a small MapFile program:
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class MapFileTest {
    // Sample values, reused in rotation for every key.
    private static final String[] data = {
        "one,two,three",
        "four,five,six",
        "seven,eight,nine"
    };

    public static void main(String[] args) throws IOException {
        String uri = args[0];  // directory in which the MapFile is created
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        IntWritable key = new IntWritable();
        Text value = new Text();
        MapFile.Writer writer = null;
        try {
            writer = new MapFile.Writer(conf, fs, uri, key.getClass(), value.getClass());
            // MapFile requires keys to be appended in ascending order.
            for (int i = 0; i < 32; i++) {
                key.set(i + 1);
                value.set(data[i % data.length]);
                writer.append(key, value);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Close the writer even if an append failed.
            IOUtils.closeStream(writer);
        }
    }
}
On the Hadoop machine, I compiled it with:
javac -cp /opt/hadoop/hadoop-core-1.2.1.jar MapFileTest.java
Then I ran it:
hadoop MapFileTest /outputDir
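The run failed with an error saying the class MapFileTest could not be found.
To see why, I opened bin/hadoop and echoed the final exec statement, which printed the following command: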
/usr/java/jdk1.8.0_05/bin/java -Dproc_MapFileTest -Xmx1000m -Dhadoop.log.dir=/opt/hadoop/libexec/../logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/libexec/.. -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console -Dhadoop.security.logger=INFO,NullAppender -Djava.library.path=/opt/hadoop/libexec/../lib/native/Linux-i386-32 -Dhadoop.policy.file=hadoop-policy.xml -classpath /opt/hadoop/libexec/../conf:/usr/java/jdk1.8.0_05/lib/tools.jar:/opt/hadoop/libexec/..:/opt/hadoop/libexec/../hadoop-core-1.0.3.jar:/opt/hadoop/libexec/../lib/asm-3.2.jar:/opt/hadoop/libexec/../lib/aspectjrt-1.6.5.jar:/opt/hadoop/libexec/../lib/aspectjtools-1.6.5.jar:/opt/hadoop/libexec/../lib/commons-beanutils-1.7.0.jar:/opt/hadoop/libexec/../lib/commons-beanutils-core-1.8.0.jar:/opt/hadoop/libexec/../lib/commons-cli-1.2.jar:/opt/hadoop/libexec/../lib/commons-codec-1.4.jar:/opt/hadoop/libexec/../lib/commons-collections-3.2.1.jar:/opt/hadoop/libexec/../lib/commons-configuration-1.6.jar:/opt/hadoop/libexec/../lib/commons-daemon-1.0.1.jar:/opt/hadoop/libexec/../lib/commons-digester-1.8.jar:/opt/hadoop/libexec/../lib/commons-el-1.0.jar:/opt/hadoop/libexec/../lib/commons-httpclient-3.0.1.jar:/opt/hadoop/libexec/../lib/commons-io-2.1.jar:/opt/hadoop/libexec/../lib/commons-lang-2.4.jar:/opt/hadoop/libexec/../lib/commons-logging-1.1.1.jar:/opt/hadoop/libexec/../lib/commons-logging-api-1.0.4.jar:/opt/hadoop/libexec/../lib/commons-math-2.1.jar:/opt/hadoop/libexec/../lib/commons-net-1.4.1.jar:/opt/hadoop/libexec/../lib/core-3.1.1.jar:/opt/hadoop/libexec/../lib/hadoop-capacity-scheduler-1.0.3.jar:/opt/hadoop/libexec/../lib/hadoop-fairscheduler-1.0.3.jar:/opt/hadoop/libexec/../lib/hadoop-thriftfs-1.0.3.jar:/opt/hadoop/libexec/../lib/hsqldb-1.8.0.10.jar:/opt/hadoop/libexec/../lib/jackson-core-asl-1.8.8.jar:/opt/hadoop/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/opt/hadoop/libexec/../lib/jasper-compiler-5.5.12.jar:/opt/hadoop/libexec/../lib/jasper-runtime-5.5.12.jar:/opt/hadoop/libexec/../lib/jdeb-0.8.jar:/opt/hadoop/libexec/../lib/jersey-core-1.8.jar:/opt/hadoop/libexec/../lib/jersey-json-1.8.jar:/opt/hadoop/libexec/../lib/jersey-server-1.8.jar:/opt/hadoop/libexec/../lib/jets3t-0.6.1.jar:/opt/hadoop/libexec/../lib/jetty-6.1.26.jar:/opt/hadoop/libexec/../lib/jetty-util-6.1.26.jar:/opt/hadoop/libexec/../lib/jsch-0.1.42.jar:/opt/hadoop/libexec/../lib/junit-4.5.jar:/opt/hadoop/libexec/../lib/kfs-0.2.2.jar:/opt/hadoop/libexec/../lib/log4j-1.2.15.jar:/opt/hadoop/libexec/../lib/mockito-all-1.8.5.jar:/opt/hadoop/libexec/../lib/oro-2.0.8.jar:/opt/hadoop/libexec/../lib/servlet-api-2.5-20081211.jar:/opt/hadoop/libexec/../lib/slf4j-api-1.4.3.jar:/opt/hadoop/libexec/../lib/slf4j-log4j12-1.4.3.jar:/opt/hadoop/libexec/../lib/xmlenc-0.52.jar:/opt/hadoop/libexec/../lib/jsp-2.1/jsp-2.1.jar:/opt/hadoop/libexec/../lib/jsp-2.1/jsp-api-2.1.jar MapFileTest /outputDir
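Clearly, the current directory (where MapFileTest.class was compiled) is missing from the -classpath list. The comments at the top of the script describe an environment variable for exactly this purpose: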
HADOOP_CLASSPATH Extra Java CLASSPATH entries.
So I opened conf/hadoop-env.sh and changed
#export HADOOP_CLASSPATH=
to:
export HADOOP_CLASSPATH=.
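That is, the variable now holds the current directory instead of being left unset, so bin/hadoop appends "." to the classpath it builds.
Running the program again works.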
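To double-check a classpath like the one echoed above without reading it by eye, a small helper can report whether a given directory is covered by it. The following is only a minimal sketch using the standard Java library; the class name ClasspathCheck and the method containsDir are hypothetical names introduced here, not part of Hadoop, and the comparison is purely syntactic (it collapses "." and ".." but does not follow symbolic links).

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Hadoop): checks whether a directory
// appears as an entry in a classpath string.
class ClasspathCheck {

    // Returns true if some entry of the classpath, once made absolute and
    // normalized, equals the given directory (also absolute and normalized).
    static boolean containsDir(String classpath, Path dir) {
        Path target = dir.toAbsolutePath().normalize();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty entries rather than guess their meaning
            }
            if (Paths.get(entry).toAbsolutePath().normalize().equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The classpath the running JVM was actually started with.
        String cp = System.getProperty("java.class.path");
        System.out.println("current directory on classpath: "
                + containsDir(cp, Paths.get(".")));
    }
}
```

One way to use it, assuming the compiled class sits in the working directory, is to run it through the same launcher (hadoop ClasspathCheck) before and after setting HADOOP_CLASSPATH and compare the two results. Note that the helper itself can only be loaded once its own directory is on the classpath, so the "before" run is expected to fail in the same way MapFileTest did.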