1. Setting up a Hadoop pseudo-distributed system on Ubuntu
- JDK installation
- Download the JDK: http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
- Extract the archive and rename the directory
sudo mkdir /usr/lib/jvm
sudo tar zxvf jdk-7u80-linux-x64.tar.gz -C /usr/lib/jvm
cd /usr/lib/jvm
sudo mv jdk1.7.0_80 java
- Add environment variables. Open the shell startup file with Sublime Text: $ subl ~/.bashrc (sudo is unnecessary for your own ~/.bashrc), and append the following:
export JAVA_HOME=/usr/lib/jvm/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
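A quick, self-contained sanity check of how these variables expand (same paths as above; in a real session, run source ~/.bashrc first so the file is re-read):

```shell
# Re-declare the variables from ~/.bashrc to check their expansion
export JAVA_HOME=/usr/lib/jvm/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
# JAVA_HOME should point at the renamed JDK directory
echo "$JAVA_HOME"    # → /usr/lib/jvm/java
echo "$JRE_HOME"     # → /usr/lib/jvm/java/jre
```

Once the JDK is actually unpacked there, `java` will resolve from $JAVA_HOME/bin.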
- Configure the default JDK version
sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/java/bin/java 300
sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/java/bin/javac 300
sudo update-alternatives --install /usr/bin/jar jar /usr/lib/jvm/java/bin/jar 300
sudo update-alternatives --install /usr/bin/javah javah /usr/lib/jvm/java/bin/javah 300
sudo update-alternatives --install /usr/bin/javap javap /usr/lib/jvm/java/bin/javap 300
sudo update-alternatives --config java
- The system will prompt you to choose the default Java version
- Test: $ java -version
- Hadoop installation
- Download from the official site: http://hadoop.apache.org/releases.html; the binary release is sufficient, building from source may be tried later
- Extract Hadoop to the target directory:
tar -zxvf hadoop-2.6.2.tar.gz -C /home/workspace
- Edit etc/hadoop/core-site.xml under the Hadoop installation directory, adding the following between the <configuration></configuration> tags (fs.default.name is deprecated in 2.x in favor of fs.defaultFS, but still works):
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/zs/workspace/hadoop-2.6.2/tmp</value>
</property>
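A side note, not part of the original steps: in Hadoop 2.x the dfs.replication setting conventionally lives in etc/hadoop/hdfs-site.xml rather than core-site.xml (core-site.xml is read by all daemons, so the setup above still works). The conventional placement would be:

```xml
<!-- etc/hadoop/hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```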
- Edit etc/hadoop/mapred-site.xml (in 2.x, create it from mapred-site.xml.template if it does not exist) and add the following property:
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
</property>
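Worth noting: mapred.job.tracker is a Hadoop 1.x (JobTracker) property. Since the jps list in the next step includes ResourceManager and NodeManager, MapReduce here actually runs on YARN, for which the usual 2.x setting is:

```xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```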
- Format the Hadoop file system and start Hadoop, then check the running daemons with jps (expect NameNode, SecondaryNameNode, DataNode, NodeManager, ResourceManager)
bin/hdfs namenode -format
sbin/start-all.sh
- Running the wordcount example
- Create local input files
mkdir ~/files
cd ~/files
echo "hello hadoop java" > file1.txt
echo "hello hadoop zs" > file2.txt
- Upload the local files to HDFS as job input
bin/hadoop fs -mkdir /input
bin/hadoop fs -put ~/files/file*.txt /input
- Run wordcount and view the result
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.2.jar wordcount /input/ /output/
bin/hdfs dfs -cat /output/*
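As a cross-check, the same counts can be reproduced locally with standard Unix tools (the /tmp/wc_demo path is just an illustration); this is exactly what wordcount computes for the two files above:

```shell
# Recreate the two input files locally
mkdir -p /tmp/wc_demo
echo "hello hadoop java" > /tmp/wc_demo/file1.txt
echo "hello hadoop zs"  > /tmp/wc_demo/file2.txt
# One word per line, then count identical words
cat /tmp/wc_demo/file*.txt | tr ' ' '\n' | sort | uniq -c
# → 2 hadoop, 2 hello, 1 java, 1 zs
```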