Version: hadoop-1.0.3
1. Download the release (this guide uses hadoop-1.0.3)
http://www.apache.org/dyn/closer.cgi/hadoop/common/
2. Extract the downloaded archive.
3. Configuration
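A minimal sketch of the extraction step, assuming the download is a .tar.gz archive. The function name, archive path and target directory are placeholders chosen for illustration, not part of the original steps:

```shell
# Extract a Hadoop release tarball into a target directory.
# Both arguments are illustrative; adjust them to the actual download.
extract_hadoop() {
  archive="$1"   # e.g. ~/Downloads/hadoop-1.0.3.tar.gz (assumed file name)
  dest="$2"      # e.g. ~/Hadoop
  mkdir -p "$dest" || return 1
  tar -xzf "$archive" -C "$dest"
}
```

For example, `extract_hadoop ~/Downloads/hadoop-1.0.3.tar.gz ~/Hadoop` would be expected to leave the installation under ~/Hadoop/hadoop-1.0.3, which matches the path used for hadoop.tmp.dir in step 3.1.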
3.1) configure core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://[hostname]:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/[username]/Hadoop/hadoop-1.0.3/tmp</value>
</property>
</configuration>
If hadoop.tmp.dir is not set, it defaults to /tmp/hadoop-${user.name}. Many systems clear /tmp on reboot, so the HDFS data would be lost and the file system would have to be reformatted after every restart.
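The core-site.xml above can also be generated from the shell, which makes it easy to create the tmp directory at the same time. This is only a sketch: the function name and its three arguments are hypothetical, and the values passed in stand for [hostname] and the tmp path shown above.

```shell
# Write a minimal core-site.xml and create the tmp directory it points to.
# Arguments (all illustrative): conf directory, hostname, tmp directory.
write_core_site() {
  conf_dir="$1"; host="$2"; tmp_dir="$3"
  mkdir -p "$tmp_dir" || return 1
  cat > "$conf_dir/core-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$host:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>$tmp_dir</value>
  </property>
</configuration>
EOF
}
```

A call such as `write_core_site conf "$(hostname)" "$HOME/Hadoop/hadoop-1.0.3/tmp"`, run from the Hadoop directory, would overwrite conf/core-site.xml, so back up any existing file first.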
3.2) hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
3.3) mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>[hostname]:9001</value>
</property>
</configuration>
3.4) hadoop-env.sh
export JAVA_HOME=/opt/java/jdk1.7.0_51
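Hadoop will not start if JAVA_HOME points to the wrong place, so it can help to check the path before going on. The helper below is a hypothetical sketch, not part of Hadoop; it only tests that an executable bin/java exists under the given directory.

```shell
# Check that a candidate JAVA_HOME contains an executable bin/java.
check_java_home() {
  java_home="$1"
  if [ -x "$java_home/bin/java" ]; then
    echo "ok: $java_home/bin/java"
  else
    echo "missing: $java_home/bin/java" >&2
    return 1
  fi
}
```

With the value used above, the call would be `check_java_home /opt/java/jdk1.7.0_51`.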
3.5) masters and slaves
Because this is a pseudo-distributed setup, replace localhost with [hostname] in both conf/masters and conf/slaves.
4. Format the new distributed file system
In a terminal, run the following command:
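For a single-node setup both files end up containing just one line, the hostname. A small sketch of that edit, where the function name and arguments are placeholders for illustration:

```shell
# Write the same hostname into the masters and slaves files.
# Arguments (illustrative): conf directory, hostname.
set_single_node_hosts() {
  conf_dir="$1"; host="$2"
  for f in masters slaves; do
    printf '%s\n' "$host" > "$conf_dir/$f" || return 1
  done
}
```

Run from the Hadoop directory, `set_single_node_hosts conf "$(hostname)"` would replace the contents of both files.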
$ bin/hadoop namenode -format
5. Set up passphraseless ssh
Check whether you can ssh to your hostname without a passphrase:
$ ssh [hostname]
If you cannot ssh to [hostname] without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
Possible problem:
ssh: connect to host localhost port 22: Connection refused
This usually means that ssh or sshd is not installed, or that a firewall is blocking port 22.
Install ssh:
$ sudo apt-get install ssh
Install and start sshd:
$ sudo apt-get install openssh-server
$ sudo service ssh start
Disable the firewall:
$ sudo ufw disable
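If ssh still asks for a password after the key has been appended, overly permissive file modes are another common cause: with sshd's default StrictModes setting, keys kept in a group- or world-writable location are ignored. The helper below is a sketch with a hypothetical name; it tightens the modes on the ssh directory (by default ~/.ssh) and its authorized_keys file.

```shell
# Restrict the ssh directory and authorized_keys to the owner only.
# Optional argument: the ssh directory (defaults to ~/.ssh).
fix_ssh_perms() {
  ssh_dir="${1:-$HOME/.ssh}"
  chmod 700 "$ssh_dir" && chmod 600 "$ssh_dir/authorized_keys"
}
```

After running `fix_ssh_perms`, a command such as `ssh -o BatchMode=yes [hostname] true` can be used to test the login; with BatchMode enabled, ssh fails instead of prompting when key authentication does not work.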
6. Start Hadoop
$ bin/start-all.sh
7. Verify
1) Check from the terminal:
In the terminal, run:
$ jps
The output should list JobTracker, NameNode, DataNode, SecondaryNameNode and TaskTracker, which means all five daemons have started.
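The same check can be scripted. The function below is a hypothetical helper, not part of Hadoop: it reads jps output on standard input and reports any of the five daemons that is not listed.

```shell
# Read `jps` output on stdin and print the expected Hadoop 1.x daemons
# that are not listed. Returns 0 only when all five are present.
missing_daemons() {
  jps_output=$(cat)
  missing=""
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    printf '%s\n' "$jps_output" | grep -qw "$d" || missing="$missing $d"
  done
  [ -z "$missing" ] && return 0
  echo "not running:$missing"
  return 1
}
```

Typical use is `jps | missing_daemons`. If a daemon is reported as not running, its log file under the logs directory of the Hadoop installation is usually the first place to look.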
2) Check from the webpage:
NameNode:
http://[hostname]:50070/
JobTracker:
http://[hostname]:50030/