Installing and configuring Hadoop in the VM environment (1)
Preliminaries:
1. Update the source list: apt-get update
2. Install build-essential to obtain the basic development tools: apt-get install build-essential
3. Install the vim editor: apt-get install vim
4. Install wget to download the Hadoop package: apt-get install wget
Installing the JDK
Because Hadoop is written in Java, we must install the Java Development Kit (JDK) to support it.
1. Install the default version of Java: apt-get install default-jdk
2. Check the installation path of Java: dpkg -L openjdk-7-jdk | grep '/bin/javac'
3. Configure the JAVA_HOME environment variable:
vim ~/.bashrc
(To find the Java installation path, run: update-alternatives --config java)
4. Append the following to the end of ~/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
5. Execute ~/.bashrc to apply the changes: source ~/.bashrc
6. Check that the configuration is correct:
echo $JAVA_HOME
java -version
$JAVA_HOME/bin/java -version
If the output of java -version matches that of $JAVA_HOME/bin/java -version, the configuration is correct.
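The comparison in step 6 can be scripted. A minimal sketch (note that java -version prints to stderr, hence the 2>&1 redirection; the messages echoed here are illustrative, not Hadoop output):

```shell
# Compare the first version line of the PATH java and the JAVA_HOME java.
# java -version writes to stderr, so redirect it into the captured output.
a=$("$JAVA_HOME/bin/java" -version 2>&1 | head -n 1)
b=$(java -version 2>&1 | head -n 1)
if [ "$a" = "$b" ]; then
    echo "JAVA_HOME is configured correctly"
else
    echo "version mismatch: check JAVA_HOME"
fi
```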
Create Hadoop user
1.Add a group: addgroup hadoop
2.Add hadoop user: adduser --ingroup hadoop hduser
3.Add the administrator priority for hduser: adduser hduser sudo
4.Login hduser: su hduser
5.what follows is under hduser
Installing SSH
1. Install SSH: apt-get install ssh
2. Create and set up the SSH certificates:
ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
3. Check that SSH works: ssh localhost
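Step 3 can also be checked non-interactively, which is useful in scripts. With BatchMode, ssh fails instead of prompting for a password (these are standard OpenSSH client options; the echoed messages are illustrative):

```shell
# BatchMode=yes: never prompt for a password; fail instead.
# StrictHostKeyChecking=no: auto-accept the host key on first connect.
if ssh -o BatchMode=yes -o StrictHostKeyChecking=no -o ConnectTimeout=5 localhost true 2>/dev/null; then
    echo "passwordless ssh to localhost: OK"
else
    echo "passwordless ssh to localhost: NOT working"
fi
```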
Installing Hadoop
1. Download the Hadoop package:
wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
2. Extract the Hadoop package: tar xvzf hadoop-2.6.0.tar.gz
3. If /usr/local/hadoop does not exist yet, create it: sudo mkdir /usr/local/hadoop
4. Move the Hadoop installation to the /usr/local/hadoop directory and hand ownership to hduser:
cd hadoop-2.6.0
sudo mv * /usr/local/hadoop
sudo chown -R hduser:hadoop /usr/local/hadoop
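The order matters here: the target directory must exist before the mv. The sequence can be sketched with scratch directories standing in for the real paths (in the actual install, PREFIX is /usr/local/hadoop, SRC is the unpacked hadoop-2.6.0 folder, and the commands run with sudo):

```shell
# Scratch stand-ins for the real paths used above.
SRC=$(mktemp -d)            # stands in for the extracted hadoop-2.6.0 directory
PREFIX=$(mktemp -d)/hadoop  # stands in for /usr/local/hadoop
touch "$SRC/LICENSE.txt"    # pretend extracted content

mkdir -p "$PREFIX"          # 1. create the target directory first
mv "$SRC"/* "$PREFIX"/      # 2. then move the extracted files into it
ls "$PREFIX"                # prints LICENSE.txt
```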
Configuring Hadoop
A. The following files must be modified to complete the Hadoop setup:
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml.template
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
B. Modify hadoop-env.sh (around line 25):
vim /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
C. core-site.xml:
sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp
vim /usr/local/hadoop/etc/hadoop/core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
</configuration>
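A quick way to confirm the edit took effect is to grep the property back out of the file. A sketch using a scratch copy of the XML above (in the real setup the file is /usr/local/hadoop/etc/hadoop/core-site.xml):

```shell
# Write the XML above to a scratch file (stands in for the real core-site.xml).
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF

# Pull the configured HDFS URI back out of the file.
grep -A1 '<name>fs.default.name</name>' "$CONF" | grep -o 'hdfs://[^<]*'
# prints hdfs://localhost:54310
```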
D. mapred-site.xml
1. By default, the /usr/local/hadoop/etc/hadoop/ folder contains mapred-site.xml.template. We need to copy it to mapred-site.xml:
cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
2. Add the following to this file:
vim /usr/local/hadoop/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
</configuration>
E. hdfs-site.xml
1. This file specifies the directories used as the namenode and the datanode on this host. Before editing it, we need to create the two directories that will hold the namenode and datanode data for this Hadoop installation.
sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
sudo chown -R hduser:hadoop /usr/local/hadoop_store
2. Add the following to this file:
vim /usr/local/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
Formatting the new Hadoop file system
1. Format the Hadoop file system: hadoop namenode -format
2. If the format succeeds, a current folder is created inside the namenode folder: ls /usr/local/hadoop_store/hdfs/namenode
Controlling Hadoop
1. Start: cd /usr/local/hadoop/sbin; start-all.sh
2. Check the running daemons: jps
3. Stop: stop-all.sh
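With Hadoop 2.x, start-all.sh launches five daemons, each in its own JVM, so jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager. A sketch of an automated check (the echoed status strings are illustrative):

```shell
# Each Hadoop 2.x daemon appears as a separate JVM in the jps listing.
for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    if jps 2>/dev/null | grep -q "$d"; then
        echo "$d: running"
    else
        echo "$d: not running"
    fi
done
```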