Install hadoop3.* on ubuntu (simplified version)
We assume that you already have Java and SSH installed.
Download
We can find hadoop here
And download the hadoop3 here and extract itChoosing a place for hadoop
For me I just placed it in ~/Documents/hadoop/ and that’s it.
Configuration
$ gedit hadoop-env.sh
find the line export JAVA_HOME= and complete it with the path stored in $JAVA_HOME (you can locate it with readlink -f $(which java), dropping the trailing /bin/java).
$ gedit core-site.xml
and add the following code between the configuration tags
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>$ gedit mapred-site.xml
(on hadoop3.* the file ships as mapred-site.xml; only older versions had a mapred-site.xml.template)
and add the following code between the configuration tags
<property>
<name>mapred.job.tracker</name>
<value>hdfs://localhost:9001</value>
</property>$ gedit hdfs-site.xml
and add the following code between the configuration tags
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/konroy/Documents/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/konroy/Documents/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<!-- single-node setup: only one datanode, so keep replication at 1 -->
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>localhost:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
- done!
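You can also create the NameNode and DataNode directories referenced in hdfs-site.xml by hand before formatting. A minimal sketch; it uses $HOME instead of the literal /home/konroy path, so adjust it to your own layout:

```shell
# Base path matching the dfs.namenode.name.dir / dfs.datanode.data.dir values above.
# Hadoop can create these itself on format/startup, but making them up front
# avoids permission surprises.
HDFS_BASE="$HOME/Documents/hadoop/hdfs"
mkdir -p "$HDFS_BASE/name" "$HDFS_BASE/data"
ls "$HDFS_BASE"
```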
Try out the ssh
Test the ssh with
$ ssh localhost
and then generate the ssh-key with
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
append it to authorized_keys with
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
and change the mode with
$ chmod 0600 ~/.ssh/authorized_keys
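The SSH steps above can be combined into one copy-paste sketch; the missing-key check and the grep guard are additions here, so re-running it neither overwrites an existing key pair nor appends the key to authorized_keys twice:

```shell
mkdir -p ~/.ssh && chmod 700 ~/.ssh
# generate a key only if none exists yet
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# append the public key only if it is not already authorized
touch ~/.ssh/authorized_keys
grep -qF "$(cat ~/.ssh/id_rsa.pub)" ~/.ssh/authorized_keys || cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```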
Run hadoop
- Enter the hadoop directory and
bin/hdfs namenode -format
sbin/start-dfs.sh
- check it with
jps
and normally you will see 4 items: NameNode; DataNode; SecondaryNameNode; Jps
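That check can be scripted. In this sketch the jps output is hard-coded for illustration (the PIDs are made up); on a live machine set JPS_OUT="$(jps)" instead:

```shell
# Fake jps output for illustration; on a real cluster use: JPS_OUT="$(jps)"
JPS_OUT="2101 NameNode
2293 DataNode
2507 SecondaryNameNode
2764 Jps"
# verify that each HDFS daemon shows up in the listing
for d in NameNode DataNode SecondaryNameNode; do
  if echo "$JPS_OUT" | grep -q "$d"; then echo "$d running"; else echo "$d MISSING"; fi
done
```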
- to stop hadoop enter
sbin/stop-dfs.sh
Some differences
We used to browse the Hadoop web UI on port 50070 with older versions. On hadoop3.*, however, the port has changed to 9870, so the NameNode page is now at http://localhost:9870
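A quick way to check whether the new port answers, assuming curl is installed: this prints the HTTP status code, 200 once the NameNode UI is up, or 000 if nothing is listening yet:

```shell
# 200 = NameNode web UI reachable; 000 = nothing listening on that port
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:9870 || true
```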