Hadoop 3.3.1 Pseudo-Distributed Mode on Ubuntu 18.04

The operating system used in this tutorial is Ubuntu 18.04.6. The steps to install Ubuntu 18.04 itself are omitted.

1. Create user hadoop

Open a terminal and run the command below to create a new user:

sudo useradd -m hadoop -s /bin/bash

This command creates a login user named hadoop that uses /bin/bash as its shell.

Set a password for user hadoop:

sudo passwd hadoop

Give sudo permission to user hadoop:

sudo adduser hadoop sudo

Log out and log back in as hadoop (via the Ubuntu login screen) to carry out the steps below.
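
Alternatively, you can switch users directly in the current terminal instead of using the login screen:

su - hadoop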

Update the apt package index:

sudo apt-get update

Install vim:

sudo apt-get install vim

Install SSH and set up passwordless SSH login:

sudo apt-get install openssh-server

Log in to localhost (the first connection also creates the ~/.ssh directory):

ssh localhost

Exit the session:

exit

Generate a key pair and authorize it:

cd ~/.ssh/
ssh-keygen -t rsa              # press Enter at each prompt to accept the defaults
cat ./id_rsa.pub >> ./authorized_keys
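
To verify that passwordless login now works (an optional check; if ssh still asks for a password, tightening the permissions on the key file usually fixes it):

chmod 600 ~/.ssh/authorized_keys   # only needed if ssh still prompts for a password
ssh localhost                      # should log in without a password prompt
exit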

2. Install Java

sudo apt-get install openjdk-8-jre openjdk-8-jdk

Set up environment variables:

cd ~
vim ~/.bashrc

Add the lines below to it:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

Save the changes, exit vim, and reload .bashrc:

source ~/.bashrc

Check that Java was installed successfully:

java -version

If version information is printed, the installation was successful.
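
If you are unsure of the exact JDK path used for JAVA_HOME above (it can vary; /usr/lib/jvm/java-8-openjdk-amd64 is the default for the openjdk-8 packages on amd64 Ubuntu), a quick check:

ls /usr/lib/jvm/               # java-8-openjdk-amd64 should be listed
echo $JAVA_HOME                # after source ~/.bashrc, prints the JDK path
$JAVA_HOME/bin/java -version   # should match the output of java -version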

3. Install Hadoop 3.3.1

Change to the installation directory:

cd /usr/local

Download Hadoop 3.3.1:

sudo wget http://dlcdn.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz

Extract the archive, rename the directory, and hand ownership to user hadoop (sudo is needed here because /usr/local is owned by root):

sudo tar -xzf hadoop-3.3.1.tar.gz
sudo mv hadoop-3.3.1 hadoop
sudo chown -R hadoop:hadoop ./hadoop
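
To confirm the unpacked distribution runs (optional; PATH is not set until step 4, so use the full path):

/usr/local/hadoop/bin/hadoop version   # should report Hadoop 3.3.1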

4. Configure pseudo-distributed mode

cd ~
vim ~/.bashrc

Edit .bashrc so it contains the lines below (the Java entries from step 2 plus the new Hadoop entries):

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:${HADOOP_HOME}/bin
export PATH=$PATH:${HADOOP_HOME}/sbin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}

Reload .bashrc to activate the new environment variables:

source ~/.bashrc
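
With HADOOP_HOME on the PATH, the hadoop command should now work from any directory (a quick check):

echo $HADOOP_HOME   # should print /usr/local/hadoop
hadoop version      # should report Hadoop 3.3.1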

Configure the following files in /usr/local/hadoop/etc/hadoop/:

  • hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
  • core-site.xml
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
  • hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/tmp/dfs/data</value>
    </property>
</configuration>
  • mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=/usr/local/hadoop</value>
    </property>
</configuration>
  • yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
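
To confirm Hadoop picks up these settings (an optional check using the getconf tool):

hdfs getconf -confKey fs.defaultFS      # should print hdfs://localhost:9000
hdfs getconf -confKey dfs.replication   # should print 1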

Format the NameNode:

hdfs namenode -format

Start HDFS:

start-dfs.sh

Start YARN:

start-yarn.sh

To verify that HDFS and YARN started successfully:

jps

If the output looks like the example below (all daemons should show up), the previous steps were successful:

$ jps
2961 ResourceManager
2482 DataNode
3077 NodeManager
2366 NameNode
2686 SecondaryNameNode
3199 Jps
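
You can also ask HDFS itself for a status report (optional, useful if the jps output is unclear):

hdfs dfsadmin -report   # should report one live datanode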

If the NameNode does not show up, you need to re-format it. Make sure there are no important files in HDFS first, because everything will be erased:

cd /usr/local/hadoop
./sbin/stop-dfs.sh   # stop HDFS
rm -r ./tmp     # note: this will clear all files in HDFS
./bin/hdfs namenode -format # format namenode  
./sbin/start-dfs.sh  # restart

5. Job Test

Run one of the bundled example jobs:

cd /usr/local/hadoop/share/hadoop/mapreduce/
hadoop jar ./hadoop-mapreduce-examples-3.3.1.jar pi 5 10
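
As a further smoke test, you can run the classic wordcount example from the same jar (a sketch; the HDFS paths and sample input below are just one possible choice, using the Hadoop config files as input text):

hdfs dfs -mkdir -p /user/hadoop/input                                      # create home and input dirs in HDFS
hdfs dfs -put /usr/local/hadoop/etc/hadoop/*.xml input                     # upload sample input
hadoop jar ./hadoop-mapreduce-examples-3.3.1.jar wordcount input output   # output dir must not exist yet
hdfs dfs -cat output/part-r-00000                                          # inspect the word counts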

Open a browser to view the web UI pages:

localhost:9870   (NameNode web UI)
localhost:8088   (YARN ResourceManager web UI)

You can also access the web pages from another PC. First, find the IP address of the Ubuntu machine:

ifconfig -a
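
If ifconfig is not available (it is provided by the net-tools package, which is not always installed), the iproute2 equivalent works as well:

ip addr show   # or simply: hostname -I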

Either command prints the machine's IP address. You can then open the web pages in another PC's browser:

${IP ADDRESS}:9870
${IP ADDRESS}:8088

Note: the MapReduce JobHistory server web UI uses port 19888. It is not launched by start-dfs.sh or start-yarn.sh, so start it separately if you need it.
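
To start the history server (the daemon command below is the standard Hadoop 3.x form):

mapred --daemon start historyserver   # web UI then appears at ${IP ADDRESS}:19888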

6. Shut down

Stop the daemons separately:

stop-dfs.sh
stop-yarn.sh

Or stop them together:

stop-all.sh
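
To confirm everything has shut down (optional), run jps again; only the Jps process itself should remain:

jps   # should list only Jps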