Building a Fully Distributed Hadoop Cluster with Docker

I. Environment

1. Linux

[root@localhost docker-hadoop]# uname -a
Linux localhost.localdomain 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost docker-hadoop]# cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core) 

2. Docker

[root@localhost docker-hadoop]# docker version
Client:
 Version:           18.09.4
 API version:       1.39
 Go version:        go1.10.8
 Git commit:        d14af54266
 Built:             Wed Mar 27 18:34:51 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.4
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.8
  Git commit:       d14af54
  Built:            Wed Mar 27 18:04:46 2019
  OS/Arch:          linux/amd64
  Experimental:     false
[root@localhost docker-hadoop]# 

3. Java

java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)

4. Hadoop

[root@0a360e41e726 /]# hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
[root@0a360e41e726 /]# 

II. Machine Plan

Three machines, one master and two slaves:

Hostname: hadoop2, IP address: 172.19.0.2 (master)
Hostname: hadoop3, IP address: 172.19.0.3 (slave)
Hostname: hadoop4, IP address: 172.19.0.4 (slave)

III. Building the Images

1. Build the centos-ssh image

Note: Docker Hub already has an image with the SSH service installed (komukomo/centos-sshd), so I pull it directly instead of building my own.

Run a container:

docker run -itd --name centos-ssh komukomo/centos-sshd /bin/bash

Start the SSH service:

[root@e75b27396db3 /]# /usr/sbin/sshd 
[root@e75b27396db3 /]# 

Check that the SSH service is running:

[root@e75b27396db3 /]# netstat -antp | grep sshd
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      19/sshd             
tcp        0      0 :::22                       :::*                        LISTEN      19/sshd             
[root@e75b27396db3 /]# 

Set up passwordless login:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Verify that passwordless login works (the root password is root):

[root@e75b27396db3 /]# ssh root@localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is e5:ab:55:1b:73:c4:51:33:c6:3b:45:a0:b2:34:e7:74.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
[root@e75b27396db3 ~]# ssh root@localhost
Last login: Fri Apr 12 07:20:00 2019 from localhost
[root@e75b27396db3 ~]# exit
logout
Connection to localhost closed.
[root@e75b27396db3 ~]# 
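Note: to avoid the interactive yes/no prompt the first time each node connects to a new host, you can relax strict host key checking for the cluster hosts. A minimal sketch (host names follow the plan in section II; adjust to taste):

cat >> ~/.ssh/config <<'EOF'
Host localhost hadoop2 hadoop3 hadoop4
    StrictHostKeyChecking no
EOF
chmod 600 ~/.ssh/config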

Commit the container as a new centos-ssh image:

[root@localhost ~]# docker commit centos-ssh centos-ssh
sha256:97ef260595ae36d81c9f26b6ed0ed5d13502b7699e079554928ec8cc6fc1b159
[root@localhost ~]# docker images
REPOSITORY             TAG                 IMAGE ID            CREATED             SIZE
centos-ssh             latest              97ef260595ae        4 seconds ago       410MB
centos7-ssh            latest              4e4796f7e8ef        About an hour ago   289MB
<none>                 <none>              e08ee32cfd93        About an hour ago   289MB
<none>                 <none>              3cff40060339        About an hour ago   289MB
centos-tools           latest              bb563754f296        4 hours ago         391MB
jquery134/mycentos     v1.0                8c63d14863d3        4 days ago          354MB
tomcat                 latest              f1332ae3f570        13 days ago         463MB
nginx                  latest              2bcb04bdb83f        2 weeks ago         109MB
centos                 latest              9f38484d220f        4 weeks ago         202MB
ubuntu                 latest              94e814e2efa8        4 weeks ago         88.9MB
jdeathe/centos-ssh     latest              f68976440f24        6 weeks ago         226MB
komukomo/centos-sshd   latest              d969d0bdc7ac        2 years ago         289MB
[root@localhost ~]# 

2. Build the hadoop image from centos-ssh

Directory layout on the host at build time (the build context holds the Dockerfile plus the two tarballs it references):

Hadoop/
├── Dockerfile
├── hadoop-2.7.3.tar.gz
└── jdk-8u101-linux-x64.tar.gz

Contents of the Dockerfile:

FROM centos-ssh
# ADD auto-extracts tar.gz archives into the target directory
ADD jdk-8u101-linux-x64.tar.gz /usr/local/
RUN mv /usr/local/jdk1.8.0_101 /usr/local/jdk1.8
ENV JAVA_HOME /usr/local/jdk1.8
ENV PATH $JAVA_HOME/bin:$PATH

ADD hadoop-2.7.3.tar.gz /usr/local
RUN mv /usr/local/hadoop-2.7.3 /usr/local/hadoop
ENV HADOOP_HOME /usr/local/hadoop
ENV PATH $HADOOP_HOME/bin:$PATH

# which and sudo are needed by Hadoop's control scripts
RUN yum install -y which sudo

Build the hadoop image:

[root@localhost Hadoop]# docker build -t="hadoop" .
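Before creating containers, it is worth a quick sanity check that the JDK and Hadoop actually landed in the image. Assuming the image tag hadoop from the build above:

docker run --rm hadoop java -version
docker run --rm hadoop hadoop version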

3. Create three containers from the hadoop image and start SSH in each

Create a custom network:

[root@localhost Hadoop]# docker network create --subnet=172.19.0.0/16 mynetwork
522bc0ed2d6048e5f303245d0c85ae36e62d0735f1d2e9ca5c73a11f103c1954
[root@localhost Hadoop]# docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
76cf156331a1        bridge              bridge              local
2059529b97fd        bridge1             bridge              local
2c43b9b438d5        host                host                local
522bc0ed2d60        mynetwork           bridge              local
6c75caf7d102        none                null                local
[root@localhost Hadoop]# 
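You can confirm the subnet and gateway of the new network before attaching containers to it:

docker network inspect mynetwork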

Run the containers (the image is the hadoop image built above; -itd already detaches, so no separate -d is needed):

[root@localhost Hadoop]# docker run -itd --name hadoop2 --net mynetwork --ip 172.19.0.2 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -p 8088:8088 -p 9000:9000 -p 50070:50070 -p 9001:9001 -p 8030:8030 -p 8031:8031 -p 8032:8032 -p 8033:8033 -p 10020:10020 -p 19888:19888 hadoop /bin/bash
10c1a242c22efd92d8f9007f4f51f5ff6c9e4511daa6d5fd29152ab1ac43c0e5
[root@localhost Hadoop]# docker run -itd --name hadoop3 --net mynetwork --ip 172.19.0.3 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -P hadoop /bin/bash
8276aa51a9584ba23aab9cbcc069a157ea34f95cb21eba67189f1bc7347cca81
[root@localhost Hadoop]# docker run -itd --name hadoop4 --net mynetwork --ip 172.19.0.4 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -P hadoop /bin/bash
ea17f5a50d5a1c5e2effe26c84e93387440debb91316026a9c7f5dc3700cca56
[root@localhost Hadoop]# 

Start the SSH service in each of the three containers:

[root@localhost Hadoop]# docker exec -d hadoop2 /usr/sbin/sshd
[root@localhost Hadoop]# docker exec -d hadoop3 /usr/sbin/sshd
[root@localhost Hadoop]# docker exec -d hadoop4 /usr/sbin/sshd

Verify the environment inside the containers: Java, Hadoop, SSH, network connectivity, and passwordless login:

[root@10c1a242c22e /]# java -version
java version "1.8.0_101"
Java(TM) SE Runtime Environment (build 1.8.0_101-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
[root@10c1a242c22e /]# javac -version
javac 1.8.0_101
[root@10c1a242c22e /]# ssh root@172.19.0.3
Last login: Fri Apr 12 08:07:46 2019 from hadoop2
[root@8276aa51a958 ~]# exit;
logout
Connection to 172.19.0.3 closed.
[root@10c1a242c22e /]# hadoop version
Hadoop 2.7.3
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff
Compiled by root on 2016-08-18T01:41Z
Compiled with protoc 2.5.0
From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-2.7.3.jar
[root@10c1a242c22e /]# ping hadoop3
PING hadoop3 (172.19.0.3) 56(84) bytes of data.
64 bytes from hadoop3 (172.19.0.3): icmp_seq=1 ttl=64 time=0.248 ms
64 bytes from hadoop3 (172.19.0.3): icmp_seq=2 ttl=64 time=0.145 ms
^C
--- hadoop3 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1936ms
rtt min/avg/max/mdev = 0.145/0.196/0.248/0.053 ms
[root@10c1a242c22e /]# ping hadoop4
PING hadoop4 (172.19.0.4) 56(84) bytes of data.
64 bytes from hadoop4 (172.19.0.4): icmp_seq=1 ttl=64 time=0.233 ms
64 bytes from hadoop4 (172.19.0.4): icmp_seq=2 ttl=64 time=0.095 ms
^C
--- hadoop4 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1754ms
rtt min/avg/max/mdev = 0.095/0.164/0.233/0.069 ms

4. Configure Hadoop

In /usr/local/hadoop/etc/hadoop/hadoop-env.sh, set JAVA_HOME:

export JAVA_HOME=/usr/local/jdk1.8
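Rather than editing the file by hand inside each container, the change can be applied with sed; a sketch assuming the stock hadoop-env.sh, whose JAVA_HOME line starts with "export JAVA_HOME=":

sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/jdk1.8|' /usr/local/hadoop/etc/hadoop/hadoop-env.sh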

 

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://hadoop2/</value>
	</property>
	<property>
		<name>io.file.buffer.size</name>
		<value>131072</value>
	</property>
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/home/hadoop/tmp</value>
		<description>Abase for other temporary directories.</description>
	</property>
</configuration>
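Since the value gives no port, the NameNode RPC falls back to the default port 8020. After editing you can confirm what the cluster will actually use; hdfs getconf reads the effective configuration:

hdfs getconf -confKey fs.defaultFS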


hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>dfs.namenode.secondary.http-address</name>
		<value>hadoop2:9001</value>
		<description># Address for viewing HDFS status through the web UI</description>
	</property>
	<property>
		<name>dfs.namenode.name.dir</name>
		<value>/home/hadoop/dfs/name</value>
	</property>
	<property>
		<name>dfs.datanode.data.dir</name>
		<value>/home/hadoop/dfs/data</value>
	</property>
	<property>
		<name>dfs.replication</name>
		<value>2</value>
		<description># Each block has 2 replicas</description>
	</property>
	<property>
		<name>dfs.webhdfs.enabled</name>
		<value>true</value>
	</property>
</configuration>
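Because dfs.webhdfs.enabled is true, once the cluster is up (started in the steps below) you can exercise the REST API against the NameNode web port as a quick check; a sketch run from any container, assuming curl is available in the image:

curl "http://hadoop2:50070/webhdfs/v1/?op=LISTSTATUS"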

mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.address</name>
		<value>hadoop2:10020</value>
	</property>
	<property>
		<name>mapreduce.jobhistory.webapp.address</name>
		<value>hadoop2:19888</value>
	</property>
</configuration>
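Note that start-all.sh (used later) does not launch the JobHistory server, so the two jobhistory addresses above only take effect if you start it yourself with the stock daemon script:

/usr/local/hadoop/sbin/mr-jobhistory-daemon.sh start historyserver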

yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
	<!-- Site specific YARN configuration properties -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<property>
		<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
		<value>org.apache.hadoop.mapred.ShuffleHandler</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address</name>
		<value>hadoop2:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.address</name>
		<value>hadoop2:8030</value>
	</property>
	<property>
		<name>yarn.resourcemanager.resource-tracker.address</name>
		<value>hadoop2:8031</value>
	</property>
	<property>
		<name>yarn.resourcemanager.admin.address</name>
		<value>hadoop2:8033</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address</name>
		<value>hadoop2:8088</value>
	</property>
	<property>
		<name>yarn.nodemanager.resource.memory-mb</name>
		<value>1024</value>
	</property>
	<property>
		<name>yarn.nodemanager.resource.cpu-vcores</name>
		<value>1</value>
	</property>
</configuration>

slaves

hadoop3
hadoop4

Copy the configured Hadoop installation to hadoop3 and hadoop4:

scp  -rq /usr/local/hadoop   hadoop3:/usr/local
scp  -rq /usr/local/hadoop   hadoop4:/usr/local
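If you later change only the configuration, it is enough to re-sync etc/hadoop instead of the whole installation; a minimal sketch:

for h in hadoop3 hadoop4; do
  scp -q /usr/local/hadoop/etc/hadoop/* $h:/usr/local/hadoop/etc/hadoop/
done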

Format the NameNode (on hadoop2):

bin/hdfs namenode -format
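Format only once. If you ever need to re-format, first clear the name, data, and tmp directories on every node (the paths configured in hdfs-site.xml and core-site.xml above), otherwise the DataNodes will be rejected with a clusterID mismatch:

rm -rf /home/hadoop/dfs /home/hadoop/tmp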

On the master (hadoop2), run the start-all.sh script to start the cluster:

/usr/local/hadoop/sbin/start-all.sh

Check the startup result on each node:

On hadoop2:

[root@10c1a242c22e bin]# jps
643 ResourceManager
310 NameNode
492 SecondaryNameNode
956 Jps
[root@10c1a242c22e bin]# 

On hadoop3:

[root@8276aa51a958 /]# jps
369 Jps
153 DataNode
250 NodeManager
[root@8276aa51a958 /]# 

On hadoop4:

[root@ea17f5a50d5a /]# jps
144 NodeManager
263 Jps
47 DataNode
[root@ea17f5a50d5a /]# 
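With all daemons up, a quick smoke test confirms that both slaves registered and that a job runs end to end. Since yarn-site.xml caps each NodeManager at 1024 MB, the example job shrinks its container sizes to stay schedulable (the 512 MB values are illustrative):

hdfs dfsadmin -report | grep 'Live datanodes'
yarn node -list
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi \
  -D yarn.app.mapreduce.am.resource.mb=512 \
  -D mapreduce.map.memory.mb=512 \
  -D mapreduce.reduce.memory.mb=512 \
  2 10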

 

Note: you can commit hadoop2, hadoop3, and hadoop4 as images, which makes it convenient to recreate the containers later with different port mappings and similar changes.
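A sketch of the commit step itself (stopping the containers first keeps the committed filesystem consistent; the image names are illustrative):

docker stop hadoop2 hadoop3 hadoop4
docker commit hadoop2 hadoop2
docker commit hadoop3 hadoop3
docker commit hadoop4 hadoop4
docker rm hadoop2 hadoop3 hadoop4

The containers can then be recreated from the committed images: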

[root@localhost Hadoop]# docker run -itd --name hadoop2  --net mynetwork  --ip 172.19.0.2 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -p 8088:8088 -p 50070:50070 -p 19888:19888 hadoop2 /bin/bash
10c1a242c22efd92d8f9007f4f51f5ff6c9e4511daa6d5fd29152ab1ac43c0e5
[root@localhost Hadoop]# docker run -itd --name hadoop3  --net mynetwork  --ip 172.19.0.3 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -P hadoop3 /bin/bash
8276aa51a9584ba23aab9cbcc069a157ea34f95cb21eba67189f1bc7347cca81
[root@localhost Hadoop]# docker run -itd --name hadoop4  --net mynetwork  --ip 172.19.0.4 --add-host hadoop2:172.19.0.2 --add-host hadoop3:172.19.0.3 --add-host hadoop4:172.19.0.4 -d -P hadoop4 /bin/bash
ea17f5a50d5a1c5e2effe26c84e93387440debb91316026a9c7f5dc3700cca56
[root@localhost Hadoop]# 

 
