
docker搭建hadoop分布式集群


1. Prerequisites

1.1 CentOS 7 operating system

1.2 Install Docker on CentOS 7

Refer to [CentOS7安装docker.md].

1.3 Disable SELinux

[root@#localhost /]# vi /etc/selinux/config 
SELINUX=disabled
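
The edit above only takes effect after a reboot. The same change can be scripted with sed; the sketch below runs on a local copy of the file so it works anywhere. On the real host the target is /etc/selinux/config (root required), and `setenforce 0` additionally disables SELinux immediately without a reboot:

```shell
# Demo on a local copy; on the real host the target is /etc/selinux/config,
# followed by `setenforce 0` to disable SELinux for the running system.
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > selinux.conf.demo
sed -i 's/^SELINUX=.*/SELINUX=disabled/' selinux.conf.demo
grep '^SELINUX=' selinux.conf.demo   # prints: SELINUX=disabled
```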

2. Build the Hadoop base image

Build the image from a Dockerfile:

[root@#localhost ~]# mkdir /docker
[root@#localhost ~]# cd /docker
[root@#localhost docker]# mkdir centos-ssh-root
[root@#localhost docker]# cd centos-ssh-root/
[root@#localhost centos-ssh-root]# vi Dockerfile
# Start from an existing OS image
FROM centos

# Image author
MAINTAINER hzk

# Install openssh-server and sudo, and set sshd's UsePAM option to no
RUN yum install -y openssh-server sudo
RUN sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
# Install openssh-clients
RUN yum install -y openssh-clients

# Add user root with password root and add it to sudoers
RUN echo "root:root" | chpasswd
RUN echo "root   ALL=(ALL)       ALL" >> /etc/sudoers
# The next two lines are required on CentOS 6; without them sshd in the container rejects logins
RUN ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key
RUN ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key

# Start sshd and expose port 22
RUN mkdir /var/run/sshd
EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]
[root@#localhost centos-ssh-root]# docker build -t="hzk/centos-ssh-root" .
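
Two optional refinements, assuming a newer Docker release: MAINTAINER is deprecated (Docker 1.13+) in favor of LABEL, and the yum installs can be merged into a single RUN to produce fewer image layers. A sketch of the top of the file:

```dockerfile
# Optional variant of the Dockerfile above (assumes Docker >= 1.6 for LABEL)
FROM centos
LABEL maintainer="hzk"

# One layer instead of three: install sshd, the ssh client and sudo, then disable PAM for sshd
RUN yum install -y openssh-server openssh-clients sudo && \
    sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
```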

Check the newly built image:

[root@#localhost centos-ssh-root]# docker images
REPOSITORY                TAG                 IMAGE ID            CREATED              SIZE
hzk/centos-ssh-root       latest              e5e59dc176a8        28 minutes ago       315.9 MB
job1                      latest              b9d248c57b45        About an hour ago    1.093 MB
centos                    latest              970633036444        4 weeks ago          196.7 MB
hello-world               latest              c54a2cc56cbb        8 weeks ago          1.848 kB
busybox                   latest              2b8fd9751c4c        9 weeks ago          1.093 MB

3. Build a JDK image on top of the base image

JDK 1.7 is used.

[root@#localhost /]# cd /docker/
[root@#localhost docker]# mkdir centos-ssh-root-jdk
[root@#localhost docker]# cd centos-ssh-root-jdk/
[root@#localhost centos-ssh-root-jdk]# cp -rvf /usr/local/java7/ .
[root@#localhost centos-ssh-root-jdk]# vi Dockerfile
FROM hzk/centos-ssh-root
ADD java7 /usr/local
ENV JAVA_HOME /usr/local/java7
ENV PATH $JAVA_HOME/bin:$PATH
[root@#localhost centos-ssh-root-jdk]# docker build -t="hzk/centos-ssh-root-jdk" .

Check the newly built image:

[root@#localhost centos-ssh-root-jdk]# docker images
REPOSITORY                TAG                 IMAGE ID            CREATED              SIZE
hzk/centos-ssh-root-jdk   latest              06efe31698fd        About a minute ago   622.2 MB
hzk/centos-ssh-root       latest              e5e59dc176a8        28 minutes ago       315.9 MB
job1                      latest              b9d248c57b45        About an hour ago    1.093 MB
centos                    latest              970633036444        4 weeks ago          196.7 MB
hello-world               latest              c54a2cc56cbb        8 weeks ago          1.848 kB
busybox                   latest              2b8fd9751c4c        9 weeks ago          1.093 MB

4. Build a Hadoop image on top of the JDK image

Note that ADD automatically extracts a local tar archive into the target directory, which is why the Dockerfile below can rename /usr/local/hadoop-2.5.2 right after adding the tarball.

[root@#localhost centos-ssh-root-jdk]# cd ..
[root@#localhost docker]# mkdir centos-ssh-root-jdk-hadoop
[root@#localhost docker]# cd centos-ssh-root-jdk-hadoop/
[root@#localhost centos-ssh-root-jdk-hadoop]# cp /mnt/usb/hadoop-2.5.2.tar.gz .
[root@#localhost centos-ssh-root-jdk-hadoop]# vi Dockerfile
FROM hzk/centos-ssh-root-jdk
ADD hadoop-2.5.2.tar.gz /usr/local
RUN mv /usr/local/hadoop-2.5.2 /usr/local/hadoop
ENV HADOOP_HOME /usr/local/hadoop
ENV PATH $HADOOP_HOME/bin:$PATH
[root@#localhost centos-ssh-root-jdk-hadoop]# docker build -t="hzk/centos-ssh-root-jdk-hadoop" .

Check the newly built image:

[root@#localhost centos-ssh-root-jdk-hadoop]# docker images
REPOSITORY                       TAG                 IMAGE ID            CREATED             SIZE
hzk/centos-ssh-root-jdk-hadoop   latest              df8e55fd8b53        8 minutes ago       1.124 GB
hzk/centos-ssh-root-jdk          latest              06efe31698fd        48 minutes ago      622.2 MB
hzk/centos-ssh-root              latest              e5e59dc176a8        About an hour ago   315.9 MB
job1                             latest              b9d248c57b45        2 hours ago         1.093 MB
centos                           latest              970633036444        4 weeks ago         196.7 MB
hello-world                      latest              c54a2cc56cbb        8 weeks ago         1.848 kB
busybox                          latest              2b8fd9751c4c        9 weeks ago         1.093 MB

5. Set up the Hadoop distributed cluster

5.1 Cluster layout

The cluster has three nodes, one master and two slaves:

Master: hadoop0, IP 192.168.2.10

Slave 1: hadoop1, IP 192.168.2.11

Slave 2: hadoop2, IP 192.168.2.12

Because a Docker container gets a new IP address every time it restarts, the containers need fixed IPs. We assign them with pipework.
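
On a newer Docker release (1.10 or later), a user-defined network is an alternative to pipework: Docker itself can pin each container's address. This is a hedged sketch only, not used in the rest of this walkthrough; the network name hadoop-net is made up for the example:

```shell
# Alternative on Docker >= 1.10 (sketch): a user-defined bridge with a fixed
# subnet lets `docker run --ip` pin each container's address across restarts.
docker network create --driver bridge --subnet 192.168.2.0/24 hadoop-net
docker run --name hadoop0 --hostname hadoop0 --net hadoop-net --ip 192.168.2.10 \
    -d -p 50070:50070 -p 8088:8088 hzk/centos-ssh-root-jdk-hadoop
```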

5.2 Start three containers as hadoop0, hadoop1 and hadoop2

Run the following on the host to set each container's hostname and container name; hadoop0 additionally publishes ports 50070 and 8088:

[root@#localhost ~]# docker run --name hadoop0 --hostname hadoop0 -d -P -p 50070:50070 -p 8088:8088 hzk/centos-ssh-root-jdk-hadoop
[root@#localhost ~]# docker run --name hadoop1 --hostname hadoop1 -d -P hzk/centos-ssh-root-jdk-hadoop
[root@#localhost ~]# docker run --name hadoop2 --hostname hadoop2 -d -P hzk/centos-ssh-root-jdk-hadoop

Use docker ps to verify that the three containers are running:

[root@#localhost ~]# docker ps
CONTAINER ID        IMAGE                            COMMAND               CREATED              STATUS              PORTS                                                                     NAMES
d8e29afb5459        hzk/centos-ssh-root-jdk-hadoop   "/usr/sbin/sshd -D"   55 seconds ago       Up 54 seconds       0.0.0.0:32770->22/tcp                                                     hadoop2
acbdd32514eb        hzk/centos-ssh-root-jdk-hadoop   "/usr/sbin/sshd -D"   About a minute ago   Up About a minute   0.0.0.0:32769->22/tcp                                                     hadoop1
c82f06555e1a        hzk/centos-ssh-root-jdk-hadoop   "/usr/sbin/sshd -D"   About a minute ago   Up About a minute   0.0.0.0:8088->8088/tcp, 0.0.0.0:50070->50070/tcp, 0.0.0.0:32768->22/tcp   hadoop0
5.3 Assign fixed IPs to the three containers

Download pipework:

[root@#localhost ~]# wget https://codeload.github.com/jpetazzo/pipework/zip/master
[root@#localhost ~]# unzip master 
[root@#localhost ~]# mv pipework-master/ pipework
[root@#localhost ~]# cp -rp pipework/pipework /usr/local/bin/

Install bridge-utils:

[root@#localhost ~]# yum -y install bridge-utils

Create the bridge:

[root@#localhost ~]# brctl addbr br0
[root@#localhost ~]# ip link set dev br0 up
[root@#localhost ~]# ip addr add 192.168.2.1/24 dev br0

Assign a fixed IP to each container:

[root@#localhost ~]# pipework br0 hadoop0 192.168.2.10/24
[root@#localhost ~]# pipework br0 hadoop1 192.168.2.11/24
[root@#localhost ~]# pipework br0 hadoop2 192.168.2.12/24
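
One caveat: pipework injects the address directly into the container's network namespace, so the assignment does not survive a container restart. After restarting any container, re-run the assignment on the host, for example:

```shell
# pipework IPs are lost when a container restarts; re-apply them after each start
for i in 0 1 2; do
    pipework br0 hadoop$i 192.168.2.1$i/24
done
```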

Verify by pinging each of the three IPs; if they all respond, the addressing works:

[root@#localhost ~]# ping 192.168.2.10
PING 192.168.2.10 (192.168.2.10) 56(84) bytes of data.
64 bytes from 192.168.2.10: icmp_seq=1 ttl=64 time=0.257 ms
64 bytes from 192.168.2.10: icmp_seq=2 ttl=64 time=0.100 ms
64 bytes from 192.168.2.10: icmp_seq=3 ttl=64 time=0.114 ms
^C
--- 192.168.2.10 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.100/0.157/0.257/0.070 ms
[root@#localhost ~]# ping 192.168.2.11
PING 192.168.2.11 (192.168.2.11) 56(84) bytes of data.
64 bytes from 192.168.2.11: icmp_seq=1 ttl=64 time=0.323 ms
64 bytes from 192.168.2.11: icmp_seq=2 ttl=64 time=0.102 ms
64 bytes from 192.168.2.11: icmp_seq=3 ttl=64 time=0.109 ms
64 bytes from 192.168.2.11: icmp_seq=4 ttl=64 time=0.113 ms
^C
--- 192.168.2.11 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3002ms
rtt min/avg/max/mdev = 0.102/0.161/0.323/0.094 ms
[root@#localhost ~]# ping 192.168.2.12
PING 192.168.2.12 (192.168.2.12) 56(84) bytes of data.
64 bytes from 192.168.2.12: icmp_seq=1 ttl=64 time=0.239 ms
64 bytes from 192.168.2.12: icmp_seq=2 ttl=64 time=0.116 ms
64 bytes from 192.168.2.12: icmp_seq=3 ttl=64 time=0.100 ms
64 bytes from 192.168.2.12: icmp_seq=4 ttl=64 time=0.103 ms
^C
--- 192.168.2.12 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.100/0.139/0.239/0.058 ms
5.4 Configure the Hadoop cluster

First attach to hadoop0:

[root@#localhost ~]# docker exec -it hadoop0 /bin/bash

Map each hostname to its IP address by editing /etc/hosts in all three containers:

[root@hadoop0 /]# vi /etc/hosts
192.168.2.10    hadoop0
192.168.2.11    hadoop1
192.168.2.12    hadoop2
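
The same three lines must be appended inside each container. A small sketch that generates them, demonstrated on a local file so it runs anywhere; inside a container the target would be /etc/hosts (i.e. `... >> /etc/hosts`):

```shell
# Generate the three hostname-to-IP mappings; inside each container the
# real target file is /etc/hosts instead of hosts.demo.
rm -f hosts.demo
for i in 0 1 2; do
    echo "192.168.2.1$i    hadoop$i"
done >> hosts.demo
cat hosts.demo
```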

Set up passwordless SSH (press Enter at every prompt):

[root@hadoop0 /]# cd 
[root@hadoop0 ~]# ssh-keygen -t rsa

[root@hadoop0 ~]# ssh-copy-id -i localhost
[root@hadoop0 ~]# ssh-copy-id -i hadoop0
[root@hadoop0 ~]# ssh-copy-id -i hadoop1
[root@hadoop0 ~]# ssh-copy-id -i hadoop2
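
The "press Enter at every prompt" step can also be made non-interactive. A hedged sketch, writing to a demo path so it runs anywhere; in the containers the key path would be ~/.ssh/id_rsa:

```shell
# Non-interactive equivalent of accepting every ssh-keygen prompt:
# -N '' sets an empty passphrase, -q suppresses output, -f sets the key path.
# Demo path ./demo_rsa; in the containers you would use ~/.ssh/id_rsa.
rm -f demo_rsa demo_rsa.pub
ssh-keygen -t rsa -N '' -q -f ./demo_rsa
ls demo_rsa demo_rsa.pub
```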

On hadoop1, run:

[root@hadoop0 ~]# exit
exit
[root@#localhost ~]# docker exec -it hadoop1 /bin/bash
[root@hadoop1 /]# cd 
[root@hadoop1 ~]# ssh-keygen -t rsa

[root@hadoop1 ~]# ssh-copy-id -i localhost

On hadoop2, run:

[root@hadoop1 ~]# exit
exit
[root@#localhost ~]# docker exec -it hadoop2 /bin/bash
[root@hadoop2 /]# cd 
[root@hadoop2 ~]# ssh-keygen -t rsa

[root@hadoop2 ~]# ssh-copy-id -i localhost
[root@hadoop2 ~]# ssh-copy-id -i hadoop2