Introduction to the Hadoop Distributed Cluster
This Hadoop setup is built with GitHub - kiwenlau/hadoop-cluster-docker: Run Hadoop Custer within Docker Containers.
First, note that the components of the Hadoop cluster must be started in the following order:
namenode -> datanode -> resourcemanager -> nodemanager -> historyserver
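The startup order above can be sketched as a small Python helper that brings the services up one at a time. This is only an illustration under stated assumptions: it assumes each component is a Compose service named exactly as in the order above (only `namenode` appears in the compose file shown in this post), and that the `docker compose` CLI is available. The function names `startup_commands` and `start_cluster` are hypothetical.

```python
import subprocess

# Startup order as stated in the text. Treating these as Compose service
# names is an assumption; only "namenode" is confirmed by the compose file.
STARTUP_ORDER = [
    "namenode",
    "datanode",
    "resourcemanager",
    "nodemanager",
    "historyserver",
]


def startup_commands(order=STARTUP_ORDER):
    """Return one `docker compose up -d <service>` command per service, in order."""
    return [["docker", "compose", "up", "-d", service] for service in order]


def start_cluster(order=STARTUP_ORDER):
    """Run the commands sequentially and stop at the first failure."""
    for command in startup_commands(order):
        # check=True raises CalledProcessError if a service fails to start.
        subprocess.run(command, check=True)
```

Calling `start_cluster()` from the directory that contains docker-compose.yml would start the containers in sequence. Note that `docker compose up -d` returns once a container has started, not once the Hadoop daemon inside it is ready, so a readiness check between steps may still be needed.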
You can monitor the status of the Hadoop cluster from a browser at the following addresses:
- Namenode: http://<dockerhadoop_IP_address>:9870/dfshealth.html#tab-overview
- History server: http://<dockerhadoop_IP_address>:8188/applicationhistory
- Datanode: http://<dockerhadoop_IP_address>:9864/
- Nodemanager: http://<dockerhadoop_IP_address>:8042/node
- Resource manager: http://<dockerhadoop_IP_address>:8088/
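Instead of opening each page by hand, the addresses above can be probed from a script. The following is a minimal sketch using only the Python standard library; the ports and paths are copied from the list above, while the helper names `ui_urls` and `check_uis` and the example host are hypothetical.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Port and path of each web UI, copied from the list above.
WEB_UIS = {
    "Namenode": (9870, "/dfshealth.html#tab-overview"),
    "History server": (8188, "/applicationhistory"),
    "Datanode": (9864, "/"),
    "Nodemanager": (8042, "/node"),
    "Resource manager": (8088, "/"),
}


def ui_urls(host):
    """Build the monitoring URL of every component for the given host."""
    return {
        name: f"http://{host}:{port}{path}"
        for name, (port, path) in WEB_UIS.items()
    }


def check_uis(host, timeout=3.0):
    """Return {component: bool}, True when the UI answers with HTTP 200."""
    status = {}
    for name, url in ui_urls(host).items():
        try:
            with urlopen(url, timeout=timeout) as response:
                status[name] = response.status == 200
        except (URLError, OSError):
            # Connection refused, timeout, DNS failure or HTTP error status.
            status[name] = False
    return status
```

For example, `check_uis("192.168.0.10")` (a placeholder address) returns a dictionary showing which of the five web UIs responded. A reachable UI only shows that the daemon's HTTP server is up; it does not by itself prove that the cluster is healthy.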
Preparing the docker-compose files
Prepare docker-compose.yml as follows:
version: "3"
services:
  namenode:
    image: bde2020/hadoop-namenode:2.0.0-hadoop3.1.3-java8
    container_name: namenode
    ports:
      - 9870:9870
    volumes:
      - hadoop_namenode:/hadoop/dfs/name
    environment:
      - CLUSTER_NAME