Reference:
GitHub - HillBamboo/hadoop-cluster-docker: Run Hadoop Custer within Docker Containers
1. pull docker image
sudo docker pull kiwenlau/hadoop:1.0
2. clone github repository
git clone https://github.com/kiwenlau/hadoop-cluster-docker
3. create hadoop network
sudo docker network create --driver=bridge hadoop
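Running the create command a second time fails because the network already exists. A minimal sketch of a repeatable version follows; the helper name ensure_network is made up for illustration, and the sketch assumes docker can be invoked without sudo.

```shell
# Create the bridge network only if it does not exist yet.
# ensure_network is a hypothetical helper, not part of the repository.
ensure_network() {
  name="$1"
  # `docker network inspect` exits non-zero when the network is unknown.
  if docker network inspect "$name" >/dev/null 2>&1; then
    echo "network $name already exists"
  else
    docker network create --driver=bridge "$name" >/dev/null
    echo "network $name created"
  fi
}
```

Usage: ensure_network hadoop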
4. start container
cd hadoop-cluster-docker
sudo ./start-container.sh
output:
start hadoop-master container...
- this starts 3 containers: 1 master and 2 slaves
- you are dropped into the /root directory of the hadoop-master container
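To confirm from a second terminal on the host that all three containers are up, something like the following could be used. It is only a sketch: the slave names hadoop-slave1 and hadoop-slave2 are assumptions (only hadoop-master appears in these notes), missing_containers is a made-up helper name, and docker is assumed to run without sudo.

```shell
# Print the name of every expected container that is not currently running.
# hadoop-slave1 / hadoop-slave2 are assumed names; adjust them to your setup.
missing_containers() {
  # One running container name per line.
  running="$(docker ps --format '{{.Names}}')"
  for name in "$@"; do
    # -x: whole-line match, -F: fixed string, -q: quiet.
    if ! printf '%s\n' "$running" | grep -qxF -- "$name"; then
      echo "$name"
    fi
  done
}
```

Usage: missing_containers hadoop-master hadoop-slave1 hadoop-slave2
Empty output means every listed container is running.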
5. start hadoop
./start-hadoop.sh
The cluster is now up and running.
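One way to check that the cluster really is up is to ask HDFS how many datanodes have registered. The sketch below is meant to be run inside the hadoop-master container. It assumes the hdfs command is on the PATH and that the report contains a line of the form "Live datanodes (N):", as Hadoop 2.x prints; live_datanodes is a made-up helper name.

```shell
# Extract N from the "Live datanodes (N):" line of `hdfs dfsadmin -report`.
# Prints nothing if the line is absent (assumed Hadoop 2.x report format).
live_datanodes() {
  hdfs dfsadmin -report 2>/dev/null |
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)).*/\1/p'
}
```

Usage: live_datanodes
With one master and two slaves, the count would be expected to match the number of slaves, provided each slave runs a datanode.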
The following steps are performed inside the hadoop-master container, except for docker cp, which is run on the host.
Create a code directory
mkdir code
Copy the files into the container (run this on the host)
docker cp wordcount_mapreduce hadoop-master:/root/code/
Change into the wordcount_mapreduce directory
cd code/wordcount_mapreduce
Upload the input file to HDFS
hadoop fs -put Harry_Potter_and_the_Sorcerers_Stone.txt /
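Without the -f flag, hadoop fs -put refuses to overwrite an existing file, so repeating the upload step fails. A small sketch of a repeatable upload, using hadoop fs -test -e to check for the file first; put_if_missing is a made-up helper name.

```shell
# Upload a local file to the HDFS root unless a file of that name is already there.
# put_if_missing is a hypothetical helper, not part of the repository.
put_if_missing() {
  file="$1"
  target="/$(basename "$file")"
  # `hadoop fs -test -e PATH` exits 0 when PATH exists in HDFS.
  if hadoop fs -test -e "$target"; then
    echo "$target already in HDFS, skipping"
  else
    hadoop fs -put "$file" / && echo "uploaded $file to $target"
  fi
}
```

Usage: put_if_missing Harry_Potter_and_the_Sorcerers_Stone.txt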
Run run.sh
bash run.sh
Note that run.sh has been modified to match the environment variables of this setup; see the following folder for details:
05hadoop/wordcount_mapreduce_docker
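The contents of run.sh are not reproduced in these notes. To sanity-check the job's output on a small sample, a local pipeline that mimics the map, shuffle, and reduce phases can serve as a reference. This is a stand-in under assumptions, not the author's job: it lowercases words and treats every non-letter as a separator, which the real mapper may not do. wordcount_local is a made-up name.

```shell
# Local stand-in for a word-count job. Reads text on stdin and
# prints "word<TAB>count" lines, sorted by word.
wordcount_local() {
  tr -cs '[:alpha:]' '\n' |         # map: one word per line
    tr '[:upper:]' '[:lower:]' |    # normalise case (assumption)
    sort |                          # shuffle: group identical words
    uniq -c |                       # reduce: count each group
    awk 'NF == 2 { print $2 "\t" $1 }'   # drop the empty-word line, reorder columns
}
```

Usage: wordcount_local < Harry_Potter_and_the_Sorcerers_Stone.txt | sort -k2,2nr | head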