Assume a Spark image, ubuntu:spark, is already built.
How do you bring up the Hadoop/Spark cluster (i.e., run start-yarn.sh and start-all.sh) the moment the container starts?
All it takes is a shell script.
The command passed to docker run or docker exec is executed non-interactively, so the container's profile (where the Hadoop and Spark PATH entries are configured) is never sourced. Running start-yarn.sh and similar scripts this way therefore fails with `command not found`. My workaround, admittedly crude, is to inject the Hadoop and Spark executable directories into the container's environment directly from the run/exec command line.
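For context, this is the failure mode; a minimal reproduction, assuming a running container named `master` (a hypothetical name) based on ubuntu:spark:

```bash
# docker exec runs the command without sourcing the container's profile,
# so PATH lacks the Hadoop/Spark sbin directories and the script is not found:
docker exec master /bin/bash -c "start-yarn.sh"
# bash: start-yarn.sh: command not found
```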
- Enter the container to capture its PATH
```
birenjianmodeMacBook-Pro:spark birenjianmo$ docker run -it ubuntu:spark
root@99199dbfe94c:/# echo $PATH
/root/soft/apache/spark/spark-2.2.0-bin-hadoop2.7/bin:/root/soft/apache/spark/spark-2.2.0-bin-hadoop2.7/sbin:/opt/temp/jdk1.8.0_211/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/soft/apache/zookeeper/zookeeper-3.4.9/bin:/root/soft/apache/hadoop/hadoop-2.7.7/bin:/root/soft/apache/hadoop/hadoop-2.7.7/sbin:/root/soft/scala/scala-2.11.11/bin:/root/soft/apache/hive/apache-hive-2.3.4-bin/bin
```
- On the host, write a shell script that sets this PATH before starting the container; with the environment injected, the .sh scripts run without errors
PATH="/root/soft/apache/spark/spark-2.2.0-bin-hadoop2.7/bin:/root/soft/apache/spark/spark-2.2.0-bin-hadoop2.7/sbin:/opt/temp/jdk1.8.0_211/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/soft/apache/zookeeper/zookeeper-3.4.9/bin:/root/soft/apache/hadoop/hadoop-2.7.7/bin:/root/soft/apache/hadoop/hadoop-2.7.7/sbin:/root/soft/scala/scala-2.11.11/bin:/root/soft/apache/hive/apache-hive-2.3.4-bin/bin" docker run -itd -h master --name master -p 50070:50070 -p 50030:50030 -p 58088:8088 -p 58080:8080 ubuntu:spark /bin/bash -c "export PATH=$PATH:$PATH && /root/soft/shell/run_master.sh";