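In teaching big data, we often use virtualization to simulate a big data cluster and work through the various big data technology stacks. Automating the setup lets us recreate learning environments whenever we need them, so we can concentrate on the technologies themselves instead of repeating the tedious work of building and configuring environments.
I. Creating the Vagrant virtual cluster
Vagrant can build a virtualized big data cluster from a base operating system image with a single command. For how to create a cluster with Vagrant, see the author's article "Windows下虚拟机的自动化管理" (Automated Management of Virtual Machines on Windows) on liu9ang's CSDN blog. Our planned cluster layout is: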
master 192.168.99.11
node02 192.168.99.12
node03 192.168.99.13
mysqls 192.168.99.15
Create a vagrant directory under C:\ and, inside it, create a file named Vagrantfile.
$lasthost = "master"
Vagrant.configure("2") do |config|
config.vm.define "mysqls" do |mysqls|
mysqls.vm.box = "bento/almalinux-9"
mysqls.vm.hostname = "mysqls"
mysqls.vm.network :private_network, ip: "192.168.99.15"
mysqls.vm.synced_folder "/vagrant/share", "/home/vagrant/share"
mysqls.vm.provider "virtualbox" do |v|
# Name of the virtual machine
v.name = "mysqls"
# Memory size of the virtual machine (MB)
v.memory = 1024
# Number of CPUs
v.cpus = 1
end
mysqls.vm.provision "shell", path: "installmysql.sh", preserve_order: true, privileged: false
end
(2..3).each do |i|
config.vm.define "node0#{i}" do |node|
# Box for the virtual machine
node.vm.box = "bento/almalinux-9"
# Hostname of the virtual machine
node.vm.hostname = "node0#{i}"
# IP address of the virtual machine
node.vm.network "private_network", ip: "192.168.99.1#{i}"
# Folder shared between the host and the virtual machine
node.vm.synced_folder "/vagrant/share", "/home/vagrant/share"
# VirtualBox-specific settings
node.vm.provider "virtualbox" do |v|
# Name of the virtual machine
v.name = "node0#{i}"
# Memory size of the virtual machine (MB)
v.memory = 1024
# Number of CPUs
v.cpus = 1
end
node.vm.provision "shell", path: "clusterinit.sh", preserve_order: true, privileged: false
end
end
config.vm.define "master", primary: true do |master|
master.vm.box = "bento/almalinux-9"
master.vm.hostname = "master"
master.vm.network :private_network, ip: "192.168.99.11"
master.vm.synced_folder "/vagrant/share", "/home/vagrant/share"
master.vm.provider "virtualbox" do |v|
# Name of the virtual machine
v.name = "master"
# Memory size of the virtual machine (MB)
v.memory = 2048
# Number of CPUs
v.cpus = 1
end
master.vm.provision "shell", path: "clusterinit.sh", preserve_order: true, privileged: false
end
#provision
config.vm.provision "shell", path: "entrypoint.sh", privileged: false
config.trigger.after :up do |trigger|
trigger.name = "trigger"
trigger.info = "Running a trigger after VMs up!"
trigger.only_on = $lasthost
trigger.run_remote = {path: "triggerrun.sh"}
end
end
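The provisioning scripts that follow copy /vagrant/hosts.txt, a file that is not listed in this article. Since Vagrant by default syncs the project directory to /vagrant, the file would presumably sit next to the Vagrantfile. The following is only a sketch of what it might contain, assuming one "IP hostname" pair per line separated by a single space (the format the later cut command depends on). The temporary output directory is used here purely for illustration.

```shell
#!/bin/bash
set -e
# Hypothetical helper: write a hosts.txt that matches the cluster plan above.
# Assumption: one "IP hostname" pair per line, separated by exactly one space.
outdir=$(mktemp -d)
cat > "$outdir/hosts.txt" <<'EOF'
192.168.99.11 master
192.168.99.12 node02
192.168.99.13 node03
192.168.99.15 mysqls
EOF
cat "$outdir/hosts.txt"
```

In a real setup the same four lines would be saved as hosts.txt in the directory that holds the Vagrantfile.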
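Here we use AlmaLinux 9 as the base operating system for the virtual machines. The shell provisioner entrypoint.sh switches the package repositories to the Aliyun (Ali) mirror and installs the basic tools needed on every host in the cluster.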
#!/bin/bash
set -e
# Almalinux Ali repo
echo "===========>Switch to Ali repo..."
sudo sed -e 's|^mirrorlist=|#mirrorlist=|g' \
-e 's|^# baseurl=https://repo.almalinux.org|baseurl=https://mirrors.aliyun.com|g' \
-i.bak \
/etc/yum.repos.d/almalinux*.repo
sudo dnf makecache
sudo dnf install -y vim
sudo dnf install -y expect
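Before letting the sed command rewrite the real files under /etc/yum.repos.d, it can be tried on a throwaway copy. The sketch below does that; the sample repo content is an assumption modelled on a stock AlmaLinux repo file, and the file name almalinux-baseos.repo is chosen only so that it matches the almalinux*.repo pattern.

```shell
#!/bin/bash
set -e
# Work on a scratch directory instead of /etc/yum.repos.d.
workdir=$(mktemp -d)
# Assumed sample content; a real repo file contains more sections and keys.
cat > "$workdir/almalinux-baseos.repo" <<'EOF'
[baseos]
name=AlmaLinux $releasever - BaseOS
mirrorlist=https://mirrors.almalinux.org/mirrorlist/$releasever/baseos
# baseurl=https://repo.almalinux.org/almalinux/$releasever/BaseOS/$basearch/os/
enabled=1
EOF
# Same two substitutions as entrypoint.sh: comment out mirrorlist,
# enable baseurl and point it at the Aliyun mirror. A .bak copy is kept.
sed -e 's|^mirrorlist=|#mirrorlist=|g' \
    -e 's|^# baseurl=https://repo.almalinux.org|baseurl=https://mirrors.aliyun.com|g' \
    -i.bak \
    "$workdir"/almalinux*.repo
cat "$workdir/almalinux-baseos.repo"
```

Comparing the rewritten file with its .bak copy shows exactly which lines the provisioner would change.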
The shell provisioner installmysql.sh installs the MySQL database server (the MariaDB packages, as the script shows) and performs its basic configuration.
#!/bin/bash
set -e
echo "==============>Install mysql db..."
cp /vagrant/hosts.txt /home/vagrant/
sudo chown -R vagrant:vagrant /home/vagrant/hosts.txt
sudo sed -i '/127.0.1.1/d' /etc/hosts
cat /home/vagrant/hosts.txt | sudo tee -a /etc/hosts
sudo dnf install -y mariadb-server mariadb
sudo systemctl start mariadb
sudo systemctl enable mariadb
echo "-------------------------------"
mysql --version
echo "-------------------------------"
mysqlrootpasswd=123456
mysqlsecureinstall(){
echo "======>Mysql secure installation..."
/usr/bin/expect <<EOF
# expect reads its timeout from this script, not from bash
set timeout 30
spawn sudo mysql_secure_installation
expect {
"Enter current password" { send "\r"; exp_continue }
"Switch to unix_socket authentication" { send "Y\r"; exp_continue }
"Change the root password" { send "Y\r"; exp_continue }
#"Y/n" { send "Y\r"; exp_continue }
"New password" { send "$mysqlrootpasswd\r"; exp_continue }
"Re-enter new password" { send "$mysqlrootpasswd\r"; exp_continue }
"Remove anonymous users" { send "Y\r"; exp_continue }
"Disallow root login remotely" { send "n\r"; exp_continue }
"Remove test database and access to it" { send "n\r"; exp_continue }
"Reload privilege tables now" { send "Y\r" ; exp_continue }
}
EOF
}
rootremote(){
echo "=====>Grant root remote..."
/usr/bin/expect <<EOF
# expect reads its timeout from this script, not from bash
set timeout 30
spawn mysql -u root -p
expect "Enter password:" { send "$mysqlrootpasswd\r" }
expect "]>" { send "GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY '$mysqlrootpasswd';\r" }
expect "]>" { send "FLUSH PRIVILEGES;\r" }
expect "]>" { send "EXIT\r" }
expect eof
EOF
}
mysqlsecureinstall
rootremote
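Within the shell script, expect automates the interactive prompts for MySQL initialization and for granting remote access to root. Our automation relies heavily on expect for this kind of scripted interaction.
The shell provisioner clusterinit.sh handles the basic setup of the big data cluster, including SSH key generation and JDK installation.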
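The expect blocks in installmysql.sh are fed through an unquoted heredoc (<<EOF), which is why bash substitutes $mysqlrootpasswd before expect ever sees the text, while \r reaches expect unchanged. The sketch below shows that difference using cat in place of expect, so it runs without expect installed; demo_passwd is a hypothetical variable used only for this illustration.

```shell
#!/bin/bash
set -e
demo_passwd=example-secret   # hypothetical value, for illustration only

# Unquoted delimiter: bash expands $demo_passwd, but leaves \r alone.
expanded=$(cat <<EOF
send "$demo_passwd\r"
EOF
)

# Quoted delimiter: the text is passed through literally, no expansion.
literal=$(cat <<'EOF'
send "$demo_passwd\r"
EOF
)

printf '%s\n' "$expanded"
printf '%s\n' "$literal"
```

With a quoted delimiter, expect would receive the literal text $demo_passwd and try to resolve it as a Tcl variable, which is usually not what a provisioning script wants.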
#!/bin/bash
set -e
sudo dnf install -y java-1.8.0-openjdk-devel
echo "-------------------------------"
java -version
echo "-------------------------------"
sudo sed -i 's/^#\(PubkeyAuthentication yes\)/\1/' /etc/ssh/sshd_config
sudo sed -i 's/^#\(PermitRootLogin\) .*/\1 yes/' /etc/ssh/sshd_config
echo "StrictHostKeyChecking no" | sudo tee -a /etc/ssh/ssh_config
ssh-keygen -t rsa -f /home/vagrant/.ssh/id_rsa -P ''
#sudo chown -R vagrant:vagrant /home/vagrant/.ssh
cp /vagrant/sshcpid.sh /home/vagrant/
sudo chown vagrant:vagrant /home/vagrant/sshcpid.sh
cp /vagrant/hosts.txt /home/vagrant/
sudo chown vagrant:vagrant /home/vagrant/hosts.txt
sudo sed -i '/127.0.1.1/d' /etc/hosts
cat /home/vagrant/hosts.txt | sudo tee -a /etc/hosts
# javahome=$(ls -l /etc/alternatives/java |cut -d ' ' -f 11 |sed -e 's/^\(.*\)\/jre\/bin\/java/\1/')
# echo "export JAVA_HOME=$javahome" | tee -a /home/vagrant/.bashrc
echo "export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk" | tee -a /home/vagrant/.bashrc
echo "export PATH=\$JAVA_HOME/bin:\$PATH" | tee -a /home/vagrant/.bashrc
echo "export CLASSPATH=.:\$JAVA_HOME/lib/dt.jar:\$JAVA_HOME/lib/tools.jar" | tee -a /home/vagrant/.bashrc
#source .bashrc
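Because clusterinit.sh appends to .bashrc and /etc/hosts with tee -a, running the provisioner a second time would add the same lines again. One possible way to guard against that is sketched below; append_once is a hypothetical helper, not part of the original scripts, and a temporary file stands in for /home/vagrant/.bashrc.

```shell
#!/bin/bash
set -e
# Hypothetical helper: append a line to a file only if that exact line is absent.
append_once() {
  local line=$1 file=$2
  # -x matches whole lines, -F treats the line as a fixed string
  grep -qxF -- "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

rcfile=$(mktemp)   # stands in for /home/vagrant/.bashrc in this demo
append_once 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' "$rcfile"
append_once 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' "$rcfile"
append_once 'export PATH=$JAVA_HOME/bin:$PATH' "$rcfile"
cat "$rcfile"
```

The second call for JAVA_HOME finds the line already present and writes nothing, so the file ends up with one line per setting.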
Finally, the Vagrant trigger script triggerrun.sh works together with hosts.txt and sshcpid.sh to set up passwordless SSH login across the cluster.
triggerrun.sh
#!/bin/bash
set -e
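# Get all cluster hostnames. Note: in hosts.txt the IP and the hostname must be separated by exactly one space
# tail -n +1 starts output at line 1, i.e. it keeps every line
# cut -d ' ' -f 2 splits each line on a space and takes the second field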
if [ ! -e /home/vagrant/clusterhosts.txt ]; then
cp /home/vagrant/hosts.txt /home/vagrant/clusterhosts.txt
sed -i '/mysqls/d' /home/vagrant/clusterhosts.txt
fi
hostList=$(cat /home/vagrant/clusterhosts.txt | tail -n +1 | cut -d ' ' -f 2)
# Server and command settings
user=vagrant
passwd=vagrant
cmd=/home/vagrant/sshcpid.sh
# SSH login function
sshset(){
echo "======>$host ssh in $host..."
/usr/bin/expect <<EOF
# expect reads its timeout from this script, not from bash
set timeout 30
spawn ssh $user@$host $cmd
expect {
# "*yes/no*" { send "yes\r" ; exp_continue }
"*password*" { send "$passwd\r" ; exp_continue }
}
EOF
}
for host in $hostList
do
sshset
done
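The host list in triggerrun.sh depends on hosts.txt using exactly one space between IP and hostname, because cut treats every single space as a field boundary. A sketch of a more tolerant alternative using awk is shown below; the sample hosts.txt content is an assumption based on the cluster plan, with irregular spacing added on purpose.

```shell
#!/bin/bash
set -e
workdir=$(mktemp -d)
# Assumed sample file; note the extra spaces on the node lines.
printf '%s\n' \
  '192.168.99.11 master' \
  '192.168.99.12   node02' \
  '192.168.99.13  node03' \
  '192.168.99.15 mysqls' > "$workdir/hosts.txt"

# awk splits on any run of blanks, so the spacing no longer matters.
# The mysqls host is skipped, mirroring the sed '/mysqls/d' step.
hostList=$(awk '$2 != "mysqls" { print $2 }' "$workdir/hosts.txt")

for host in $hostList; do
  echo "would configure: $host"
done
```

With cut -d ' ' -f 2, the two node lines above would yield empty fields instead of hostnames, which is the failure the single-space note warns about.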
sshcpid.sh
#!/bin/bash
set -e
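# Get all cluster hostnames. Note: in hosts.txt the IP and the hostname must be separated by exactly one space
# tail -n +2 means start from line 2
# cut -d ' ' -f 2 splits each line on a space and takes the second field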
if [ ! -e /home/vagrant/clusterhosts.txt ]; then
cp /home/vagrant/hosts.txt /home/vagra