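I. The Puppet master/agent model

Puppet accepts requests from agents through the puppetmaster service running on the master. The classes applied to each agent are declared in /etc/puppet/manifests/site.pp, keyed by the agent's FQDN. The first time the puppet master daemon starts, it initializes its runtime environment automatically, creating a local CA and the server-side certificate and keys. Once initialization is complete, the master listens on its configured socket and waits for client connections. By default the certificates and keys are stored under /var/lib/puppet/ssl/.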


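Introduction to MCollective

MCollective is a scheduler. It addresses the performance and speed degradation that occurs when many puppet agents send requests to the master at the same time. It can classify nodes by their attributes and run different tasks on different groups. It also acts as a control console for both clients and servers, so puppet agents no longer need to run on a timer.

MCollective also uses a client/server architecture, and its clients and servers communicate through middleware.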


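Puppet architecture and clustering

Puppet is usually deployed in a client/server architecture, and it runs into performance problems when there are too many agents.

Common clustering options: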

puppet + nginx

puppet + passenger + apache

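How a Puppet cluster is built

puppetmaster cluster:

An active/active high-availability cluster, which spreads the load of agent requests across the puppetmasters

A reverse proxy, which distributes requests for port 8140 across multiple puppetmasters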


The figure below shows how the master/agent model works:

[Figure: master/agent model diagram]



II. Lab environment

192.168.30.116  OS:CentOS 6.4 x86_64 node1.luojianlong.com

192.168.30.117  OS:CentOS 6.4 x86_64 node2.luojianlong.com

192.168.30.119  OS:CentOS 6.4 x86_64 node3.luojianlong.com


Required packages:

puppet-2.7.23-1.el6.noarch.rpm

puppet-server-2.7.23-1.el6.noarch.rpm

facter-1.7.3-1.el6.x86_64.rpm

puppet-dashboard-1.2.23-1.el6.noarch.rpm

mysql-5.5.33-linux2.6-x86_64.tar.gz




First, install the master on node1

# Set up the hosts file on every node
[root@node1 ~]# cat /etc/hosts
192.168.30.116 node1.luojianlong.com
192.168.30.117 node2.luojianlong.com
192.168.30.119 node3.luojianlong.com
# Upgrade facter
[root@node1 ~]# rpm -Uvh facter-1.7.3-1.el6.x86_64.rpm
# Configure the EPEL repository
[root@node1 ~]# cat /etc/yum.repos.d/epel.repo
[epel]
name=epel
baseurl=http://mirrors.sohu.com/fedora-epel/6/$basearch/
gpgcheck=1
gpgkey=http://mirrors.sohu.com/fedora-epel/RPM-GPG-KEY-EPEL-6
# Install puppet and puppet-server
[root@node1 ~]# yum -y localinstall puppet-2.7.23-1.el6.noarch.rpm
[root@node1 ~]# yum -y localinstall puppet-server-2.7.23-1.el6.noarch.rpm

Install the puppet agent on node2 and node3

[root@node2 ~]# rpm -Uvh facter-1.7.3-1.el6.x86_64.rpm
[root@node2 ~]# yum -y localinstall puppet-2.7.23-1.el6.noarch.rpm
[root@node3 ~]# rpm -Uvh facter-1.7.3-1.el6.x86_64.rpm
[root@node3 ~]# yum -y localinstall puppet-2.7.23-1.el6.noarch.rpm


Create and configure the module on node1

[root@node1 ~]# mkdir -pv /etc/puppet/modules/nginx/{manifests,files,lib,templates,tests,spec}
mkdir: created directory `/etc/puppet/modules/nginx'
mkdir: created directory `/etc/puppet/modules/nginx/manifests'
mkdir: created directory `/etc/puppet/modules/nginx/files'
mkdir: created directory `/etc/puppet/modules/nginx/lib'
mkdir: created directory `/etc/puppet/modules/nginx/templates'
mkdir: created directory `/etc/puppet/modules/nginx/tests'
mkdir: created directory `/etc/puppet/modules/nginx/spec'
[root@node1 ~]# puppet module list
/etc/puppet/modules
└── nginx (???)
/usr/share/puppet/modules (no modules installed)
# Define init.pp in the nginx module
[root@node1 ~]# vi /etc/puppet/modules/nginx/manifests/init.pp
class nginx {
        package {'nginx':
              ensure => installed,
        }
}
# Define nginx_web.pp
[root@node1 ~]# vi /etc/puppet/modules/nginx/manifests/nginx_web.pp
class nginx::nginx_web inherits nginx {
        file {'/etc/nginx/nginx.conf':
            ensure => file,
            source => 'puppet:///modules/nginx/nginx-web.conf',
            mode => '0644',
            owner => 'root',
            group => 'root',
            notify => Service['nginx'],
            require => Package['nginx'],
        }
        service {'nginx':
            ensure => running,
        }
}
# Prepare the source file
[root@node1 ~]# cp /tmp/nginx.conf /etc/puppet/modules/nginx/files/nginx-web.conf
# Create site.pp and include the class defined above
[root@node1 ~]# vi /etc/puppet/manifests/site.pp
node 'node2.luojianlong.com' {
        include nginx::nginx_web
}
node 'node3.luojianlong.com' {
        include nginx::nginx_web
}


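The first time, the puppet master can be started in the foreground (non-daemonized) with verbose output so that the initialization can be observed. The run below shows each step: creating the local CA, the local host (acting as the puppet server) requesting a certificate from the CA, obtaining the certificate, and the CA removing the certificate signing request. The service then starts and waits for agent connections. Adding the --debug option to the command below gives even more detailed output.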

[root@node1 ~]# puppet master --verbose --no-daemonize
info: Creating a new SSL key for ca
info: Creating a new SSL certificate request for ca
info: Certificate Request fingerprint (md5): E0:74:ED:BA:83:EC:6E:A7:1A:1F:89:B1:CC:81:C3:CE
notice: Signed certificate request for ca
notice: Rebuilding inventory file
info: Creating a new certificate revocation list
info: Creating a new SSL key for node1.luojianlong.com
info: Creating a new SSL certificate request for node1.luojianlong.com
info: Certificate Request fingerprint (md5): 05:F1:37:DE:6E:13:CA:32:46:5B:07:2A:05:DE:D1:12
notice: node1.luojianlong.com has a waiting certificate request
notice: Signed certificate request for node1.luojianlong.com
notice: Removing file Puppet::SSL::CertificateRequest node1.luojianlong.com at '/var/lib/puppet/ssl/ca/requests/node1.luojianlong.com.pem'
notice: Removing file Puppet::SSL::CertificateRequest node1.luojianlong.com at '/var/lib/puppet/ssl/certificate_requests/node1.luojianlong.com.pem'
notice: Starting Puppet master version 2.7.23


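Note: if a puppet process on this host was previously started and initialized under a different hostname, or for any other reason, its existing certificate files will not match this start-up. In that case, empty the /var/lib/puppet/ssl/ directory first so that the initialization can complete.


If the test start above works, stop the foreground process and start the master as a daemon. On CentOS 6 this is usually done with the following commands.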

[root@node1 ~]# service puppetmaster start
Starting puppetmaster:                                     [  OK  ]
[root@node1 ~]# chkconfig puppetmaster on


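Starting the puppet agents

On its first start, a puppet agent requests a certificate from the puppet server it has been pointed at and then completes the connection. For the same testing reasons, the first agent nodes to join the setup can be started in the foreground to observe their initialization, as the commands below show.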

[root@node2 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node2.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node2.luojianlong.com
info: Certificate Request fingerprint (md5): 11:56:36:0D:A5:92:11:69:AC:66:46:1B:86:D9:B4:ED
[root@node3 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node3.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node3.luojianlong.com
info: Certificate Request fingerprint (md5): A3:70:BF:52:F9:11:DA:0F:09:8B:35:C6:FC:EB:87:14


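At this point, use the puppet cert command on the puppet server to manage the agents' certificate requests. The --list option shows the agents waiting for a certificate to be signed, and --sign signs the certificate for a given node. To sign the pending requests of several nodes at once, add the --all option.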

[root@node1 ~]# puppet cert --list
  "node2.luojianlong.com" (11:56:36:0D:A5:92:11:69:AC:66:46:1B:86:D9:B4:ED)
  "node3.luojianlong.com" (A3:70:BF:52:F9:11:DA:0F:09:8B:35:C6:FC:EB:87:14)
[root@node1 ~]# puppet cert --sign node2.luojianlong.com
notice: Signed certificate request for node2.luojianlong.com
notice: Removing file Puppet::SSL::CertificateRequest node2.luojianlong.com at '/var/lib/puppet/ssl/ca/requests/node2.luojianlong.com.pem'
[root@node1 ~]# puppet cert --sign node3.luojianlong.com
notice: Signed certificate request for node3.luojianlong.com
notice: Removing file Puppet::SSL::CertificateRequest node3.luojianlong.com at '/var/lib/puppet/ssl/ca/requests/node3.luojianlong.com.pem'
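
When many agents are waiting, it can help to pull just the hostnames out of the listing and review them before signing. The following is a minimal sketch in POSIX shell and is not part of the original setup: the function name pending_hosts is hypothetical, and the sample text reuses the listing shown above instead of calling puppet.

```shell
#!/bin/sh
# Print only the quoted hostnames from `puppet cert --list`-style text on stdin.
# pending_hosts is an illustrative helper, not a Puppet command.
pending_hosts() {
    # Splitting on double quotes puts the hostname in field 2.
    awk -F'"' 'NF >= 3 { print $2 }'
}

# Sample listing copied from the transcript above.
# On the master this would be: puppet cert --list | pending_hosts
sample='  "node2.luojianlong.com" (11:56:36:0D:A5:92:11:69:AC:66:46:1B:86:D9:B4:ED)
  "node3.luojianlong.com" (A3:70:BF:52:F9:11:DA:0F:09:8B:35:C6:FC:EB:87:14)'

printf '%s\n' "$sample" | pending_hosts
```

Each printed name can then be checked against the expected host list before it is passed to puppet cert --sign.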


Once an agent receives its signed certificate, it prints output like the following.

[root@node2 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node2.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node2.luojianlong.com
info: Certificate Request fingerprint (md5): 11:56:36:0D:A5:92:11:69:AC:66:46:1B:86:D9:B4:ED
info: Caching certificate for node2.luojianlong.com
notice: Starting Puppet client version 2.7.23
info: Caching certificate_revocation_list for ca
info: Caching catalog for node2.luojianlong.com
info: Applying configuration version '1389325340'
notice: /Stage[main]/Nginx/Package[nginx]/ensure: created
notice: /Stage[main]/Nginx::Nginx_web/Service[nginx]/ensure: ensure changed 'stopped' to 'running'
info: Creating state file /var/lib/puppet/state/state.yaml
notice: Finished catalog run in 10.22 seconds
[root@node3 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose
info: Creating a new SSL key for node3.luojianlong.com
info: Caching certificate for ca
info: Creating a new SSL certificate request for node3.luojianlong.com
info: Certificate Request fingerprint (md5): A3:70:BF:52:F9:11:DA:0F:09:8B:35:C6:FC:EB:87:14
info: Caching certificate for node3.luojianlong.com
notice: Starting Puppet client version 2.7.23
info: Caching certificate_revocation_list for ca
info: Caching catalog for node3.luojianlong.com
info: Applying configuration version '1389325340'
notice: /Stage[main]/Nginx/Package[nginx]/ensure: created
notice: /Stage[main]/Nginx::Nginx_web/Service[nginx]/ensure: ensure changed 'stopped' to 'running'
info: Creating state file /var/lib/puppet/state/state.yaml
notice: Finished catalog run in 17.83 seconds


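Once the agent steps above work, the value passed with --server can be saved in the agent's configuration file, and puppet agent can be run as a service. The configuration file is /etc/puppet/puppet.conf, and the server directive goes in the [main] section. After that, start puppet as a service.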

[root@node2 ~]# vi /etc/puppet/puppet.conf
server = node1.luojianlong.com
[root@node3 ~]# vi /etc/puppet/puppet.conf
server = node1.luojianlong.com
[root@node2 ~]# service puppet start
Starting puppet:                                           [  OK  ]
[root@node3 ~]# service puppet start
Starting puppet:                                           [  OK  ]
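
After editing, it may be worth confirming that the setting landed in the intended section. The sketch below assumes a puppet.conf laid out as described above. The helper conf_get and the temporary file are illustrative only and are not part of Puppet. On a live agent, puppet agent --configprint server reports the value Puppet actually resolved.

```shell
#!/bin/sh
# conf_get FILE SECTION KEY: print the value of KEY inside [SECTION] of an
# INI-style file such as puppet.conf. Hypothetical helper for illustration.
conf_get() {
    awk -v section="$2" -v key="$3" '
        /^[ \t]*\[.*\][ \t]*$/ {            # section header, e.g. [main]
            current = $0
            gsub(/[ \t]/, "", current)
            current = substr(current, 2, length(current) - 2)
            next
        }
        current == section {
            line = $0
            sub(/^[ \t]+/, "", line)
            eq = index(line, "=")
            if (eq == 0 || substr(line, 1, 1) == "#") next   # skip comments and blanks
            name = substr(line, 1, eq - 1)
            sub(/[ \t]+$/, "", name)
            if (name == key) {
                value = substr(line, eq + 1)
                sub(/^[ \t]+/, "", value)
                sub(/[ \t]+$/, "", value)
                print value
                exit
            }
        }
    ' "$1"
}

# Demonstration on a throwaway copy instead of the real /etc/puppet/puppet.conf.
conf=$(mktemp)
trap 'rm -f "$conf"' EXIT
cat > "$conf" << 'EOF'
[main]
    server = node1.luojianlong.com
[agent]
    report = true
EOF

conf_get "$conf" main server
```

The same helper can be pointed at /etc/puppet/puppet.conf on node2 and node3 to compare their settings.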


Test again from the agents.

[root@node2 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose --test
info: Caching catalog for node2.luojianlong.com
info: Applying configuration version '1389325340'
notice: Finished catalog run in 0.97 seconds
[root@node3 ~]# puppet agent --server=node1.luojianlong.com --no-daemonize --verbose --test
info: Caching catalog for node3.luojianlong.com
info: Applying configuration version '1389325340'
notice: Finished catalog run in 0.95 seconds

The output above shows that the agents can now connect to the master normally.
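
Besides reading the log lines, a script can look at the exit status of the run. The --test option turns on detailed exit codes, so the status distinguishes a run with no changes from a run that applied changes or hit failures. The sketch below is a hedged illustration: describe_agent_exit is a hypothetical helper, and the loop only walks through the documented codes instead of running puppet.

```shell
#!/bin/sh
# Translate the exit status of `puppet agent --test` into a short message.
# describe_agent_exit is an illustrative helper, not a Puppet command.
describe_agent_exit() {
    case $1 in
        0) echo "ok: no changes were needed" ;;
        2) echo "ok: changes were applied" ;;
        4) echo "error: some resources failed" ;;
        6) echo "error: changes were applied, but some resources failed" ;;
        *) echo "error: run did not complete (exit code $1)" ;;
    esac
}

# On an agent this would be used as:
#   puppet agent --test; describe_agent_exit $?
# Here we only list the meaning of each detailed exit code.
for code in 0 2 4 6; do
    printf '%s -> %s\n' "$code" "$(describe_agent_exit "$code")"
done
```

Note that a status of 2 is a success, so a plain "non-zero means failure" check would misreport runs that applied changes.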

Check that nginx is installed and running on node2 and node3

[root@node2 ~]# rpm -q nginx
nginx-1.0.15-5.el6.x86_64
[root@node2 ~]# ps aux | grep nginx
root     19233  0.0  0.0  96432  1968 ?        Ss   12:18   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx    19234  0.0  0.0  96780  2612 ?        S    12:18   0:00 nginx: worker process              
root     19515  0.0  0.0 103248   820 pts/0    S+   12:22   0:00 grep nginx
[root@node3 ~]# rpm -q nginx
nginx-1.0.15-5.el6.x86_64
[root@node3 ~]# ps aux | grep nginx
root      3082  0.0  0.0  96432  1968 ?        Ss   12:18   0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
nginx     3083  0.0  0.0  96780  2612 ?        S    12:18   0:00 nginx: worker process              
root      3242  0.0  0.0 103248   824 pts/0    S+   12:22   0:00 grep nginx


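nginx has been installed and started as expected.


Automatic certificate signing

The master can be set to sign all certificate requests automatically: create an autosign.conf file under /etc/puppet and reference it from /etc/puppet/puppet.conf.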


[root@node1 ~]# cat > /etc/puppet/autosign.conf << EOF
> *.luojianlong.com
> EOF


[root@node1 ~]# vi /etc/puppet/puppet.conf
# Add a [master] section
[master]
autosign = /etc/puppet/autosign.conf
[root@node1 ~]# service puppetmaster restart
Stopping puppetmaster:                                     [  OK  ]
Starting puppetmaster:                                     [  OK  ]


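With this in place, certificate requests from any host under luojianlong.com are signed automatically. By default the puppet agent checks for updates every half hour. To change the interval, set runinterval in the [agent] section of /etc/puppet/puppet.conf on the agent and restart puppet. The value is in seconds and defaults to 1800.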
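
To get a feel for which hostnames a line such as *.luojianlong.com covers, the pattern can be tried against a few names. The sketch below uses ordinary shell pattern matching as an approximation. Puppet applies its own domain-name glob rules, which are not identical to shell globs, so treat the result as a rough preview only. The function matches_autosign and the host rogue.example.com are hypothetical.

```shell
#!/bin/sh
# matches_autosign CERTNAME PATTERN...: succeed if CERTNAME matches any pattern.
# Approximation only: shell globbing stands in for Puppet's domain-name globs.
matches_autosign() {
    certname=$1
    shift
    for pattern in "$@"; do
        case $certname in
            $pattern) return 0 ;;   # unquoted, so it is treated as a glob
        esac
    done
    return 1
}

for host in node2.luojianlong.com node3.luojianlong.com rogue.example.com; do
    if matches_autosign "$host" '*.luojianlong.com'; then
        echo "$host: would be auto-signed"
    else
        echo "$host: needs manual signing"
    fi
done
```

A wildcard entry signs any request whose name fits the pattern, so it is best kept as narrow as the environment allows.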


Install and configure puppet-dashboard on node1:

[root@node1 ~]# yum -y install rubygem-rake ruby-mysql
[root@node1 ~]# yum localinstall puppet-dashboard-1.2.23-1.el6.noarch.rpm -y
[root@node1 ~]# gem install rake


Install MySQL on node1

[root@node1 ~]# tar zxvf mysql-5.5.33-linux2.6-x86_64.tar.gz -C /usr/local/
[root@node1 ~]# ln -s /usr/local/mysql-5.5.33-linux2.6-x86_64 /usr/local/mysql
[root@node1 ~]# cd /usr/local/mysql
[root@node1 mysql]# useradd -r mysql
[root@node1 mysql]# mkdir /mydata/data -p
[root@node1 mysql]# chown -R root:mysql ./*
[root@node1 mysql]# chown -R mysql:mysql /mydata/data/
[root@node1 mysql]# cp support-files/mysql.server /etc/rc.d/init.d/mysqld
[root@node1 mysql]# chkconfig --add mysqld
[root@node1 mysql]# chkconfig mysqld on
[root@node1 mysql]# cp support-files/my-large.cnf /etc/my.cnf
[root@node1 mysql]# ./scripts/mysql_install_db --user=mysql --datadir=/mydata/data
[root@node1 mysql]# vi /etc/profile.d/mysql.sh
export PATH=/usr/local/mysql/bin:$PATH
[root@node1 mysql]# . /etc/profile.d/mysql.sh
[root@node1 mysql]# vi /etc/my.cnf
datadir = /mydata/data
innodb_file_per_table = 1
[root@node1 mysql]# service mysqld start
Starting MySQL..... SUCCESS!


Create the database and grant privileges

mysql> create database dashboard character set utf8;
Query OK, 1 row affected (0.00 sec)
mysql> grant all privileges on dashboard.* to 'dashboard'@'localhost' identified by '123456';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)


Edit the production section of /usr/share/puppet-dashboard/config/database.yml.

[root@node1 ~]# vi /usr/share/puppet-dashboard/config/database.yml
production:
  host: 127.0.0.1
  database: dashboard
  username: dashboard
  password: 123456
  encoding: utf8
  adapter: mysql
[root@node1 ~]# cd /usr/share/puppet-dashboard/
[root@node1 puppet-dashboard]# rake gems:refresh_specs
# Create the tables that dashboard needs in its database:
[root@node1 puppet-dashboard]# rake RAILS_ENV=production db:migrate


Test that the server works

[root@node1 ~]# /usr/share/puppet-dashboard/script/server -e production
=> Booting WEBrick
=> Rails 2.3.17 application starting on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
[2014-01-10 12:37:34] INFO  WEBrick 1.3.1
[2014-01-10 12:37:34] INFO  ruby 1.8.7 (2011-06-30) [x86_64-linux]
[2014-01-10 12:37:34] INFO  WEBrick::HTTPServer#start: pid=20641 port=3000


Open http://192.168.30.116:3000 in a browser




Configure the puppet master and agents

[root@node1 ~]# vi /etc/puppet/puppet.conf
# Add to the [master] section
reports = store, http
reporturl = http://192.168.30.116:3000/reports/upload
[root@node1 ~]# service puppetmaster restart
Stopping puppetmaster:                                     [  OK  ]
Starting puppetmaster:                                     [  OK  ]
[root@node2 ~]# vi /etc/puppet/puppet.conf
# Add to the [agent] section
report = true
[root@node2 ~]# service puppet restart
Stopping puppet:                                           [  OK  ]
Starting puppet:                                           [  OK  ]
# Do the same on node3: add the setting and restart puppet


Then start the dashboard

[root@node1 ~]# /usr/share/puppet-dashboard/script/server -e production -d


Open http://192.168.30.116:3000/ in a browser



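If a message such as "# pending task" appears with a number greater than 0, reports are being received normally. Any delayed tasks are recorded in the dashboard.


Using puppet kick

By default a puppet agent contacts the server every 30 minutes, but sometimes the server needs to push an urgent task to the agents. That is what puppet kick is for (before puppet 2.6 it was called puppetrun).

Edit /etc/puppet/puppet.conf on the agent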

[root@node2 ~]# vi /etc/puppet/puppet.conf
# Add to the [agent] section
listen = true
# Edit or create /etc/puppet/namespaceauth.conf
[root@node2 ~]# vi /etc/puppet/namespaceauth.conf
[puppetrunner]
allow *.luojianlong.com


Edit auth.conf

[root@node2 ~]# vi /etc/puppet/auth.conf
# Add the following lines
path /run
method save
allow node1.luojianlong.com
[root@node2 ~]# service puppet restart
Stopping puppet:                                           [  OK  ]
Starting puppet:                                           [  OK  ]
[root@node2 ~]# netstat -anptl | grep ruby
tcp        0      0 0.0.0.0:8139                0.0.0.0:*                   LISTEN      27053/ruby


Repeat the same steps on node3


Run the command on the master

[root@node1 ~]# puppet kick -a --host=node2.luojianlong.com
Triggering node2.luojianlong.com
Getting status
status is success
node2.luojianlong.com finished with exit code 0
Finished
[root@node1 ~]# puppet kick -a --host=node3.luojianlong.com
Triggering node3.luojianlong.com
Getting status
status is success
node3.luojianlong.com finished with exit code 0
Finished


The push works as expected.