[iyunv@node2 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 320).
Press keys on your keyboard to generate entropy (bits = 384).
Press keys on your keyboard to generate entropy (bits = 448).
Press keys on your keyboard to generate entropy (bits = 616).
Press keys on your keyboard to generate entropy (bits = 680).
Press keys on your keyboard to generate entropy (bits = 752).
Press keys on your keyboard to generate entropy (bits = 816).
Press keys on your keyboard to generate entropy (bits = 936).
Press keys on your keyboard to generate entropy (bits = 1000).
Writing corosync key to /etc/corosync/authkey.    # the key was generated successfully
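Pacemaker enables STONITH by default, but this cluster has no STONITH device, so the default configuration is not usable yet. This can be verified with the following command:

[iyunv@node2 corosync]# crm_verify -L -V
   error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
# There is no STONITH device; for this experimental setup the errors can be ignored.

Note: STONITH stands for "Shoot The Other Node In The Head" and is a component of the Heartbeat package. It lets the cluster automatically reset a failed server through a remote power device attached to a healthy server. Put simply, a STONITH device accepts a signal from one host and cuts power to a node that can no longer send heartbeat messages, which prevents the nodes from competing for the same resources.

Now stop node2. Because node2 can no longer send heartbeat messages, node3 assumes node2 has failed and immediately becomes the DC, and neither node has quorum (partition WITHOUT quorum). Start node2 again and both nodes have quorum (partition with quorum).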
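For a lab cluster with no fencing device, the errors reported by crm_verify are commonly silenced by switching off the stonith-enabled property that the error text itself mentions. The script below is a minimal sketch of that step, not the author's exact procedure. DRY_RUN and the helper names run and disable_stonith are introduced here for illustration only; by default the script just prints the commands, and it touches the cluster only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: disable STONITH on a test cluster, then re-check the configuration.
# DRY_RUN is a hypothetical switch added for this example (1 = print only).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

disable_stonith() {
    # stonith-enabled is the cluster property named in the crm_verify error output.
    run crm configure property stonith-enabled=false
    # Re-validate the live CIB; the STONITH errors should no longer appear.
    run crm_verify -L -V
}

disable_stonith
```

Leave STONITH enabled on any cluster that holds shared data; the setting above is only reasonable for experiments.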
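Installing the crmsh package:

What is crmsh?
Pacemaker itself is only a resource manager; we need an interface to define and manage the resources it controls, and crmsh is that configuration interface. Starting with Pacemaker 1.1.8, crmsh became an independent project and is no longer shipped with Pacemaker. crmsh provides an interactive command-line interface for managing a Pacemaker cluster. It offers more powerful management functions, is easier to use, and is widely deployed. A similar tool is pcs.

Note: configuration made through the crm interface is synchronized to every node.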
CentOS 6 does not officially provide a crmsh package. Download location for corosync 2.x and crmsh for CentOS 6:
[iyunv@essun corosync]# crm
crm(live)# help    # list the available commands (top-level subcommands)
This is crm shell, a Pacemaker command line interface.
Available commands:
    cib              manage shadow CIBs                   # CIB sandbox
    resource         resources management                 # all resources are managed under this subcommand
    configure        CRM cluster configuration            # edit the cluster configuration
    node             nodes management                     # cluster node management
    options          user preferences                     # user preferences
    history          CRM cluster history
    site             Geo-cluster support
    ra               resource agents information center   # resource agent subcommand (everything related to resource agents lives here)
    status           show cluster status                  # show the current cluster status
    help,?           show help (help topics for list of topics)    # show the commands available at the current level
    end,cd,up        go back one level                    # go back up one level, to crm(live)#
    quit,bye,exit    exit the program                     # leave the crm(live) interactive mode
crm(live)resource# help
Available commands:
    status           show status of resources             # show resource status
    start            start a resource                     # start a resource
    stop             stop a resource                      # stop a resource
    restart          restart a resource                   # restart a resource
    promote          promote a master-slave resource      # promote a master-slave resource
    demote           demote a master-slave resource       # demote a master-slave resource
    manage           put a resource into managed mode
    unmanage         put a resource into unmanaged mode
    migrate          migrate a resource to another node   # move a resource to another node
    unmigrate        unmigrate a resource to another node
    param            manage a parameter of a resource     # manage resource parameters
    secret           manage sensitive parameters          # manage sensitive parameters
    meta             manage a meta attribute              # manage meta attributes
    utilization      manage a utilization attribute
    failcount        manage failcounts                    # manage failure counters
    cleanup          cleanup resource status              # clean up resource status
    refresh          refresh CIB from the LRM status      # refresh the CIB (Cluster Information Base) from the LRM (Local Resource Manager)
    reprobe          probe for resources not started by the CRM    # probe for resources that were not started by the CRM
    trace            start RA tracing                     # enable resource agent (RA) tracing
    untrace          stop RA tracing                      # disable resource agent (RA) tracing
    help             show help (help topics for list of topics)    # show help
    end              go back one level                    # go back one level, to crm(live)#
    quit             exit the program                     # leave the interactive shell
crm(live)configure# help
Available commands:
    node             define a cluster node                # define a cluster node
    primitive        define a resource                    # define a resource
    monitor          add monitor operation to a primitive # add a monitor operation to a resource (for example a timeout, or the action taken after a failed start)
    group            define a group                       # define a group (bundles several resources together)
    clone            define a clone                       # define a clone (you can set the total number of clones and how many may run on each node)
    ms               define a master-slave resource       # define a master-slave resource (only one node in the cluster runs the master; the others run slaves as standby)
    rsc_template     define a resource template           # define a resource template
    location         a location preference                # define a location constraint (which node a resource runs on by default; when location scores are equal, the resource runs on the node with the higher default preference)
    colocation       colocate resources                   # colocation constraint (how strongly resources should stay together)
    order            order resources                      # the order in which resources start
    rsc_ticket       resources ticket dependency
    property         set a cluster property               # set cluster properties
    rsc_defaults     set resource defaults                # set resource defaults (such as stickiness)
    fencing_topology node fencing order                   # the order in which nodes are fenced
    role             define role access rights            # define access rights for a role
    user             define user access rights            # define access rights for a user
    op_defaults      set resource operations defaults     # set defaults for resource operations
    schema           set or display current CIB RNG schema
    show             display CIB objects                  # display CIB objects
    edit             edit CIB objects                     # edit CIB objects (in vim)
    filter           filter CIB objects                   # filter CIB objects
    delete           delete CIB objects                   # delete CIB objects
    default-timeouts set timeouts for operations to minimums from the meta-data
    rename           rename a CIB object                  # rename a CIB object
    modgroup         modify group                         # change a resource group
    refresh          refresh from CIB                     # re-read the CIB
    erase            erase the CIB                        # wipe the CIB
    ptest            show cluster actions if changes were committed
    rsctest          test resources as currently configured
    cib              CIB shadow management
    cibstatus        CIB status management and editing
    template         edit and import a configuration from a template
    commit           commit the changes to the CIB        # write the pending changes to the CIB
    verify           verify the CIB with crm_verify       # check the CIB syntax
    upgrade          upgrade the CIB to version 1.0
    save             save the CIB to a file               # export the current CIB to a file (saved in the directory you were in before entering crm)
    load             import the CIB from a file           # load the CIB from a file
    graph            generate a directed graph
    xml              raw xml
    help             show help (help topics for list of topics)    # show help
    end              go back one level                    # go back to the first level, crm(live)#
node subcommand    # node management and status
crm(live)# node
crm(live)node# help
Node management and status commands.
Available commands:
    status           show nodes status as XML             # show node status as XML
    show             show node                            # show node status in plain command-line form
    standby          put node into standby                # simulate the node going offline (a node named after standby must be given by its FQDN)
    online           set node online                      # bring the node back online
    maintenance      put node into maintenance mode
    ready            put node into ready mode
    fence            fence node                           # fence a node
    clearstate       Clear node state                     # clear the node's state information
    delete           delete node                          # delete a node
    attribute        manage attributes
    utilization      manage utilization attributes
    status-attr      manage status attributes
    help             show help (help topics for list of topics)
    end              go back one level
    quit             exit the program
ra subcommand    # all resource agent classes are found here
crm(live)# ra
crm(live)ra# help
Available commands:
    classes          list classes and providers           # list the resource agent classes
    list             list RA for a class (and provider)   # show the resource agents available in a class
    meta             show meta data for a RA              # show the parameters a resource agent accepts (for example: meta ocf:heartbeat:IPaddr2)
    providers        show providers for a RA and a class
    help             show help (help topics for list of topics)
    end              go back one level
    quit             exit the program
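The agent names used in this walkthrough follow a class:provider:agent pattern (ocf:heartbeat:IPaddr2), while classes without providers use only class:agent (lsb:httpd). As a small illustration of how such a name breaks down, the sketch below splits a spec into its parts. The function name describe_ra is made up for this example and is not part of crmsh, and the function assumes the spec contains at least one colon.

```shell
#!/bin/sh
# Sketch: split a resource agent spec into class, provider and agent name.
# describe_ra is a hypothetical helper, not a crmsh command.
describe_ra() {
    spec="$1"
    class="${spec%%:*}"     # text before the first colon
    rest="${spec#*:}"       # text after the first colon
    case "$rest" in
        *:*)                # three-part form, e.g. ocf:heartbeat:IPaddr2
            provider="${rest%%:*}"
            agent="${rest#*:}"
            ;;
        *)                  # two-part form, e.g. lsb:httpd (no provider)
            provider="-"
            agent="$rest"
            ;;
    esac
    echo "class=$class provider=$provider agent=$agent"
}

describe_ra ocf:heartbeat:IPaddr2
describe_ra lsb:httpd
```

The three parts map directly onto the ra subcommands: classes lists the classes, providers lists the providers within a class, and meta shows the parameters of one specific agent.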
crm(live)# node
crm(live)node# standby
crm(live)node# cd ..
crm(live)# status
Last updated: Sat Jan 3 18:47:37 2015
Last change: Sat Jan 3 18:46:17 2015
Stack: classic openais (with plugin)
Current DC: node3.zhangjian.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured

Node node2.zhangjian.com: standby    # the current node is now in standby
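The same standby test can also be driven from a regular shell, because crm accepts its subcommands as arguments. The script below is a rough sketch under that assumption. DRY_RUN and the helper names run and cycle_standby are invented for this example, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: put a node into standby, look at the cluster, then bring it back.
# DRY_RUN is a hypothetical switch added for this example (1 = print only).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# cycle_standby NODE: NODE is the node name as the cluster knows it.
cycle_standby() {
    node="$1"
    run crm node standby "$node"   # resources should move off this node
    run crm status                 # check where the resources ended up
    run crm node online "$node"    # make the node eligible again
}

cycle_standby node2.zhangjian.com
```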
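# If one of the nodes is stopped, the resource disappears instead of moving to the other node. This is a two-node cluster, so when either node fails the remaining node cannot win a vote, and status reports WITHOUT quorum. There are two ways to solve this:
1. Configure an arbitration (quorum) node.
2. Ignore the loss of quorum.

Note: ignoring quorum can lead to a split-brain cluster and is not recommended in production.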
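The rule behind this behaviour is simple majority: a partition has quorum only when it holds more than half of the expected votes. The sketch below works through that arithmetic, assuming one vote per node; the function names quorum_needed and has_quorum are made up for illustration.

```shell
#!/bin/sh
# Sketch of majority quorum arithmetic, assuming one vote per node.

# Votes needed for a majority: floor(expected / 2) + 1.
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum EXPECTED_VOTES LIVE_VOTES
# Succeeds when the live partition holds a majority of the expected votes.
has_quorum() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}

for expected in 2 3; do
    echo "expected=$expected needs $(quorum_needed "$expected") votes"
done

# The two-node case from this walkthrough: one node stopped, one still up.
if has_quorum 2 1; then
    echo "2 expected votes, 1 live: with quorum"
else
    echo "2 expected votes, 1 live: WITHOUT quorum"
fi
```

With two expected votes the threshold is two, so a single surviving node can never reach it. Adding a third voting node keeps the threshold at two, which is why an arbitration node lets the cluster survive the loss of one member.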
[iyunv@node2 ~]# crm
crm(live)# configure
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# commit
crm(live)configure# cd ..
crm(live)# status
Last updated: Sat Jan 3 20:51:19 2015
Last change: Sat Jan 3 20:51:08 2015
Stack: classic openais (with plugin)
Current DC: node2.zhangjian.com - partition WITHOUT quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
1 Resources configured
[iyunv@node2 html]# crm
crm(live)# configure
crm(live)configure# primitive webserver lsb:httpd op monitor interval=30s timeout=15s
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd
crm(live)# status
Last updated: Sat Jan 3 21:25:49 2015
Last change: Sat Jan 3 21:25:45 2015
Stack: classic openais (with plugin)
Current DC: node2.zhangjian.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
crm(live)# configure
crm(live)configure# colocation webserver_with_webip inf: webserver webip    # keep the two resources on the same node
crm(live)configure# show    # check that the new constraint is present
crm(live)configure# commit
crm(live)configure# cd ..
crm(live)# status
Last updated: Sat Jan 3 21:30:14 2015
Last change: Sat Jan 3 21:30:08 2015
Stack: classic openais (with plugin)
Current DC: node2.zhangjian.com - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
2 Resources configured
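A colocation constraint only keeps the two resources on the same node; it does not say which one starts first. The order command listed in the configure help covers that. The script below is a hedged sketch that assumes the IP address should come up before the web server. The constraint id webip_before_webserver is hypothetical, and the inf: score simply mirrors the colocation line above; other crmsh releases may expect a kind keyword in its place, so check the order help on your version. DRY_RUN and the helper names are also introduced only for this example, and nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: add an ordering constraint so webip starts before webserver.
# DRY_RUN is a hypothetical switch added for this example (1 = print only).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

add_start_order() {
    first="$1"    # resource that must start first
    then_rsc="$2" # resource that starts afterwards
    # The constraint id is built from the two resource names (hypothetical naming scheme).
    run crm configure order "${first}_before_${then_rsc}" inf: "$first" "$then_rsc"
    run crm configure verify
}

add_start_order webip webserver
```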