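SFRAC installation errors (3): a stale state prevents ports f, v, and w from starting

While testing a new build today, I found after configuring and rebooting that VCS had brought up only five GAB ports.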

root@lxsfrac04 # gabconfig -a

GAB Port Memberships

===============================================================

Port a gen 70a501 membership 01

Port b gen 70a507 membership 01

Port d gen 70a503 membership 01

Port h gen 70a506 membership 01

Port o gen 70a509 membership 01
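Spotting the missing ports by eye is easy to get wrong, so here is a minimal sketch that compares gabconfig -a output against an expected port list. The function name missing_gab_ports and the EXPECTED_PORTS variable are made up for this example, and the expected list (a b d f h o v w) is an assumption taken from the healthy output shown later in this post, not a general rule.

```shell
#!/bin/sh
# Assumption: these are the ports a healthy node in THIS cluster shows
# (taken from the post-fix gabconfig output); adjust for your own stack.
EXPECTED_PORTS="a b d f h o v w"

# Hypothetical helper: reads `gabconfig -a` output on stdin and prints
# the expected port letters that do not appear in it.
missing_gab_ports() {
    # Collect the second field of every "Port x ..." line as " a b d ... ".
    seen=" $(awk '$1 == "Port" { print $2 }' | tr '\n' ' ')"
    missing=""
    for p in $EXPECTED_PORTS; do
        case "$seen" in
            *" $p "*) ;;                      # port is present
            *) missing="$missing $p" ;;       # port is absent
        esac
    done
    echo "${missing# }"
}
```

On a live node this would be used as: gabconfig -a | missing_gab_ports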

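The cause: while running a tc (test case), one step modified the VCS configuration files, and the system ran haconf -makerw as part of that step. I have run into this before; in general, when ports f, v, and w fail to start, the problem lies with VCS.

First, check the VCS engine log.

lxsfrac04# tail -f /var/VRTSvcs/log/engine_A.log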

…………………………………………………………………………………………….

2008/06/02 10:29:43 VCS NOTICE V-16-1-10114 Opening GAB library

2008/06/02 10:29:43 VCS NOTICE V-16-1-10619 'HAD' starting on: lxsfrac04

2008/06/02 10:29:43 VCS ERROR V-16-1-10624 Local cluster configuration stale

2008/06/02 10:29:43 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms

2008/06/02 10:29:47 VCS INFO V-16-1-10077 Received new cluster membership

2008/06/02 10:29:47 VCS NOTICE V-16-1-10080 System (lxsfrac04) - Membership: 0x3, Jeopardy: 0x0

2008/06/02 10:29:47 VCS NOTICE V-16-1-10322 System(Node '1') changed state from UNKNOWN to INITING

2008/06/02 10:29:47 VCS NOTICE V-16-1-10086 System lxsfrac04 (Node '0') is in Regular Membership - Membership: 0x3

2008/06/02 10:29:47 VCS NOTICE V-16-1-10086 System(Node '1') is in Regular Membership - Membership: 0x3

2008/06/02 10:29:47 VCS NOTICE V-16-1-10453 Node: 1 changed name from: '' to: 'lxsfrac03'

2008/06/02 10:29:47 VCS NOTICE V-16-1-10322 System lxsfrac03 (Node '1') changed state from INITING to STALE_ADMIN_WAIT

2008/06/02 10:29:47 VCS NOTICE V-16-1-10322 System lxsfrac04 (Node '0') changed state from STALE_DISCOVER_WAIT to STALE_ADMIN_WAIT

2008/06/02 10:37:01 VCS NOTICE V-16-1-11022 VCS engine (had) started

2008/06/02 10:37:01 VCS NOTICE V-16-1-11050 VCS engine version=4.1

2008/06/02 10:37:01 VCS NOTICE V-16-1-11051 VCS engine join version=4.1001

2008/06/02 10:37:01 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 03/15/06-20:13:00

2008/06/02 10:37:01 VCS NOTICE V-16-1-10114 Opening GAB library

2008/06/02 10:37:04 VCS NOTICE V-16-1-10619 'HAD' starting on: lxsfrac04

2008/06/02 10:37:06 VCS ERROR V-16-1-10624 Local cluster configuration stale

2008/06/02 10:37:06 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms

2008/06/02 10:37:10 VCS INFO V-16-1-10077 Received new cluster membership

2008/06/02 10:37:10 VCS NOTICE V-16-1-10080 System (lxsfrac04) - Membership: 0x1, Jeopardy: 0x2

2008/06/02 10:37:10 VCS NOTICE V-16-1-10086 System lxsfrac04 (Node '0') is in Regular Membership - Membership: 0x1

2008/06/02 10:37:10 VCS NOTICE V-16-1-10322 System lxsfrac04 (Node '0') changed state from STALE_DISCOVER_WAIT to STALE_ADMIN_WAIT

2008/06/02 10:37:20 VCS INFO V-16-1-10077 Received new cluster membership

2008/06/02 10:37:20 VCS NOTICE V-16-1-10080 System (lxsfrac04) - Membership: 0x3, Jeopardy: 0x0

2008/06/02 10:37:20 VCS NOTICE V-16-1-10322 System(Node '1') changed state from UNKNOWN to INITING

2008/06/02 10:37:20 VCS NOTICE V-16-1-10086 System(Node '1') is in Regular Membership - Membership: 0x3

2008/06/02 10:37:20 VCS NOTICE V-16-1-10453 Node: 1 changed name from: '' to: 'lxsfrac03'

2008/06/02 10:37:20 VCS NOTICE V-16-1-10322 System lxsfrac03 (Node '1') changed state from INITING to STALE_DISCOVER_WAIT

2008/06/02 10:37:20 VCS NOTICE V-16-1-10322 System lxsfrac03 (Node '1') changed state from STALE_DISCOVER_WAIT to STALE_ADMIN_WAIT

2008/06/02 10:53:38 VCS ERROR V-16-1-10069 All systems have configuration files marked STALE. Unable to form cluster.

2008/06/02 10:53:38 VCS INFO V-16-1-50135 User root fired command: MSG_CLUSTER_STOP_SYS from localhost

2008/06/02 10:53:38 VCS NOTICE V-16-1-10322 System lxsfrac04 (Node '0') changed state from STALE_ADMIN_WAIT to EXITED

2008/06/02 10:54:49 VCS NOTICE V-16-1-11022 VCS engine (had) started

2008/06/02 10:54:49 VCS NOTICE V-16-1-11050 VCS engine version=4.1

2008/06/02 10:54:49 VCS NOTICE V-16-1-11051 VCS engine join version=4.1001

2008/06/02 10:54:49 VCS NOTICE V-16-1-11052 VCS engine pstamp=4.1 03/15/06-20:13:00

2008/06/02 10:54:49 VCS NOTICE V-16-1-10114 Opening GAB library

2008/06/02 10:54:49 VCS NOTICE V-16-1-10619 'HAD' starting on: lxsfrac04

2008/06/02 10:54:49 VCS INFO V-16-1-10125 GAB timeout set to 15000 ms

2008/06/02 10:54:54 VCS INFO V-16-1-10077 Received new cluster membership

2008/06/02 10:54:54 VCS NOTICE V-16-1-10080 System (lxsfrac04) - Membership: 0x3, Jeopardy: 0x0

2008/06/02 10:54:54 VCS NOTICE V-16-1-10322 System(Node '1') changed state from UNKNOWN to INITING

2008/06/02 10:54:54 VCS NOTICE V-16-1-10086 System lxsfrac04 (Node '0') is in Regular Membership - Membership: 0x3

2008/06/02 10:54:54 VCS NOTICE V-16-1-10086 System(Node '1') is in Regular Membership - Membership: 0x3

2008/06/02 10:54:54 VCS NOTICE V-16-1-10453 Node: 1 changed name from: '' to: 'lxsfrac03'

2008/06/02 10:54:54 VCS NOTICE V-16-1-10322 System lxsfrac03 (Node '1') changed state from INITING to STALE_ADMIN_WAIT

2008/06/02 10:54:54 VCS NOTICE V-16-1-10322 System lxsfrac04 (Node '0') changed state from CURRENT_DISCOVER_WAIT to LOCAL_BUILD

2008/06/02 10:54:54 VCS NOTICE V-16-1-10322 System lxsfrac03 (Node '1') changed state from STALE_ADMIN_WAIT to STALE_PEER_WAIT

2008/06/02 10:54:55 VCS WARNING V-16-1-10030 UseFence=NONE. Hence do not need fencing

2008/06/02 10:54:55 VCS NOTICE V-16-1-10322 System lxsfrac04 (Node '0') changed state from LOCAL_BUILD to RUNNING

2008/06/02 10:54:55 VCS NOTICE V-16-1-10322 System lxsfrac03 (Node '1') changed state from STALE_PEER_WAIT to REMOTE_BUILD

2008/06/02 10:54:55 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/CFSfsckd/CFSfsckdAgent for resource type CFSfsckd successfully started at Mon Jun 2 10:54:55 2008

2008/06/02 10:54:55 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/CVMCluster/CVMClusterAgent for resource type CVMCluster successfully started at Mon Jun 2 10:54:55 2008

2008/06/02 10:54:55 VCS NOTICE V-16-1-10016 Agent /opt/VRTSvcs/bin/CVMVxconfigd/CVMVxconfigdAgent for resource type CVMVxconfigd successfully started at Mon Jun 2 10:54:55 2008

2008/06/02 10:54:55 VCS INFO V-16-1-10463 Sending snapshot to node: 1

2008/06/02 10:54:55 VCS NOTICE V-16-1-10322 System lxsfrac03 (Node '1') changed state from REMOTE_BUILD to RUNNING

2008/06/02 10:54:55 VCS ERROR V-16-10001-1005 (lxsfrac04) CVMCluster:???:monitor:node - state: out of cluster

2008/06/02 10:54:56 VCS INFO V-16-1-10304 Resource cvm_clus (Owner: unknown, Group: cvm) is offline on lxsfrac04 (First probe)

2008/06/02 10:54:56 VCS INFO V-16-1-10304 Resource vxfsckd (Owner: unknown, Group: cvm) is offline on lxsfrac04 (First probe)

2008/06/02 10:54:56 VCS INFO V-16-1-10297 Resource cvm_vxconfigd (Owner: unknown, Group: cvm) is online on lxsfrac04 (First probe)

2008/06/02 10:54:56 VCS NOTICE V-16-1-10438 Group cvm has been probed on system lxsfrac04

2008/06/02 10:54:56 VCS NOTICE V-16-1-10442 Initiating auto-start online of group cvm on system lxsfrac04

2008/06/02 10:54:56 VCS NOTICE V-16-1-10301 Initiating Online of Resource cvm_clus (Owner: unknown, Group: cvm) on System lxsfrac04

2008/06/02 10:54:56 VCS ERROR V-16-10001-1005 (lxsfrac03) CVMCluster:???:monitor:node - state: out of cluster
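The engine log is noisy, and the lines that matter here are the two stale-related errors plus the STALE_* state transitions. A rough filter for them might look like the sketch below. The function names stale_events and all_systems_stale are invented for this example; the message IDs V-16-1-10624 and V-16-1-10069 are simply the ones that appear in the log above, and I have not checked whether other releases use different IDs.

```shell
#!/bin/sh
# Hypothetical helper: reads engine log text on stdin and keeps only the
# lines that mention a stale configuration or a STALE_* system state.
stale_events() {
    grep -E 'V-16-1-(10624|10069)|STALE_[A-Z_]+'
}

# Hypothetical helper: succeeds (exit 0) if the log on stdin contains the
# "All systems have configuration files marked STALE" error.
all_systems_stale() {
    grep -q 'V-16-1-10069'
}
```

For example: stale_events < /var/VRTSvcs/log/engine_A.log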

Check the VCS status:

root@lxsfrac04 # hastatus -sum

-- SYSTEM STATE

-- System               State                Frozen

A  lxsfrac03            STALE_ADMIN_WAIT     0

A  lxsfrac04            STALE_ADMIN_WAIT     0

root@lxsfrac04 # hastatus

attempting to connect....connected

group   resource        system       message

------- --------------- ------------ ----------------------------------------

                        lxsfrac04    STALE ADMIN WAIT: all systems stale

                        lxsfrac03    STALE ADMIN WAIT: all systems stale

^C

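VCS was now in the stale state. I quickly reread the VCS documentation on stale configurations. I did not fully follow it, but the gist is this: while running, VCS keeps a copy of the configuration in memory; if the main.cf on disk may no longer match that in-memory configuration, the configuration is treated as stale and a .stale file is created. In practice, haconf -makerw creates the .stale marker and haconf -dump -makero removes it, so if VCS goes down while the configuration is still writable, the marker is left behind and the node will not build the cluster from its local main.cf.

First I tried switching the configuration back to read-only, which failed: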
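Since the .stale marker is just a file next to main.cf, its presence can be tested directly. Below is a small sketch of such a check. The function name config_marked_stale is made up, and the default directory /etc/VRTSvcs/conf/config is an assumption on my part (it is the usual location of main.cf, but this post never shows the path), so pass the directory explicitly if yours differs.

```shell
#!/bin/sh
# Assumption: the usual VCS configuration directory; not shown in this post.
VCS_CONF_DIR="${VCS_CONF_DIR:-/etc/VRTSvcs/conf/config}"

# Hypothetical helper: succeeds (exit 0) if a .stale marker exists in the
# given directory (or in $VCS_CONF_DIR when no argument is passed).
config_marked_stale() {
    [ -e "${1:-$VCS_CONF_DIR}/.stale" ]
}
```

A typical use would be: if config_marked_stale; then echo "configuration is marked stale"; fi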

root@lxsfrac04 # haconf -dump -makero

VCS WARNING V-16-1-50129 Operation 'haconf -dump -makero' rejected as the node is in STALE_ADMIN_WAIT state

Stop VCS:

root@lxsfrac04 # hastop -all

Delete the .stale file:

root@lxsfrac04 # ls -alrt

total 240

………………………………………………………………………………………

-rw-------   2 root     root         495 Jun  1 23:19 CFSTypes.cf

-rw-------   1 root     root         941 Jun  1 23:19 main.cf

-rw-------   1 root     root           0 Jun  2 09:53 .stale

-rw-------   1 root     root         373 Jun  2 10:03 MultiPrivNIC.cf

-r--r--r--   1 root     sys          366 Jun  2 10:03 PrivNIC.cf_new

-rw-------   1 root     root         395 Jun  2 10:04 PrivNIC.cf

-rw-------   1 root     root        1013 Jun  2 10:28 main.cf_for_privNIC

-rw-------   1 root     root       71618 Jun  2 10:29 main.cmd

drwxr-xr-x   2 root     other       1024 Jun  2 10:37 .

………………………………………………………………………………………………

root@lxsfrac04 # rm -rf .stale

Restart VCS on each node:

root@lxsfrac04 # hastart

root@lxsfrac03 # hastart
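The stop, delete, and restart steps above can be collected into one helper. The sketch below only prints the commands unless RUN=1 is set, because removing .stale tells VCS to trust the main.cf on that node, and that is only safe when you know the file is the configuration you want. The names run and clear_stale_and_restart are invented for this example, and the configuration directory is left to the caller because the post does not show it.

```shell
#!/bin/sh
# Hypothetical wrapper: executes the command only when RUN=1,
# otherwise prints what would have been executed.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical helper mirroring the manual steps in this post.
# $1 = directory that holds main.cf and the .stale marker.
clear_stale_and_restart() {
    conf_dir="$1"
    run hastop -all                 # stop VCS on all nodes
    run rm -f "$conf_dir/.stale"    # drop the stale marker on this node
    run hastart                     # start VCS here; repeat hastart on the other nodes
}
```

Called as clear_stale_and_restart followed by the configuration directory, it prints the three commands for review; setting RUN=1 in the environment makes it execute them.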

root@lxsfrac04 # gabconfig -a

GAB Port Memberships

===============================================================

Port a gen 70a501 membership 01

Port b gen 70a507 membership 01

Port d gen 70a503 membership 01

Port f gen 70a512 membership 01

Port h gen 70a508 membership 01

Port o gen 70a509 membership 01

Port v gen 70a50e membership 01

Port w gen 70a510 membership 01

Check the VCS log again:

lxsfrac04# tail -f /var/VRTSvcs/log/engine_A.log

2008/06/02 10:54:57 VCS INFO V-16-1-10297 Resource cvm_vxconfigd (Owner: unknown, Group: cvm) is online on lxsfrac03 (First probe)

2008/06/02 10:54:57 VCS INFO V-16-1-10304 Resource vxfsckd (Owner: unknown, Group: cvm) is offline on lxsfrac03 (First probe)

2008/06/02 10:54:57 VCS INFO V-16-1-10304 Resource cvm_clus (Owner: unknown, Group: cvm) is offline on lxsfrac03 (First probe)

2008/06/02 10:54:57 VCS NOTICE V-16-1-10438 Group cvm has been probed on system lxsfrac03

2008/06/02 10:54:57 VCS NOTICE V-16-1-10442 Initiating auto-start online of group cvm on system lxsfrac03

2008/06/02 10:54:57 VCS NOTICE V-16-1-10301 Initiating Online of Resource cvm_clus (Owner: unknown, Group: cvm) on System lxsfrac03

2008/06/02 10:55:15 VCS INFO V-16-10001-1003 (lxsfrac03) CVMCluster:cvm_clus:online:CVMCluster role is - mode: enabled: cluster active - MASTER

master: lxsfrac03

2008/06/02 10:55:17 VCS INFO V-16-1-10298 Resource cvm_clus (Owner: unknown, Group: cvm) is online on lxsfrac03 (VCS initiated)

2008/06/02 10:55:17 VCS NOTICE V-16-1-10301 Initiating Online of Resource vxfsckd (Owner: unknown, Group: cvm) on System lxsfrac03

2008/06/02 10:55:19 VCS INFO V-16-1-10298 Resource vxfsckd (Owner: unknown, Group: cvm) is online on lxsfrac03 (VCS initiated)

2008/06/02 10:55:19 VCS NOTICE V-16-1-10447 Group cvm is online on system lxsfrac03

2008/06/02 10:55:19 VCS INFO V-16-10001-15051 (lxsfrac03) triggers:???:nfs_restart:Trigger does not do anything as there is no NFS/NFSLock/Share resource in the group

2008/06/02 10:55:19 VCS INFO V-16-6-15002 (lxsfrac03) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart cvm successfully

2008/06/02 10:55:19 VCS INFO V-16-6-15004 (lxsfrac03) hatrigger:Failed to send trigger for postonline; script doesn't exist

2008/06/02 10:55:35 VCS INFO V-16-10001-1003 (lxsfrac04) CVMCluster:cvm_clus:online:CVMCluster role is - mode: enabled: cluster active - SLAVE master: lxsfrac03

2008/06/02 10:55:37 VCS INFO V-16-1-10298 Resource cvm_clus (Owner: unknown, Group: cvm) is online on lxsfrac04 (VCS initiated)

2008/06/02 10:55:37 VCS NOTICE V-16-1-10301 Initiating Online of Resource vxfsckd (Owner: unknown, Group: cvm) on System lxsfrac04

2008/06/02 10:55:39 VCS INFO V-16-1-10298 Resource vxfsckd (Owner: unknown, Group: cvm) is online on lxsfrac04 (VCS initiated)

2008/06/02 10:55:39 VCS NOTICE V-16-1-10447 Group cvm is online on system lxsfrac04

2008/06/02 10:55:39 VCS INFO V-16-10001-15051 (lxsfrac04) triggers:???:nfs_restart:Trigger does not do anything as there is no NFS/NFSLock/Share resource in the group

2008/06/02 10:55:39 VCS INFO V-16-6-15002 (lxsfrac04) hatrigger:hatrigger executed /opt/VRTSvcs/bin/triggers/nfs_restart cvm successfully

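Summary:

This case taught me how the stale state works and how to resolve it. It was also a reminder to check VCS state with hastatus more often.

Many things work on the same principle. DNS has a text file for each zone (the DNS "database" files), and those files are what we edit when we change the configuration. But what actually takes effect and answers user queries is not the text file itself; it is the content loaded from that file into memory.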
