Linux AS 5.3, 64-bit
Oracle 10.2.0.4, 2 nodes
GFS file system
Node2 rebooted abnormally.
Node2 Linux log:
Feb 4 16:14:46 --- reboot
Feb 4 16:18:57 --- ok
Feb 4 16:14:14 hou249bbodb3112 snmpd[5979]: Received SNMP packet(s) from UDP: [127.0.0.1]:38732
Feb 4 16:14:29 hou249bbodb3112 snmpd[5979]: Connection from UDP: [127.0.0.1]:51532
Feb 4 16:14:29 hou249bbodb3112 snmpd[5979]: Received SNMP packet(s) from UDP: [127.0.0.1]:51532
Feb 4 16:14:30 hou249bbodb3112 snmpd[5979]: Connection from UDP: [127.0.0.1]:51532
Feb 4 16:14:46 hou249bbodb3112 snmpd[5979]: Connection from UDP: [127.0.0.1]:34969
Feb 4 16:14:46 hou249bbodb3112 snmpd[5979]: Received SNMP packet(s) from UDP: [127.0.0.1]:34969
Feb 4 16:14:46 hou249bbodb3112 snmpd[5979]: Connection from UDP: [10.13.8.110]:1048
Feb 4 16:14:46 hou249bbodb3112 snmpd[5979]: Received SNMP packet(s) from UDP: [10.13.8.110]:1048
Feb 4 16:18:57 hou249bbodb3112 syslogd 1.4.1: restart.
Feb 4 16:18:57 hou249bbodb3112 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Feb 4 16:18:57 hou249bbodb3112 kernel: Bootdata ok (command line is ro root=/dev/VolGroup00/LogVol00 rhgb quiet)
Feb 4 16:18:57 hou249bbodb3112 kernel: Linux version 2.6.18-128.1.16.el5xen ( mockbuild@hs20-bc1-2.build.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)) #1 SMP Fri Jun 26 11:10:46 EDT 2009
Feb 4 16:18:57 hou249bbodb3112 kernel: BIOS-provided physical RAM map:
Feb 4 16:18:57 hou249bbodb3112 kernel: Xen: 0000000000000000 - 00000003d7724000 (usable)
Feb 4 16:18:57 hou249bbodb3112 kernel: DMI 2.5 present.
Feb 4 16:18:57 hou249bbodb3112 kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Feb 4 16:18:57 hou249bbodb3112 kernel: ACPI: LAPIC (acpi_id[0x08] lapic_id[0x08] enabled)
Feb 4 16:18:57 hou249bbodb3112 kernel: ACPI: LAPIC (acpi_id[0x10] lapic_id[0x10] enabled)
Feb 4 16:18:57 hou249bbodb3112 kernel: ACPI: LAPIC (acpi_id[0x18] lapic_id[0x18] enabled)
Feb 4 16:18:57 hou249bbodb3112 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
Feb 4 16:18:57 hou249bbodb3112 kernel: ACPI: LAPIC (acpi_id[0x09] lapic_id[0x09] enabled)
Feb 4 16:18:57 hou249bbodb3112 kernel: ACPI: LAPIC (acpi_id[0x11] lapic_id[0x11] enabled)
CRS alert log (posted as Node2; the entries and log paths name hou249bbodb3111):
2010-01-25 13:12:27.741
[crsd(10609)]CRS-1012:The OCR service started on node hou249bbodb3111.
2010-01-25 13:12:28.464
[evmd(10607)]CRS-1401:EVMD started on node hou249bbodb3111.
2010-01-25 13:12:29.902
[crsd(10609)]CRS-1201:CRSD started on node hou249bbodb3111.
2010-01-25 14:10:32.055
[cssd(11203)]CRS-1612:node hou249bbodb3112 (2) at 50% heartbeat fatal, eviction in 14.078 seconds
2010-01-25 14:10:33.051
[cssd(11203)]CRS-1612:node hou249bbodb3112 (2) at 50% heartbeat fatal, eviction in 13.088 seconds
2010-01-25 14:15:01.135
[cssd(10708)]CRS-1605:CSSD voting file is online: /dev/sdc. Details in /u01/app/oracle/product/crs/log/hou249bbodb3111/cssd/ocssd.log.
2010-01-25 14:15:01.137
[cssd(10708)]CRS-1605:CSSD voting file is online: /dev/sdd. Details in /u01/app/oracle/product/crs/log/hou249bbodb3111/cssd/ocssd.log.
2010-01-25 14:15:01.169
[cssd(10708)]CRS-1605:CSSD voting file is online: /dev/sdg. Details in /u01/app/oracle/product/crs/log/hou249bbodb3111/cssd/ocssd.log.
[cssd(10708)]CRS-1601:CSSD Reconfiguration complete. Active nodes are hou249bbodb3111 hou249bbodb3112 .
2010-01-25 14:15:07.842
[crsd(10106)]CRS-1005:The OCR upgrade was completed. Version has changed from 185599488 to 185599488. Details in /u01/app/oracle/product/crs/log/hou249bbodb3111/crsd/crsd.log.
2010-01-25 14:15:07.843
[crsd(10106)]CRS-1012:The OCR service started on node hou249bbodb3111.
2010-01-25 14:15:08.430
[evmd(10057)]CRS-1401:EVMD started on node hou249bbodb3111.
2010-01-25 14:15:12.687
[crsd(10106)]CRS-1201:CRSD started on node hou249bbodb3111.
2010-02-04 16:15:11.137
[cssd(10708)]CRS-1612:node hou249bbodb3112 (2) at 50% heartbeat fatal, eviction in 14.102 seconds
2010-02-04 16:15:12.234
[cssd(10708)]CRS-1612:node hou249bbodb3112 (2) at 50% heartbeat fatal, eviction in 13.102 seconds
2010-02-04 16:15:19.129
[cssd(10708)]CRS-1611:node hou249bbodb3112 (2) at 75% heartbeat fatal, eviction in 6.102 seconds
2010-02-04 16:15:23.129
[cssd(10708)]CRS-1610:node hou249bbodb3112 (2) at 90% heartbeat fatal, eviction in 2.102 seconds
2010-02-04 16:15:24.125
[cssd(10708)]CRS-1610:node hou249bbodb3112 (2) at 90% heartbeat fatal, eviction in 1.112 seconds
2010-02-04 16:15:25.129
[cssd(10708)]CRS-1610:node hou249bbodb3112 (2) at 90% heartbeat fatal, eviction in 0.102 seconds
2010-02-04 16:15:26.006
[cssd(10708)]CRS-1607:CSSD evicting node hou249bbodb3112. Details in /u01/app/oracle/product/crs/log/hou249bbodb3111/cssd/ocssd.log.
[cssd(10708)]CRS-1601:CSSD Reconfiguration complete. Active nodes are hou249bbodb3111 .
2010-02-04 16:15:30.531
[crsd(10106)]CRS-1204:Recovering CRS resources for node hou249bbodb3112.
[cssd(10708)]CRS-1601:CSSD Reconfiguration complete. Active nodes are hou249bbodb3111 hou249bbodb3112 .
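To line the CSS heartbeat warnings up against the OS logs, it can help to turn each CRS-1612/1611/1610 entry into a projected eviction time. The following is only a rough sketch, assuming the alert log keeps the two-line layout shown above (a timestamp line followed by the message line); the function name `heartbeat_warnings` and the `sample` text are made up for illustration and are not part of any Oracle tool.

```python
import re
from datetime import datetime, timedelta

# Matches the CRS-1612 / CRS-1611 / CRS-1610 heartbeat warnings seen in the log above.
HEARTBEAT_RE = re.compile(
    r"CRS-16(?:10|11|12):node (\S+) \((\d+)\) at (\d+)% heartbeat fatal, "
    r"eviction in ([\d.]+) seconds"
)
# A line that holds only a timestamp, e.g. "2010-02-04 16:15:11.137".
TS_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}$")


def heartbeat_warnings(lines):
    """Yield (logged_at, node, percent, projected_eviction) per warning."""
    logged_at = None
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            # Remember the timestamp; it applies to the message line(s) after it.
            logged_at = datetime.strptime(line, "%Y-%m-%d %H:%M:%S.%f")
            continue
        match = HEARTBEAT_RE.search(line)
        if match and logged_at is not None:
            node, _node_num, percent, seconds = match.groups()
            projected = logged_at + timedelta(seconds=float(seconds))
            yield logged_at, node, int(percent), projected


# Two entries copied from the alert log above (hypothetical variable name).
sample = """\
2010-02-04 16:15:11.137
[cssd(10708)]CRS-1612:node hou249bbodb3112 (2) at 50% heartbeat fatal, eviction in 14.102 seconds
2010-02-04 16:15:25.129
[cssd(10708)]CRS-1610:node hou249bbodb3112 (2) at 90% heartbeat fatal, eviction in 0.102 seconds
"""

rows = list(heartbeat_warnings(sample.splitlines()))
for logged_at, node, percent, projected in rows:
    print(logged_at.time(), node, f"{percent}%", "projected eviction", projected.time())
```

If the projected times stay close to each other across the warnings, CSS never saw another heartbeat from that node after the countdown began.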
node1 crsd log :
hou249bbodb3111$vi crsd.log
2010-01-25 14:16:35.867: [ CRSRES][1504274752] startRunnable: setting CLI values
2010-01-25 14:16:36.108: [ CRSRES][1504274752] Attempting to start `ora.hou249bbodb3111.gsd` on member `hou249bbodb3111`
2010-01-25 14:16:36.473: [ CRSRES][1537845568] Attempting to start `ora.wmb2bprd.db` on member `hou249bbodb3112`
2010-01-25 14:16:37.146: [ CRSRES][1504274752] Start of `ora.hou249bbodb3111.gsd` on member `hou249bbodb3111` succeeded.
2010-01-25 14:16:37.420: [ CRSRES][1537845568] Start of `ora.wmb2bprd.db` on member `hou249bbodb3112` succeeded.
2010-02-04 16:15:26.098: [ CRSCOMM][1537845568] CLEANUP: Searching for connections to failed node hou249bbodb3112
2010-02-04 16:15:26.098: [ CRSEVT][1537845568] Processing member leave for hou249bbodb3112, incarnation: 145375564
2010-02-04 16:15:26.099: [ CRSD][1537845568] SM: recovery in process: 8
2010-02-04 16:15:26.099: [ CRSEVT][1537845568] Do failover for: hou249bbodb3112
2010-02-04 16:15:26.857: [ CRSRES][1537845568] startup = 0
2010-02-04 16:15:26.881: [ CRSRES][1537845568] startup = 0
2010-02-04 16:15:26.896: [ CRSRES][1537845568] startup = 0
2010-02-04 16:15:26.914: [ CRSRES][1537845568] startup = 0
2010-02-04 16:15:26.926: [ CRSRES][1537845568] startup = 0
2010-02-04 16:15:26.946: [ CRSRES][1537845568] startup = 0
2010-02-04 16:15:27.029: [ CRSRES][1087633728] startRunnable: setting CLI values
2010-02-04 16:15:27.045: [ CRSRES][1087633728] Attempting to start `ora.hou249bbodb3112.vip` on member `hou249bbodb3111`
2010-02-04 16:15:27.071: [ CRSRES][1504274752] startRunnable: setting CLI values
2010-02-04 16:15:27.123: [ CRSRES][1504274752] Attempting to start `ora.wmb2bprd.db` on member `hou249bbodb3111`
2010-02-04 16:15:27.276: [ CRSRES][1504274752] Start of `ora.wmb2bprd.db` on member `hou249bbodb3111` succeeded.
2010-02-04 16:15:30.518: [ CRSRES][1087633728] Start of `ora.hou249bbodb3112.vip` on member `hou249bbodb3111` succeeded.
2010-02-04 16:15:30.531: [ CRSEVT][1537845568] Post recovery done evmd event for: hou249bbodb3112
2010-02-04 16:15:30.532: [ CRSD][1537845568] SM: recoveryDone: 0
2010-02-04 16:15:30.537: [ CRSEVT][1537845568] Processing RecoveryDone
2010-02-04 16:19:52.049: [ OCRUTL][1283971392]u_freem: mem passed is null
2010-02-04 16:19:54.405: [ CRSD][1094658368] SM: rE2Ec: 4
2010-02-04 16:19:54.406: [ CRSRES][1537845568] StopResource: setting CLI values
2010-02-04 16:19:54.869: [ CRSD][1537845568] SM:dE2Ec: all E2E cmds done. 0
"crsd.log" 8463L, 600108C
Node1 Linux Log :
Feb 4 16:14:46 hou249bbodb3111 snmpd[5985]: Connection from UDP: [10.13.8.110]:1048
Feb 4 16:14:46 hou249bbodb3111 snmpd[5985]: Received SNMP packet(s) from UDP: [10.13.8.110]:1048
Feb 4 16:14:57 hou249bbodb3111 kernel: qla2xxx 0000:0d:00.1: LIP reset occured (f7f7).
Feb 4 16:14:57 hou249bbodb3111 kernel: qla2xxx 0000:0d:00.1: LIP occured (f7f7).
Feb 4 16:14:57 hou249bbodb3111 kernel: qla2xxx 0000:0d:00.0: LIP reset occured (f7f7).
Feb 4 16:14:57 hou249bbodb3111 kernel: qla2xxx 0000:0d:00.0: LIP occured (f7f7).
Feb 4 16:15:06 hou249bbodb3111 openais[5501]: [TOTEM] The token was lost in the OPERATIONAL state.
Feb 4 16:15:06 hou249bbodb3111 openais[5501]: [TOTEM] Receive multicast socket recv buffer size (288000 bytes).
Feb 4 16:15:06 hou249bbodb3111 openais[5501]: [TOTEM] Transmit multicast socket send buffer size (288000 bytes).
Feb 4 16:15:06 hou249bbodb3111 openais[5501]: [TOTEM] entering GATHER state from 2.
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] entering GATHER state from 0.
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] Creating commit token because I am the rep.
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] Saving state aru 16aa35 high seq received 16aa35
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] Storing new sequence id for ring ac
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] entering COMMIT state.
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] entering RECOVERY state.
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] position [0] member 172.16.223.111:
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] previous ring seq 168 rep 172.16.223.111
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] aru 16aa35 high delivered 16aa35 received flag 1
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] Did not need to originate any messages in recovery.
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [TOTEM] Sending initial ORF token
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [CLM ] CLM CONFIGURATION CHANGE
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [CLM ] New Configuration:
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [CLM ] r(0) ip(172.16.223.111)
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [CLM ] Members Left:
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [CLM ] r(0) ip(172.16.223.112)
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [CLM ] Members Joined:
Feb 4 16:15:11 hou249bbodb3111 openais[5501]: [CLM ] CLM CONFIGURATION CHANGE
Feb 4 16:15:11 hou249bbodb3111 kernel: dlm: closing connection to node 2
Feb 4 16:15:12 hou249bbodb3111 openais[5501]: [CLM ] New Configuration:
Feb 4 16:15:12 hou249bbodb3111 openais[5501]: [CLM ] r(0) ip(172.16.223.111)
Feb 4 16:15:13 hou249bbodb3111 openais[5501]: [CLM ] Members Left:
Feb 4 16:15:13 hou249bbodb3111 fenced[5520]: hou249bbodb3112priv not a cluster member after 1 sec post_fail_delay
Feb 4 16:15:14 hou249bbodb3111 fenced[5520]: fencing node "hou249bbodb3112priv"
Feb 4 16:15:14 hou249bbodb3111 openais[5501]: [CLM ] Members Joined:
Feb 4 16:15:15 hou249bbodb3111 openais[5501]: [SYNC ] This node is within the primary component and will provide service.
Feb 4 16:15:15 hou249bbodb3111 openais[5501]: [TOTEM] entering OPERATIONAL state.
Feb 4 16:15:15 hou249bbodb3111 openais[5501]: [CLM ] got nodejoin message 172.16.223.111
Feb 4 16:15:15 hou249bbodb3111 openais[5501]: [CPG ] got joinlist message from node 1
Feb 4 16:15:28 hou249bbodb3111 fenced[5520]: fence "hou249bbodb3112priv" success
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: Trying to acquire journal lock...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: Trying to acquire journal lock...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: Trying to acquire journal lock...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: Looking at journal...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: Looking at journal...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: Looking at journal...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: Acquiring the transaction lock...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: Acquiring the transaction lock...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: Acquiring the transaction lock...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: Replaying journal...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: Replayed 0 of 2 blocks
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: replays = 0, skips = 0, sames = 2
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: Replaying journal...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: Journal replayed in 1s
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata1.1: jid=0: Done
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: Replaying journal...
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: Replayed 0 of 1 blocks
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: replays = 0, skips = 1, sames = 0
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: Replayed 0 of 38 blocks
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: replays = 0, skips = 12, sames = 26
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: Journal replayed in 1s
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata.1: jid=0: Done
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: Journal replayed in 1s
Feb 4 16:15:28 hou249bbodb3111 kernel: GFS: fsid=b2bgfs_cluster:gfs-b2bdata2.1: jid=0: Done
Feb 4 16:15:30 hou249bbodb3111 avahi-daemon[8110]: Registering new address record for 10.18.223.117 on bond1.
Feb 4 16:15:30 hou249bbodb3111 avahi-daemon[8110]: Withdrawing address record for 10.18.223.117 on bond1.
Feb 4 16:15:30 hou249bbodb3111 avahi-daemon[8110]: Registering new address record for 10.18.223.117 on bond1.
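For reading the Node1 syslog more quickly, the sketch below pulls out only the lines that mark a state change (FC loop reset, totem token loss, fencing start, fencing success) and gives the elapsed time between the first and the last of them. It is a minimal illustration, not a diagnosis: the tag names, the `key_events` function and the `year=2010` default are assumptions added here, since syslog timestamps carry no year.

```python
import re
from datetime import datetime

# "Feb 4 16:15:06 host program[pid]: message"
SYSLOG_RE = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (\S+) ([^:]+): (.*)$"
)

# (pattern, tag) pairs; the tags are hypothetical labels chosen for this sketch.
MARKERS = [
    (re.compile(r"LIP reset occured"), "fc_lip_reset"),  # spelling as logged by the driver
    (re.compile(r"token was lost"), "totem_token_lost"),
    (re.compile(r"fencing node"), "fence_start"),
    (re.compile(r'fence ".*" success'), "fence_success"),
]


def key_events(lines, year=2010):
    """Yield (timestamp, host, tag) for lines matching one of MARKERS."""
    for raw in lines:
        match = SYSLOG_RE.match(raw.strip())
        if not match:
            continue
        stamp, host, _program, message = match.groups()
        # The year is supplied by the caller because syslog omits it.
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        for pattern, tag in MARKERS:
            if pattern.search(message):
                yield when, host, tag
                break


# Lines copied from the Node1 syslog above (hypothetical variable name).
sample = """\
Feb 4 16:14:57 hou249bbodb3111 kernel: qla2xxx 0000:0d:00.1: LIP reset occured (f7f7).
Feb 4 16:15:06 hou249bbodb3111 openais[5501]: [TOTEM] The token was lost in the OPERATIONAL state.
Feb 4 16:15:14 hou249bbodb3111 fenced[5520]: fencing node "hou249bbodb3112priv"
Feb 4 16:15:28 hou249bbodb3111 fenced[5520]: fence "hou249bbodb3112priv" success
"""

events = list(key_events(sample.splitlines()))
for when, host, tag in events:
    print(when.time(), host, tag)
elapsed = (events[-1][0] - events[0][0]).total_seconds()
print("seconds from first to last event:", elapsed)
```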
[ Last edited by tolywang on 2010-2-5 13:59 ]
From the ITPUB blog, link: http://blog.itpub.net/35489/viewspace-626928/. If you repost this, please credit the source; otherwise legal liability will be pursued.