2.4 Configuring Cluster Resources
In a RAC configuration only one resource group is required. This resource group is a concurrent group for the shared volume group. The following are the steps to add a concurrent resource group for a shared volume group:
First there needs to be a volume group that is shared between the nodes.
SHARED LOGICAL VOLUME MANAGER, SHARED CONCURRENT DISKS (NO VSD)
The two instances of the same cluster database have concurrent access to the same external disks. This is true concurrent access, not shared access as in a VSD environment. Because several instances access the same files and data at the same time, locks have to be managed. These locks, at the CLVM layer (including the memory cache), are managed by HACMP.
1) Check if the target disks are physically linked to the two machines of the cluster, and seen by both.
Type the lspv command on both machines.
Note: the hdisk number can differ between nodes, depending on each node's disk configuration. Use the second field of the lspv output (the PVID) to be sure you are dealing with the same physical disk from both hosts. Although inconsistent hdisk numbers may not be a problem, IBM suggests using ghost disks to ensure that hdisk numbers match between the nodes. Contact IBM for further information on this topic.
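The PVID comparison described in the note can be sketched as a small shell function. This is only an illustration: the function name common_pvids and the file names are assumptions, and it expects the lspv output of each node to have been saved to a file first (for example with "lspv > /tmp/lspv.node1").

```shell
# common_pvids FILE_A FILE_B   (hypothetical helper, not part of AIX or HACMP)
# FILE_A and FILE_B each hold the saved `lspv` output of one node.
# Prints "PVID hdisk_on_A hdisk_on_B" for every disk that both nodes see.
common_pvids() {
    awk '
        $2 == "none" { next }               # skip disks that have no PVID
        NR == FNR { a[$2] = $1; next }      # first file: remember PVID -> hdisk
        ($2 in a) { print $2, a[$2], $1 }   # second file: report shared PVIDs
    ' "$1" "$2"
}
```

A line such as "00041486eb90ebb7 hdisk5 hdisk7" would indicate that the same physical disk is hdisk5 on the first node and hdisk7 on the second.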
2.4.1 Create volume groups to be shared concurrently on one node
# smit vg
Select "Add a Volume Group".
Add a Volume Group
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
  VOLUME GROUP name                              [oracle_vg]
  Physical partition SIZE in megabytes            32          +
* PHYSICAL VOLUME names                          [hdisk5]     +
  Activate volume group AUTOMATICALLY             no          +
    at system restart?
  Volume Group MAJOR NUMBER                      [57]         +#
  Create VG Concurrent Capable?                   yes         +
  Auto-varyon in Concurrent Mode?                 no          +
The "PHYSICAL VOLUME names" must be physical disks that are shared between the nodes. The volume group should not be activated automatically at system startup, because HACMP activates it. Likewise, "Auto-varyon in Concurrent Mode?" should be set to "no", because HACMP varies the volume group on in concurrent mode.
You must choose the major number explicitly so that the volume group has the same major number on all the nodes. Before choosing this number, make sure it is free on all the nodes.
To check all defined major numbers, type:
% ls -al /dev/*
crw-rw---- 1 root system 57, 0 Aug 02 13:39 /dev/oracle_vg
The major number for the oracle_vg volume group is 57. Ensure that 57 is available on all the other nodes and is not used by another device. If it is free, use the same number on all nodes.
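As a rough sketch of the check above, the following shell functions extract the major numbers already in use from an "ls -al /dev/*" listing and test whether a candidate number is free. The function names used_majors and major_is_free are assumptions made for this illustration; both read the listing on standard input.

```shell
# used_majors: read `ls -al /dev/*` style output on stdin and print the
# distinct major numbers of all character (c) and block (b) devices.
used_majors() {
    awk '$1 ~ /^[cb]/ { sub(/,$/, "", $5); print $5 }' | sort -n -u
}

# major_is_free NUMBER: succeed if NUMBER does not appear as a major number
# in the listing supplied on stdin.
major_is_free() {
    ! used_majors | grep -x "$1" >/dev/null
}

# Usage on each node (hypothetical):
#   ls -al /dev/* | major_is_free 57 && echo "57 is free on this node"
```

The check has to be repeated on every node. AIX also provides the lvlstmajor command, which lists the major numbers that are still available on a node.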
On this volume group, create all the logical volumes and file systems you need for the cluster database.
2.4.2 Create Shared RAW Logical Volumes if not using GPFS. See section 2.4.6 for details about GPFS.
mklv -y'db_name_cntrl1_110m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_cntrl2_110m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_system_400m ' -w'n' -s'n' -r'n' usupport_vg 13 hdisk5
mklv -y'db_name_users_120m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_drsys_90m ' -w'n' -s'n' -r'n' usupport_vg 3 hdisk5
mklv -y'db_name_tools_12m ' -w'n' -s'n' -r'n' usupport_vg 1 hdisk5
mklv -y'db_name_temp_100m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_undotbs1_312m ' -w'n' -s'n' -r'n' usupport_vg 10 hdisk5
mklv -y'db_name_undotbs2_312m ' -w'n' -s'n' -r'n' usupport_vg 10 hdisk5
mklv -y'db_name_log11_120m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_log12_120m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_log21_120m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_log22_120m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_indx_70m ' -w'n' -s'n' -r'n' usupport_vg 3 hdisk5
mklv -y'db_name_cwmlite_100m' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
mklv -y'db_name_example_160m ' -w'n' -s'n' -r'n' usupport_vg 5 hdisk5
mklv -y'db_name_oemrepo_20m ' -w'n' -s'n' -r'n' usupport_vg 1 hdisk5
mklv -y'db_name_spfile_5m ' -w'n' -s'n' -r'n' usupport_vg 1 hdisk5
mklv -y'db_name_srvmconf_100m ' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5
Substitute your database name for the "db_name" value, and the name of your shared volume group for "usupport_vg" (the volume group created in section 2.4.1 is "oracle_vg"). When the volume group was created, a partition size of 32 megabytes was used. The seventh field is the number of partitions that make up the logical volume, so, for example, if "db_name_cntrl1_110m" needs to be 110 megabytes, 4 partitions are needed (4 x 32 = 128 megabytes).
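The partition count is the required size divided by the physical partition size, rounded up. A minimal sketch of that arithmetic follows; the function name pp_count is an assumption, and the default of 32 megabytes matches the partition size used in this example.

```shell
# pp_count SIZE_MB [PP_SIZE_MB]   (hypothetical helper)
# Print the number of physical partitions needed to hold SIZE_MB megabytes,
# rounding up to a whole partition. PP_SIZE_MB defaults to 32.
pp_count() {
    size_mb=$1
    pp_mb=${2:-32}
    echo $(( ($size_mb + $pp_mb - 1) / $pp_mb ))
}

pp_count 110    # prints 4  (control file)
pp_count 400    # prints 13 (system tablespace)
pp_count 312    # prints 10 (undo tablespace)
```

These results match the partition counts used in the mklv commands above.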
The raw logical volumes are created in the "/dev" directory, and it is the character devices that will be used. The command "mklv -y'db_name_cntrl1_110m' -w'n' -s'n' -r'n' usupport_vg 4 hdisk5" creates two device files, a block device and a character device (the one prefixed with "r"):
/dev/db_name_cntrl1_110m
/dev/rdb_name_cntrl1_110m
Change the ownership of the character devices so that the software owner owns them:
# chown oracle:dba /dev/rdb_name*
2.4.3 Import the Volume Group on to the Other Nodes
Use "importvg" to import the oracle_vg volume group on all of the other nodes.
On the first machine, type:
% varyoffvg oracle_vg
On the other nodes, import the definition of the volume group using "smit vg":
Select "Import a Volume Group".
Import a Volume Group
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
  VOLUME GROUP name                              [oracle_vg]
* PHYSICAL VOLUME name                           [hdisk5]     +
  Volume Group MAJOR NUMBER                      [57]         +#
  Make this VG Concurrent Capable?                no          +
  Make default varyon of VG Concurrent?           no          +
It is possible that the physical volume name (hdisk) is different on each node. Check the PVID of the disk using "lspv", and be sure to pick the hdisk that has the same PVID as the disk used to create the volume group on the first node. Also make sure the same major number is used; this number must not already be in use on any of the nodes. The "Make default varyon of VG Concurrent?" option should be set to "no". The volume group was created concurrent capable, so the option "Make this VG Concurrent Capable?" can be left at "no". After varying the volume group off on the node where it was originally created, the equivalent command lines for importing it are:
% importvg -V<major #> -y <vgname> hdisk#
% chvg -an <vgname>
% varyoffvg <vgname>
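Because the hdisk name may differ on each node, the local name can be looked up by PVID before building the import commands. The sketch below only prints the commands instead of running them; the function names hdisk_for_pvid and import_cmds are assumptions, and the local lspv output is expected on standard input.

```shell
# hdisk_for_pvid PVID: read `lspv` output on stdin and print the name of the
# local hdisk that carries PVID (hypothetical helper).
hdisk_for_pvid() {
    awk -v pvid="$1" '$2 == pvid { print $1; exit }'
}

# import_cmds VG MAJOR PVID: read local `lspv` output on stdin and print
# (not execute) the import sequence for this node.
import_cmds() {
    disk=$(hdisk_for_pvid "$3")
    if [ -z "$disk" ]; then
        echo "PVID $3 not found on this node" >&2
        return 1
    fi
    echo "importvg -V $2 -y $1 $disk"   # same major number on every node
    echo "chvg -an $1"                  # no automatic varyon at restart
    echo "varyoffvg $1"                 # leave activation to HACMP
}

# Usage on each of the other nodes (hypothetical):
#   lspv | import_cmds oracle_vg 57 00041486eb90ebb7
```

Printing the commands first makes it possible to review the hdisk name that was selected before anything is changed on the node.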
After importing the volume group onto each node be sure to change the ownership of the character devices to the software owner:
# chown oracle:dba /dev/rdb_name*
2.4.4 Add a Concurrent Cluster Resource Group
The shared resource in this example is "oracle_vg". To create the concurrent resource group that will manage "oracle_vg" do the following:
Smit HACMP -> Cluster Configuration -> Cluster Resources -> Define Resource Groups -> Add a Resource Group
FastPath:
# smit cm_add_grp
Add a Resource Group
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Resource Group Name                            [shared_vg]
* Node Relationship                               concurrent     +
* Participating Node Names                       [node1 node2]   +
The "Resource Group Name" is arbitrary and is used when selecting the resource group for configuration. Because we are configuring a shared resource, the "Node Relationship" is "concurrent", meaning a group of nodes that will share the resource. "Participating Node Names" is a space-separated list of the nodes that will be sharing the resource.
2.4.5 Configure the Concurrent Cluster Resource Group
Once the resource group is added it can then be configured with:
Smit HACMP -> Cluster Configuration -> Cluster Resources -> Change/Show Resources for a Resource Group
FastPath:
# smit cm_cfg_res.select
Configure Resources for a Resource Group
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[TOP]                                            [Entry Fields]
  Resource Group Name                             concurrent_group
  Node Relationship                               concurrent
  Participating Node Names                        opcbaix1 opcbaix2
  Service IP label                               []                   +
  Filesystems                                    []                   +
  Filesystems Consistency Check                   fsck                +
  Filesystems Recovery Method                     sequential          +
  Filesystems to Export                          []                   +
  Filesystems to NFS mount                       []                   +
  Volume Groups                                  []                   +
  Concurrent Volume groups                       [oracle_vg]          +
  Raw Disk PVIDs                                 [00041486eb90ebb7]   +
  AIX Connections Service                        []                   +
  AIX Fast Connect Services                      []                   +
  Application Servers                            []                   +
  Highly Available Communication Links           []                   +
  Miscellaneous Data                             []
  Inactive Takeover Activated                     false               +
  9333 Disk Fencing Activated                     false               +
  SSA Disk Fencing Activated                      false               +
  Filesystems mounted before IP configured        false               +
[BOTTOM]
Note that the settings for "Resource Group Name", "Node Relationship" and "Participating Node Names" come from the data entered in the previous menu. "Concurrent Volume groups" needs to be a pre-created volume group on shared storage. The "Raw Disk PVIDs" are the physical volume IDs of each of the disks that make up the "Concurrent Volume groups". Note that a resource group can manage multiple concurrent resources. In that case, separate the volume group names with spaces; the "Raw Disk PVIDs" will then be a space-delimited list of all the physical volume IDs that make up the listed concurrent volume groups. Alternatively, each volume group can be configured in its own concurrent resource group.
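The space-delimited PVID list for the "Raw Disk PVIDs" field can be derived from lspv output, whose third column names the volume group of each disk. The following is a sketch under that assumption; the function name pvids_for_vg is hypothetical, and the listing is read from standard input.

```shell
# pvids_for_vg VG: read `lspv` output on stdin and print the PVIDs of all
# disks that belong to volume group VG, separated by single spaces.
pvids_for_vg() {
    awk -v vg="$1" '
        $3 == vg { printf "%s%s", sep, $2; sep = " " }
        END      { print "" }
    '
}

# Usage (hypothetical), on a node where the volume group has been
# created or imported:
#   lspv | pvids_for_vg oracle_vg
```

For several concurrent volume groups in one resource group, the function can be run once per volume group and the results joined with spaces.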
2.4.6 Creating Parallel Filesystems (GPFS)
With AIX 5.1 (5L) you can also place your files on GPFS (raw logical volumes are not a requirement of GPFS). In this case, create a GPFS file system capable of holding all required database files, control files and log files.
2.5 Synchronizing the Cluster Resources
After configuring the resource group a resource synchronization is needed.
Smit HACMP -> Cluster Configuration -> Cluster Resources -> Synchronize Cluster Resources
FastPath:
# smit clsyncnode.dialog
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[TOP]                                            [Entry Fields]
  Ignore Cluster Verification Errors?            [No]       +
  Un/Configure Cluster Resources?                [Yes]      +
* Emulate or Actual?                             [Actual]   +
Note:
Only the local node's default configuration files
keep the changes you make for resource DARE
emulation. Once you run your emulation, to
restore the original configuration rather than
running an actual DARE, run the SMIT command,
"Restore System Default Configuration from Active
Configuration."
We recommend that you make a snapshot before
running an emulation, just in case uncontrolled
cluster events happen during emulation.
[BOTTOM]
Just keep the defaults.
2.6 Joining Nodes Into the Cluster
After the cluster topology and resources are configured, the nodes can join the cluster. It is important to start one node at a time unless using C-SPOC (Cluster Single Point of Control). For more information on using C-SPOC, consult IBM's HACMP-specific documentation. The use of C-SPOC is not covered in this document.
Start cluster services by doing the following:
Smit HACMP -> Cluster Services -> Start Cluster Services
FastPath:
# smit clstart.dialog
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Start now, on system restart or both            now       +
  BROADCAST message at startup?                   false     +
  Startup Cluster Lock Services?                  false     +
  Startup Cluster Information Daemon?             true      +
Setting "Start now, on system restart or both" to "now" starts the HACMP daemons immediately. "restart" updates "/etc/inittab" with an entry to start the daemons at reboot, and "both" does both: it updates "/etc/inittab" and starts the daemons immediately. "BROADCAST message at startup?" can be either "true" or "false". If it is set to "true", a wall-type message is displayed when the node joins the cluster. "Startup Cluster Lock Services?" should be set to "false" for a RAC configuration. Setting this parameter to "true" will not prevent the cluster from working, but the added daemon is not used. If "clstat" is going to be used to monitor the cluster, "Startup Cluster Information Daemon?" needs to be set to "true".
View the "/tmp/hacmp.out" file for startup messages. When you see something similar to the following it is safe to start the cluster services on the other nodes:
May 23 09:31:43 EVENT COMPLETED: node_up_complete node1
When joining nodes into the cluster the other nodes will report a successful join in their "/tmp/hacmp.out" files:
May 23 09:34:11 EVENT COMPLETED: node_up_complete node1
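Checking the log for the completion message can be scripted. The sketch below assumes the message format shown above; the function name node_is_up is hypothetical, and the log path defaults to "/tmp/hacmp.out".

```shell
# node_is_up NODE [LOGFILE]   (hypothetical helper)
# Succeed if LOGFILE contains an "EVENT COMPLETED: node_up_complete NODE"
# line. The node name is matched up to the end of the line so that node1
# does not also match node10.
node_is_up() {
    node=$1
    log=${2:-/tmp/hacmp.out}
    grep "EVENT COMPLETED: node_up_complete $node *\$" "$log" >/dev/null 2>&1
}

# Usage (hypothetical): wait until node1 has joined before starting node2.
#   until node_is_up node1; do sleep 10; done
```

Note that the log accumulates entries over time, so a match may come from an earlier start; comparing the timestamp of the matching line with the current start is left to the administrator.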
2.7 Basic Cluster Administration
The "/tmp/hacmp.out" file is the best place to look for cluster information. "clstat" can also be used to verify cluster health. The "clstat" program can take a while to update with the latest cluster information and at times does not work at all. Also, "Startup Cluster Information Daemon?" must have been set to "true" when cluster services were started. Use the following command to start "clstat":
# /usr/es/sbin/cluster/clstat
clstat - HACMP for AIX Cluster Status Monitor
---------------------------------------------
Cluster: cluster1 (0)          Tue Jul 2 08:38:06 EDT 2002
State: UP                      Nodes: 2
SubState: STABLE
    Node: node1                State: UP
       Interface: node1 (0)    Address: 192.168.0.1
                               State: UP
    Node: node2                State: UP
       Interface: node2 (0)    Address: 192.168.0.2
                               State: UP
One other way to check the cluster status is by querying the "snmpd" daemon with "snmpinfo":
# /usr/sbin/snmpinfo -m get -o /usr/es/sbin/cluster/hacmp.defs -v ClusterSubstate.0
This should return "32":
clusterSubState.0 = 32
If other values are returned from any node consult your IBM HACMP documentation or contact IBM support.
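The comparison with the expected value can be sketched as a shell function that reads the snmpinfo output on standard input. The function name substate_is_stable is an assumption; the expected value 32 is the one given above.

```shell
# substate_is_stable   (hypothetical helper)
# Read snmpinfo output such as "clusterSubState.0 = 32" on stdin and succeed
# only if the reported value is 32.
substate_is_stable() {
    value=$(awk '/[Ss]ub[Ss]tate/ { print $NF; exit }')
    [ "$value" = "32" ]
}

# Usage (hypothetical), repeated on each node:
#   /usr/sbin/snmpinfo -m get -o /usr/es/sbin/cluster/hacmp.defs \
#       -v ClusterSubstate.0 | substate_is_stable || echo "check this node"
```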
You can get a quick view of the HACMP specific daemons with:
Smit HACMP -> Cluster Services -> Show Cluster Services
COMMAND STATUS
Command: OK            stdout: yes           stderr: no
Before command completion, additional instructions may appear below.
Subsystem         Group            PID     Status
 clstrmgrES       cluster          22000   active
 clinfoES         cluster          21394   active
 clsmuxpdES       cluster          14342   active
 cllockdES        lock                     inoperative
 clresmgrdES                       29720   active
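A listing in this format can be filtered for subsystems that are not running. The sketch below assumes the column layout shown above, with the status in the last column; the function name inactive_subsystems is hypothetical.

```shell
# inactive_subsystems   (hypothetical helper)
# Read a subsystem status table on stdin and print the name of every
# subsystem whose last column is not "active".
inactive_subsystems() {
    awk '$1 != "Subsystem" && NF > 0 && $NF != "active" { print $1 }'
}

# Usage (hypothetical), using the AIX lssrc command for the cluster group:
#   lssrc -g cluster | inactive_subsystems
```

In a RAC configuration started as described in section 2.6, "cllockdES" is expected to appear in this list, because the cluster lock services are not started.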
Starting & Stopping Cluster Nodes
To join a node into the cluster use:
Smit HACMP -> Cluster Services -> Start Cluster Services
See section 2.6 for more information on joining a node into the cluster.
Use the following to evict a node from the cluster:
Smit HACMP -> Cluster Services -> Stop Cluster Services
FastPath:
# smit clstop.dialog
Stop Cluster Services
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
                                                 [Entry Fields]
* Stop now, on system restart or both             now        +
  BROADCAST cluster shutdown?                     true       +
* Shutdown mode                                   graceful   +
  (graceful or graceful with takeover, forced)
See section 2.6 "Joining Nodes Into the Cluster" for an explanation of "Stop now, on system restart or both" and "BROADCAST cluster shutdown?". The "Shutdown mode" determines whether or not resources move between nodes when a shutdown occurs. "forced" is new with HACMP 4.4.1 and leaves applications that are controlled by HACMP events running when the shutdown occurs. "graceful" brings everything down, but cascading and rotating resources are not switched, whereas with "graceful with takeover" these resources are switched at shutdown.
Log Files for HACMP/ES
All cluster reconfiguration information during cluster startup and shutdown goes into the "/tmp/hacmp.out" file.