0. Introduction
What Is a Cluster?
Interconnected nodes act as a single server
Cluster software hides the structure
Disks are available for read and write by all nodes.
Operating system is the same on each machine
What Is Oracle Real Application Clusters?
Multiple instances accessing the same database
One Instance per node
Physical or logical access to each database file
Software-controlled data access
Why Use RAC
High availability
Scalability
Pay as you grow
Key grid computing features
Levels of Scalability
Hardware: Disk input/output (I/O)
Internode communication: High bandwidth and low latency
Operating system: Number of CPUs
Database management system: Synchronization
Application: Design
Scaleup and Speedup
Global Resources Coordination
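RAC uses the Global Resource Directory (GRD) to record how resources in the database are being used. Each instance manages a portion of the GRD (as resource master).
Global Cache Services (GCS): Keeps the multiple copies of database blocks held by different instances consistent. It uses the Cache Fusion algorithm.
Global Enqueue Services (GES): Manages the resources shared among instances other than those handled by Cache Fusion, and tracks enqueues.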
Global Cache Coordination: Example
1. The second instance attempting to modify the block submits a request to the GCS.
2. The GCS transmits the request to the holder.
3. The first instance receives the message and sends the block to the second instance. The first instance retains the dirty buffer for recovery purposes (past image).
4. On receipt of the block, the second instance informs the GCS that it holds the block.
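The four steps above can be traced with a small toy model. This is only a sketch under simplifying assumptions: the class and method names (`Instance`, `GlobalCacheService`, `request_for_modify`) are hypothetical, there is no networking or locking, and it is not how Oracle implements the GCS.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    # block_id -> (role, value); role is "current" or "past_image"
    buffers: dict = field(default_factory=dict)


class GlobalCacheService:
    """Toy stand-in for the GCS: it only tracks who holds the current block."""

    def __init__(self):
        self.current_holder = {}  # block_id -> Instance

    def register(self, instance, block_id, value):
        # Record that `instance` holds the current (dirty) version of the block.
        instance.buffers[block_id] = ("current", value)
        self.current_holder[block_id] = instance

    def request_for_modify(self, requester, block_id):
        # Step 1: the requester submits its request to the GCS (this call).
        # Step 2: the GCS forwards the request to the current holder.
        holder = self.current_holder[block_id]
        _, value = holder.buffers[block_id]
        # Step 3: the holder ships the block and keeps its dirty copy
        # as a past image for recovery purposes.
        holder.buffers[block_id] = ("past_image", value)
        requester.buffers[block_id] = ("current", value)
        # Step 4: the requester informs the GCS that it now holds the block.
        self.current_holder[block_id] = requester


first, second = Instance("inst1"), Instance("inst2")
gcs = GlobalCacheService()
gcs.register(first, block_id=42, value="row data v1")
gcs.request_for_modify(second, block_id=42)
print(first.buffers[42][0], second.buffers[42][0])  # past_image current
```

Note that the block moves between instances without any disk write; the first instance's copy only changes role.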
Write to Disk Coordination: Example
1. The first instance sends a write request to the GCS.
2. The GCS forwards the request to the holder of the current version of the block.
3. The second instance receives the write request and writes the block to disk.
4. The second instance records the completion of the write operation with the GCS.
5. The GCS orders all past image holders to discard their past images.
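The write sequence above can be sketched the same way. The model below is a toy under stated assumptions: `WriteCoordinator` and its fields are hypothetical names, the "disk" is a dictionary, and messages are plain strings rather than real interconnect traffic.

```python
class WriteCoordinator:
    """Toy stand-in for the GCS side of a coordinated disk write."""

    def __init__(self):
        self.disk = {}         # block_id -> value last written to disk
        self.current = {}      # block_id -> (instance name, current value)
        self.past_images = {}  # block_id -> {instance name: stale value}

    def write_request(self, requester, block_id):
        # Step 1: an instance sends a write request to the GCS.
        steps = [f"{requester} asks GCS to write block {block_id}"]
        # Step 2: the GCS forwards it to the holder of the current version.
        holder, value = self.current[block_id]
        steps.append(f"GCS forwards request to {holder}")
        # Step 3: the holder writes the block to disk.
        self.disk[block_id] = value
        steps.append(f"{holder} writes block {block_id} to disk")
        # Step 4: the holder records completion with the GCS.
        steps.append(f"{holder} reports write completion to GCS")
        # Step 5: the GCS orders all past image holders to discard them.
        discarded = sorted(self.past_images.pop(block_id, {}))
        steps.append(f"GCS tells {discarded} to discard past images")
        return steps


coordinator = WriteCoordinator()
coordinator.current[42] = ("inst2", "row data v2")
coordinator.past_images[42] = {"inst1": "row data v1"}
for line in coordinator.write_request("inst1", 42):
    print(line)
```

The point of the sketch is that the instance asking for the write is not necessarily the one that performs it: only the current version reaches disk, and older past images become unnecessary afterwards.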
Dynamic Reconfiguration
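When a node leaves or joins the cluster, the GRD is redistributed. A lazy remastering algorithm is used, so only the minimum necessary portion of the GRD is redistributed. At the same time, all instances remove every reference to the failed instance from the grant information in the GRD.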
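A minimal sketch of the idea behind lazy remastering follows. The function name `remaster_after_failure` and the round-robin assignment to survivors are assumptions made for illustration; the notes do not describe the policy Oracle actually uses to pick new masters.

```python
def remaster_after_failure(masters, grants, failed, survivors):
    """Reassign only the resources mastered by the failed instance.

    masters:   dict resource -> mastering instance
    grants:    dict resource -> set of instances holding a grant
    failed:    name of the instance that left the cluster
    survivors: list of remaining instance names
    """
    new_masters = dict(masters)
    # Only resources whose master failed are touched (the "lazy" part).
    orphaned = sorted(res for res, m in masters.items() if m == failed)
    for i, res in enumerate(orphaned):
        # Assumption: spread orphaned resources round-robin over survivors.
        new_masters[res] = survivors[i % len(survivors)]
    # Every reference to the failed instance is removed from the grants.
    new_grants = {
        res: {inst for inst in holders if inst != failed}
        for res, holders in grants.items()
    }
    return new_masters, new_grants


masters = {"r1": "A", "r2": "B", "r3": "C", "r4": "B"}
grants = {"r1": {"A", "B"}, "r2": {"B"}}
new_masters, new_grants = remaster_after_failure(masters, grants, "B", ["A", "C"])
print(new_masters)
```

Resources `r1` and `r3` keep their masters; only `r2` and `r4`, which were mastered by the failed instance, move.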
Object Affinity and Dynamic Remastering
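Dynamic remastering: The GCS tracks which instances frequently access which objects and, when necessary, adjusts the distribution of the GRD according to access frequency.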
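As a rough illustration of affinity-based remastering, the sketch below moves mastership of an object to the instance that accesses it most. The function `pick_master` and the `min_share` threshold are hypothetical; the real GCS criteria are not given in these notes.

```python
from collections import Counter


def pick_master(accesses, current_master, min_share=0.5):
    """Return the instance that should master an object.

    accesses: list of instance names, one entry per recorded access.
    min_share: assumed threshold; the busiest instance must account for
               more than this fraction of accesses to take over.
    """
    counts = Counter(accesses)
    if not counts:
        return current_master
    busiest, hits = counts.most_common(1)[0]
    if busiest != current_master and hits / len(accesses) > min_share:
        return busiest
    return current_master


print(pick_master(["A", "B", "B", "B"], current_master="A"))  # B
```

With an even split of accesses the master stays where it is, which avoids moving mastership back and forth.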
Global Dynamic Performance Views
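GV$ views: Global views that consolidate the V$ views across all instances.
They are retrieved through a special parallel mechanism: the coordinator runs on the instance the client is connected to, and one parallel process is started on each of the other instances.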
Additional Memory Requirement for RAC
•Heuristics for scalability cases:
–15% more shared pool
–10% more buffer cache
Actual usage can be checked through the GES and GCS statistics in V$RESOURCE_LIMIT.
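The 15% and 10% heuristics above translate into a quick sizing calculation. This is only an illustrative sketch: the function name `rac_sga_estimate` is hypothetical, sizes are assumed to be in whole megabytes, and results are rounded up.

```python
def rac_sga_estimate(shared_pool_mb, buffer_cache_mb):
    """Apply the heuristics: 15% more shared pool, 10% more buffer cache."""
    return {
        # Integer ceiling division avoids floating-point rounding surprises.
        "shared_pool_mb": (shared_pool_mb * 115 + 99) // 100,
        "buffer_cache_mb": (buffer_cache_mb * 110 + 99) // 100,
    }


print(rac_sga_estimate(shared_pool_mb=1000, buffer_cache_mb=2048))
# {'shared_pool_mb': 1150, 'buffer_cache_mb': 2253}
```

These figures are starting points only; the V$RESOURCE_LIMIT statistics show what a running system actually uses.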
Efficient Internode Row-Level Locking
Block transfers between instances are not affected by row-level locks.
Parallel Execution with RAC
Parallel execution normally starts its parallel processes on a single instance, but it can extend to other instances when needed.
RAC Software Principles
Additional background processes
•LMON: Global Enqueue Service Monitor
•LMD0: Global Enqueue Service Daemon
•LMSx: Global Cache Service Processes, where x can range from 0 to j
•LCK0: Lock process
•DIAG: Diagnosability process
Main processes of Oracle Clusterware
•CRSD and RACGIMON: Are engines for high-availability operations
•OCSSD: Provides access to node membership and group services
•EVMD: Scans callout directory and invokes callouts in reaction to detected events
•OPROCD: Is a process monitor for the cluster (not used on Linux and Windows)
RAC Software Storage Principles
CRS_HOME
Installed on local storage
ORACLE_HOME
ASM_HOME
Can be installed on local or shared storage, but installing on local storage makes rolling upgrades possible
Voting files: Essentially used by the Cluster Synchronization Services daemon for node-monitoring information across the cluster. The size is set to around 20 MB.
OCR files: Maintain information about the high-availability components in your cluster, such as the cluster node list, cluster database instance-to-node mapping, and CRS application resource profiles (such as services, virtual IP (VIP) addresses, and so on). The OCR is maintained by administrative tools such as SRVCTL. Its size is around 100 MB.
Both of the above are used before the ASM instance starts, so they cannot be stored in ASM; they must be on raw devices or a cluster file system.
Data files
Temp files
Control files
Flash recovery area files
Change tracking file
SPFILE
TDE Wallet
All of the above must be on shared storage (ASM, raw devices, or CFS) and are shared by all instances
Undo tablespace
Online redo log files
Both of the above must be on shared storage (ASM, raw devices, or CFS); each instance has its own
Archive logs
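Cannot be stored on raw devices. They do not have to be on shared storage, but during recovery they must be accessible to the other instances (for example, through NFS)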
Typical Cluster Stack with RAC
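On UNIX and Linux platforms, UDP over Gigabit Ethernet (GbE) is used as the internode communication protocol.
Using Oracle Clusterware reduces installation and support complexity, but vendor clusterware is needed if you use a non-Ethernet interconnect or have deployed other applications that depend on clusterware.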
RAC and Services
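Services can be used to divide applications into multiple logically independent systems, which allows better load balancing, priority control, performance monitoring, and so on (handled by instance using metrics, alerts, scheduler job classes, and Resource Manager).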
1. Oracle Clusterware Installation and Configuration
Oracle RAC 10g Installation
–Phase one installs Oracle Clusterware.
–Phase two installs the Oracle Database 10g software with RAC.
Oracle RAC 10g Installation: Outline
1. Complete preinstallation tasks:
–Hardware requirements
–Software requirements
–Environment configuration, kernel parameters, and so on
2. Perform Oracle Clusterware installation.
3. Perform ASM installation.
4. Perform Oracle Database 10g software installation.
5. Install EM agent on cluster nodes.
6. Perform cluster database creation.
7. Complete postinstallation tasks.
Windows and UNIX Installation Differences
•Startup and shutdown services
•Environment variables
•DBA account for database administrators
•Account for running the OUI
Preinstallation Tasks
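Check system requirements: hardware configuration, network configuration, shared storage
Check software requirements: operating system version and required packages; the hangcheck-timer module (required on Linux); the OCFS packages (Linux, optional)
Check kernel parameters.
Create groups and users: create the users and groups, raise the system limits, configure remote cluster installation (SSH)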
Perform cluster setup.
Virtual IP Addresses and RAC
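Use virtual IP addresses when configuring TNS service names. When a node goes down, another node automatically takes over its virtual IP and immediately returns an error, so the client reconnects using another address without waiting for the network timeout.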
Verifying Cluster Setup with cluvfy
cluvfy can be used to run preinstallation and postinstallation checks
Verifying the Oracle Clusterware Installation
Check that automatic startup entries for the evmd, cssd, and crsd processes have been added to the /etc/inittab file
2. RAC Software Installation
3. RAC Database Creation
Database Services
Transparent Application Failover (TAF) policy
•None: Do not use TAF.
•Basic: Establish connections at failover time.
•Pre-connect: Establish one connection to a preferred instance and another connection to a backup instance that you have selected to be available.
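On the client side, these policies roughly correspond to the `FAILOVER_MODE` clause of a tnsnames.ora connect descriptor. The helper below is a hedged sketch that only builds that clause as a string: the function name `failover_mode_clause` and the alias `RACDB_BACKUP` are hypothetical, and `TYPE=SELECT` is an assumed choice (`SESSION` is the other common value).

```python
def failover_mode_clause(policy, backup_alias=None):
    """Build a FAILOVER_MODE clause for the given TAF policy."""
    key = policy.replace("-", "").upper()
    if key == "NONE":
        # No TAF: the connect descriptor carries no FAILOVER_MODE clause.
        return ""
    if key == "BASIC":
        # The connection to another instance is made at failover time.
        return "(FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC))"
    if key == "PRECONNECT":
        # A second connection to the backup is established up front.
        if not backup_alias:
            raise ValueError("Pre-connect needs a backup net service alias")
        return (
            f"(FAILOVER_MODE=(BACKUP={backup_alias})"
            "(TYPE=SELECT)(METHOD=PRECONNECT))"
        )
    raise ValueError(f"unknown TAF policy: {policy}")


print(failover_mode_clause("Pre-connect", backup_alias="RACDB_BACKUP"))
```

Pre-connect trades extra idle connections on the backup instance for faster failover, which is why it requires a backup alias while Basic does not.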
Single Instance to RAC Conversion
•Single-instance databases can be converted to RAC using:
–DBCA
–Enterprise Manager
–RCONFIG utility
•Before conversion, ensure that:
–Your hardware and operating system are supported
–Your cluster nodes have access to shared storage
Single-Instance Conversion Using the DBCA
Conversion steps for a single-instance database on nonclustered hardware:
1. Back up the original single-instance database
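Create a template with the DBCA
The file location is configurable; the default is under $ORACLE_HOME/assistants/dbca/templates/
Select "Maintain the file locations" so that the files can be restored to their current paths
This generates template_name.dbc (the database structure file) and template_name.dfb (the database image file)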
2. Perform the preinstallation steps.
3. Set up and validate the cluster.
4. Copy the preconfigured database image.
5. Install the Oracle Database 10g software with Real Application Clusters.
Select DBCA template selection -> Copy the Preconfigured Database Image
Single-Instance Conversion Using rconfig
1. Edit the ConvertToRAC.xml file located in the $ORACLE_HOME/assistants/rconfig/sampleXMLs directory.
2. Modify the parameters in the ConvertToRAC.xml file as required for your system.
3. Save the file under a different name.
4. Run rconfig against the saved file:
rconfig my_rac_conversion.xml
Single-Instance Conversion Using Grid Control