1 Introduction to Oracle Clusterware

What is Oracle Clusterware?

Oracle Clusterware enables servers to communicate with each other, so that they appear to function as a collective unit. This combination of servers is commonly known as a cluster. Although the servers are standalone servers, each server has additional processes that communicate with other servers. In this way the separate servers appear as if they are one system to applications and end users.

Oracle Clusterware provides the infrastructure necessary to run Oracle Real Application Clusters (Oracle RAC). Oracle Clusterware also manages resources, such as virtual IP (VIP) addresses, databases, listeners, services, and so on. These resources are generally named ora.host_name.resource_name. Oracle does not support editing these resources except under the explicit direction of My Oracle Support.

Figure 1-1 shows a configuration that uses Oracle Clusterware to extend the basic single-instance Oracle Database architecture. In Figure 1-1, the cluster is running Oracle Database and is actively servicing applications and users. Using Oracle Clusterware, you can use the same high availability mechanisms to make your Oracle database and your custom applications highly available.

Figure 1-1 Oracle Clusterware Configuration

The benefits of using a cluster include:

  • Scalability of applications

  • Reduce total cost of ownership for the infrastructure by providing a scalable system with low-cost commodity hardware

  • Ability to fail over

  • Increase throughput on demand for cluster-aware applications, by adding servers to a cluster to increase cluster resources

  • Increase throughput for cluster-aware applications by enabling the applications to run on all of the nodes in a cluster

  • Ability to program the startup of applications in a planned order that ensures dependent processes are started

  • Ability to monitor processes and restart them if they stop

  • Eliminate unplanned downtime due to hardware or software malfunctions

  • Reduce or eliminate planned downtime for software maintenance
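
The planned startup order mentioned in the list above can be pictured as a dependency graph that is sorted before anything starts. The following is a minimal sketch in Python, not how Oracle Clusterware itself is implemented; the resource names and the DEPENDENCIES mapping are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical resources: each key depends on the resources in its set.
DEPENDENCIES = {
    "network": set(),
    "vip": {"network"},
    "listener": {"vip"},
    "database": {"listener"},
    "app": {"database"},
}


def start_order(dependencies):
    """Return resource names ordered so that every dependency starts first."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    print(start_order(DEPENDENCIES))
```

A dependency cycle makes TopologicalSorter raise graphlib.CycleError, which is a reasonable point to reject a configuration before any process is started.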

You can program Oracle Clusterware to manage the availability of user applications and Oracle databases. In an Oracle RAC environment, Oracle Clusterware manages all of the resources automatically. All of the applications and processes that Oracle Clusterware manages are either cluster resources or local resources.

Oracle Clusterware is required for using Oracle RAC; it is the only clusterware that you need for platforms on which Oracle RAC operates. Although Oracle RAC continues to support many third-party clusterware products on specific platforms, you must also install and use Oracle Clusterware. Note that the servers on which you want to install and run Oracle Clusterware must use the same operating system.

Using Oracle Clusterware eliminates the need for proprietary vendor clusterware and provides the benefit of using only Oracle software. Oracle provides an entire software solution, including everything from disk management with Oracle Automatic Storage Management (Oracle ASM) to data management with Oracle Database and Oracle RAC. In addition, Oracle Database features, such as Oracle Services, provide advanced functionality when used with the underlying Oracle Clusterware high availability framework.

Oracle Clusterware has two stored components, besides the binaries: The voting disk files, which record node membership information, and the Oracle Cluster Registry (OCR), which records cluster configuration information. Voting disks and OCRs must reside on shared storage available to all cluster member nodes.

Understanding System Requirements for Oracle Clusterware

To use Oracle Clusterware, you must understand the hardware and software concepts and requirements as described in the following sections:

Oracle Clusterware Hardware Concepts and Requirements

Note:

Many hardware providers have validated cluster configurations that provide a single part number for a cluster. If you are new to clustering, then use the information in this section to simplify your hardware procurement efforts when you purchase hardware to create a cluster.

A cluster consists of one or more servers. The hardware in a clustered server (also called a node or cluster member) is similar to that of a standalone server, except that a clustered server requires a second network, referred to as the interconnect. For this reason, cluster member nodes require at least two network interface cards: one for a public network and one for a private network. The interconnect is a private network that uses a switch (or multiple switches) and that only the nodes in the cluster can access. [Footnote 1]

Note:

Oracle does not support using crossover cables as Oracle Clusterware interconnects.

Cluster size is determined by the requirements of the workload running on the cluster and the number of nodes that you have configured in the cluster. If you are implementing a cluster for high availability, then configure redundancy for all of the components of the infrastructure as follows:

  • At least two network interfaces for the public network, bonded to provide one address

  • At least two network interfaces for the private interconnect network

The cluster requires cluster-aware storage [Footnote 2] that is connected to each server in the cluster. This may also be referred to as a multihost device. Oracle Clusterware supports NFS, iSCSI, Direct Attached Storage (DAS), Storage Area Network (SAN) storage, and Network Attached Storage (NAS).

To provide redundancy for storage, generally provide at least two connections from each server to the cluster-aware storage. There may be more connections depending on your I/O requirements. It is important to consider the I/O requirements of the entire cluster when choosing your storage subsystem.

Most servers have at least one local disk that is internal to the server. Often, this disk is used for the operating system binaries; you can also use this disk for the Oracle software binaries. The benefit of each server having its own copy of the Oracle binaries is increased availability: corruption of one binary does not affect all of the nodes in the cluster simultaneously. It also allows rolling upgrades, which reduce downtime.

Oracle Clusterware Operating System Concepts and Requirements

Each server must have an operating system that is certified with the Oracle Clusterware version you are installing. For details, refer to the certification matrices in the Oracle Grid Infrastructure Installation Guide for your platform or on My Oracle Support (formerly Oracle MetaLink), which are available from the following URL:

http://www.oracle.com/technetwork/database/clustering/tech-generic-unix-new-166583.html

When the operating system is installed and working, you can then install Oracle Clusterware to create the cluster. Oracle Clusterware is installed independently of Oracle Database. Once Oracle Clusterware is installed, you can then install Oracle Database or Oracle RAC on any of the nodes in the cluster.

See Also:

Your platform-specific Oracle database installation documentation

Oracle Clusterware Software Concepts and Requirements

Oracle Clusterware uses voting disk files to provide fencing and cluster node membership determination. OCR provides cluster configuration information. You can place the Oracle Clusterware files on either Oracle ASM or on shared common disk storage. If you configure Oracle Clusterware on storage that does not provide file redundancy, then Oracle recommends that you configure multiple locations for OCR and voting disks. The voting disks and OCR are described as follows:

  • Voting Disks

    Oracle Clusterware uses voting disk files to determine which nodes are members of a cluster. You can configure voting disks on Oracle ASM, or you can configure voting disks on shared storage.

    If you configure voting disks on Oracle ASM, then you do not need to manually configure the voting disks. Depending on the redundancy of your disk group, an appropriate number of voting disks are created.

    If you do not configure voting disks on Oracle ASM, then for high availability, Oracle recommends that you have a minimum of three voting disks on physically separate storage. This avoids having a single point of failure. If you configure a single voting disk, then you must use external mirroring to provide redundancy.

    You should have at least three voting disks, unless you have a storage device, such as a disk array that provides external redundancy. Oracle recommends that you do not use more than five voting disks. The maximum number of voting disks that is supported is 15.

  • Oracle Cluster Registry

    Oracle Clusterware uses the Oracle Cluster Registry (OCR) to store and manage information about the components that Oracle Clusterware controls, such as Oracle RAC databases, listeners, virtual IP addresses (VIPs), and services and any applications. OCR stores configuration information in a series of key-value pairs in a tree structure. To ensure cluster high availability, Oracle recommends that you define multiple OCR locations. In addition:

    • You can have up to five OCR locations

    • Each OCR location must reside on shared storage that is accessible by all of the nodes in the cluster

    • You can replace a failed OCR location online if it is not the only OCR location

    • You must update OCR through supported utilities such as Oracle Enterprise Manager, the Oracle Clusterware Control Utility (CRSCTL), the Server Control Utility (SRVCTL), the OCR configuration utility (OCRCONFIG), or the Database Configuration Assistant (DBCA)

    See Also:

    Chapter 2, "Administering Oracle Clusterware" for more information about voting disks and OCR
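
The counts given above for voting disks and OCR locations can be captured in a small planning check. This is a rough sketch in Python that only encodes the numbers stated in this section (at least three voting disks unless the storage provides external redundancy, no more than five recommended, fifteen supported, and up to five OCR locations). The function names are hypothetical and are not part of any Oracle utility.

```python
def check_voting_disks(count, external_redundancy=False):
    """Return a list of warnings for a planned voting disk count."""
    if not 1 <= count <= 15:
        raise ValueError("voting disk count must be between 1 and 15")
    warnings = []
    if count < 3 and not external_redundancy:
        warnings.append("fewer than three voting disks without external redundancy")
    if count > 5:
        warnings.append("more than five voting disks is not recommended")
    return warnings


def check_ocr_locations(count):
    """Return True if the planned number of OCR locations is supported."""
    return 1 <= count <= 5


if __name__ == "__main__":
    print(check_voting_disks(1))
    print(check_voting_disks(3))
    print(check_ocr_locations(2))
```

The check is meant for reviewing a plan on paper. The actual configuration must still be changed only through the supported utilities listed above.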

Oracle Clusterware Network Configuration Concepts

Oracle Clusterware enables a dynamic Grid Infrastructure through the self-management of the network requirements for the cluster. Oracle Clusterware 11g release 2 (11.2) supports the use of dynamic host configuration protocol (DHCP) for the VIP addresses and the SCAN address, but not the public address. DHCP provides dynamic configuration of the host's IP address, but it does not provide an optimal method of producing names that are useful to external clients.

When you are using Oracle RAC, all of the clients must be able to reach the database. This means that all the cluster's public addresses, the VIP and SCAN addresses, must be resolved by the clients. This problem is solved by the addition of the Oracle Grid Naming Service (GNS) to the cluster. GNS is linked to the corporate domain name service (DNS), so that clients can resolve these dynamic addresses and transparently connect to the cluster and the databases. Activating GNS in a cluster requires a DHCP service on the public network.

Implementing GNS

To implement GNS, you must collaborate with your network administrator to obtain an IP address on the public network for the GNS VIP. DNS uses the GNS VIP to forward requests for access to the cluster to GNS. You must also collaborate with your DNS administrator to delegate a domain to the cluster. This can be a separate domain or a subdomain of an existing domain. The DNS server must be configured to forward all requests for this new domain to the GNS VIP. Since each cluster has its own GNS, it must be allocated a unique domain of which to be in control.

GNS and the GNS VIP run on one node in the cluster. The GNS daemon listens on the GNS VIP using port 53 for DNS requests. Oracle Clusterware manages the GNS and the GNS VIP to ensure that they are always available. If the server on which GNS is running fails, then Oracle Clusterware fails GNS over, along with the GNS VIP, to another node in the cluster.

With DHCP on the network, Oracle Clusterware obtains an IP address from the DHCP server along with other network information, such as what gateway to use, what DNS servers to use, what domain to use, and what NTP server to use. Oracle Clusterware initially obtains the necessary IP addresses during cluster configuration and it updates the Oracle Clusterware resources with the correct information obtained from the DHCP server, including the GNS.

Single Client Access Name (SCAN)

Oracle RAC 11g release 2 (11.2) introduces the Single Client Access Name (SCAN). SCAN is a domain name registered to at least one and up to three IP addresses, either in DNS or GNS. When using GNS and DHCP, Oracle Clusterware configures the VIP addresses for the SCAN name that is provided during cluster configuration.

The node VIP and the three SCAN VIPs are obtained from the DHCP server when using GNS. If a new server joins the cluster, then Oracle Clusterware dynamically obtains the required VIP address from the DHCP server, updates the cluster resource, and makes the server accessible through GNS.

Example 1-1 shows the DNS entries that delegate a domain to the cluster.

Example 1-1 DNS Entries

# Delegate to gns on mycluster
mycluster.example.com NS myclustergns.example.com
#Let the world know to go to the GNS vip
myclustergns.example.com. 10.9.8.7
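
If the corporate DNS server uses BIND-style zone files (an assumption; other DNS servers have their own formats), the two entries in Example 1-1 would be written as an NS record and an A record. The Python sketch below builds those two lines from the example values. The function name is hypothetical.

```python
import ipaddress


def delegation_records(cluster_domain, gns_host, gns_vip):
    """Build NS and A record lines that delegate cluster_domain to the GNS VIP."""
    ipaddress.IPv4Address(gns_vip)  # raises ValueError if this is not a valid IPv4 address
    domain = cluster_domain.rstrip(".") + "."
    host = gns_host.rstrip(".") + "."
    return [
        f"{domain} IN NS {host}",
        f"{host} IN A {gns_vip}",
    ]


if __name__ == "__main__":
    for line in delegation_records(
        "mycluster.example.com", "myclustergns.example.com", "10.9.8.7"
    ):
        print(line)
```

The trailing dot makes each name fully qualified, so the DNS server does not append the zone origin to it.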

See Also:

Oracle Grid Infrastructure Installation Guide for details about establishing resolution through DNS

Configuring Addresses Manually

Alternatively, you can choose manual address configuration, in which you configure the following:

  • One public host name for each node.

  • One VIP address for each node.

    You must assign a VIP address to each node in the cluster. Each VIP address must be on the same subnet as the public IP address for the node and should be an address that is assigned a name in the DNS. Each VIP address must also be unused and unpingable from within the network before you install Oracle Clusterware.

  • Up to three SCAN addresses for the entire cluster.

    Note:

    The SCAN must resolve to at least one address on the public network. For high availability and scalability, Oracle recommends that you configure the SCAN to resolve to three addresses.
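
Two of the rules above are easy to check before installation: each VIP must be on the same subnet as the node's public IP address, and the SCAN must resolve to between one and three addresses. The following Python sketch uses only the standard ipaddress module. The function names, the sample addresses, and the prefix length are illustrative assumptions.

```python
import ipaddress


def vip_on_public_subnet(public_ip, vip, prefix_length):
    """Return True if vip lies in the subnet of public_ip/prefix_length."""
    network = ipaddress.ip_interface(f"{public_ip}/{prefix_length}").network
    return ipaddress.ip_address(vip) in network


def scan_address_count_ok(addresses):
    """Return True if the SCAN maps to one, two, or three distinct addresses."""
    return 1 <= len(set(addresses)) <= 3


if __name__ == "__main__":
    print(vip_on_public_subnet("192.0.2.10", "192.0.2.110", 24))
    print(scan_address_count_ok(["192.0.2.201", "192.0.2.202", "192.0.2.203"]))
```

The sketch does not test whether a VIP is already in use on the network, which must still be verified separately before installation.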

See Also:

Your platform-specific Oracle Grid Infrastructure Installation Guide for information about system requirements and configuring network addresses

Overview of Oracle Clusterware Platform-Specific Software Components

When Oracle Clusterware is operational, several platform-specific processes or services run on each node in the cluster. This section describes these various processes and services.

The Oracle Clusterware Stack

Oracle Clusterware consists of two separate stacks: an upper stack anchored by the Cluster Ready Services (CRS) daemon (crsd) and a lower stack anchored by the Oracle High Availability Services daemon (ohasd). These two stacks have several processes that facilitate cluster operations. The following sections describe these stacks in more detail:

The Cluster Ready Services Stack

The list in this section describes the processes that comprise CRS. The list includes components that are processes on Linux and UNIX operating systems, or services on Windows.

  • Cluster Ready Services (CRS): The primary program for managing high availability operations in a cluster.

    The CRS daemon (crsd) manages cluster resources based on the configuration information that is stored in OCR for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes. When you have Oracle RAC installed, the crsd process monitors the Oracle database instance, listener, and so on, and automatically restarts these components when a failure occurs.

  • Cluster Synchronization Services (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interface with your clusterware to manage node membership information.

    The cssdagent process monitors the cluster and provides I/O fencing. This service formerly was provided by Oracle Process Monitor Daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure may result in Oracle Clusterware restarting the node.

  • Oracle ASM: Provides disk management for Oracle Clusterware and Oracle Database.

  • Cluster Time Synchronization Service (CTSS): Provides time management in a cluster for Oracle Clusterware.

  • Event Management (EVM): A background process that publishes events that Oracle Clusterware creates.

  • Oracle Notification Service (ONS): A publish and subscribe service for communicating Fast Application Notification (FAN) events.

  • Oracle Agent (oraagent): Extends clusterware to support Oracle-specific requirements and complex resources. This process runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g release 1 (11.1).

  • Oracle Root Agent (orarootagent): A specialized oraagent process that helps crsd manage resources owned by root, such as the network, and the Grid virtual IP address.

The Cluster Synchronization Service (CSS), Event Management (EVM), and Oracle Notification Services (ONS) components communicate with other cluster component layers on other nodes in the same cluster database environment. These components are also the main communication links between Oracle Database, applications, and the Oracle Clusterware high availability components. In addition, these background processes monitor and manage database operations.

The Oracle High Availability Services Stack

This section describes the processes that comprise the Oracle High Availability Services stack. The list includes components that are processes on Linux and UNIX operating systems, or services on Windows.

  • Cluster Logger Service (ologgerd): Receives information from all the nodes in the cluster and persists in a CHM repository-based database. This service runs on only two nodes in a cluster.

  • System Monitor Service (osysmond): The monitoring and operating system metric collection service that sends the data to the cluster logger service. This service runs on every node in a cluster.

  • Grid Plug and Play (GPNPD): Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.

  • Grid Interprocess Communication (GIPC): A support daemon that enables Redundant Interconnect Usage.

  • Multicast Domain Name Service (mDNS): Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX and on Windows.

  • Oracle Grid Naming Service (GNS): Handles requests sent by external DNS servers, performing name resolution for names defined by the cluster.

Table 1-1 lists the processes and services associated with Oracle Clusterware components. In Table 1-1, if a UNIX or a Linux system process has an (r) beside it, then the process runs as the root user.

Table 1-1 List of Processes and Services Associated with Oracle Clusterware Components

| Oracle Clusterware Component | Linux/UNIX Process | Windows Services | Windows Processes |
|---|---|---|---|
| CRS | crsd.bin (r) | OracleOHService | crsd.exe |
| CSS | ocssd.bin, cssdmonitor, cssdagent | OracleOHService | cssdagent.exe, cssdmonitor.exe, ocssd.exe |
| CTSS | octssd.bin (r) | | octssd.exe |
| EVM | evmd.bin, evmlogger.bin | OracleOHService | evmd.exe |
| GIPC | gipcd.bin | | |
| GNS | gnsd (r) | | gnsd.exe |
| Grid Plug and Play | gpnpd.bin | OracleOHService | gpnpd.exe |
| LOGGER | ologgerd.bin (r) | | ologgerd.exe |
| Master Diskmon | diskmon.bin | | |
| mDNS | mdnsd.bin | | mDNSResponder.exe |
| Oracle agent | oraagent.bin (11.2), or racgmain and racgimon (11.1) | | oraagent.exe |
| Oracle High Availability Services | ohasd.bin (r) | OracleOHService | ohasd.exe |
| ONS | ons | | ons.exe |
| Oracle root agent | orarootagent (r) | | orarootagent.exe |
| SYSMON | osysmond.bin (r) | | osysmond.exe |
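
As an illustration of how the (r) marker in Table 1-1 might be used, the Python sketch below scans a process listing and reports any of the root-marked Linux/UNIX processes that are owned by another user. It assumes input in the form produced by `ps -eo user,comm` with the header line removed, and it assumes the command names match the table exactly, which may not hold on every platform or release. The function name is hypothetical.

```python
# Linux/UNIX processes marked (r) in Table 1-1, that is, expected to run as root.
ROOT_PROCESSES = {
    "crsd.bin",
    "octssd.bin",
    "gnsd",
    "ologgerd.bin",
    "ohasd.bin",
    "orarootagent",
    "osysmond.bin",
}


def non_root_violations(ps_output):
    """Return (user, command) pairs for (r) processes not owned by root."""
    violations = []
    for line in ps_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        user, command = parts[0], parts[1]
        if command in ROOT_PROCESSES and user != "root":
            violations.append((user, command))
    return violations


if __name__ == "__main__":
    sample = "root ohasd.bin\ngrid crsd.bin\ngrid ocssd.bin\n"
    print(non_root_violations(sample))
```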


See Also:

"Clusterware Log Files and the Unified Log Directory Structure" for information about the location of log files created for processes

Note:

Oracle Clusterware on Linux platforms can have multiple threads that appear as separate processes with unique process identifiers.

Figure 1-2 illustrates cluster startup.

Figure 1-2 Cluster Startup

Oracle Clusterware Processes on Windows Systems

Oracle Clusterware processes on Microsoft Windows systems include the following:

  • mDNSResponder.exe: Manages name resolution and service discovery within attached subnets

  • OracleOHService: Starts all of the Oracle Clusterware daemons

Overview of Installing Oracle Clusterware

The following section introduces the installation processes for Oracle Clusterware.

Note:

Install Oracle Clusterware with the Oracle Universal Installer.

Oracle Clusterware Version Compatibility

You can install different releases of Oracle Clusterware, Oracle ASM, and Oracle Database on your cluster. Follow these guidelines when installing different releases of software on your cluster:

  • You can only have one installation of Oracle Clusterware running in a cluster, and it must be installed into its own home (Grid_home). The release of Oracle Clusterware that you use must be equal to or higher than the Oracle ASM and Oracle RAC versions that are running in the cluster. You cannot install a version of Oracle RAC that was released after the version of Oracle Clusterware that you run on the cluster. In other words:

    • Oracle Clusterware 11g release 2 (11.2) supports Oracle ASM release 11.2 only, because Oracle ASM is in the Grid Infrastructure home, which also includes Oracle Clusterware

    • Oracle Clusterware release 11.2 supports Oracle Database 11g release 2 (11.2), release 1 (11.1), Oracle Database 10g release 2 (10.2), and release 1 (10.1)

    • Oracle ASM release 11.2 requires Oracle Clusterware release 11.2 and supports Oracle Database 11g release 2 (11.2), release 1 (11.1), Oracle Database 10g release 2 (10.2), and release 1 (10.1)

    • Oracle Database 11g release 2 (11.2) requires Oracle Clusterware 11g release 2 (11.2)

      For example:

      • If you have Oracle Clusterware 11g release 2 (11.2) installed as your clusterware, then you can have an Oracle Database 10g release 1 (10.1) single-instance database running on one node, and separate Oracle Real Application Clusters 10g release 1 (10.1), release 2 (10.2), and Oracle Real Application Clusters 11g release 1 (11.1) databases also running on the cluster. However, you cannot have Oracle Clusterware 10g release 2 (10.2) installed on your cluster, and install Oracle Real Application Clusters 11g. You can install Oracle Database 11g single-instance on a node in an Oracle Clusterware 10g release 2 (10.2) cluster.

      • When using different Oracle ASM and Oracle Database releases, the functionality of each is dependent on the functionality of the earlier software release. Thus, if you install Oracle Clusterware 11g and you later configure Oracle ASM, and you use Oracle Clusterware to support an existing Oracle Database 10g release 10.2.0.3 installation, then the Oracle ASM functionality is equivalent only to that available in the 10.2 release version. Set the compatible attributes of a disk group to the appropriate release of software in use.

        See Also:

        Oracle Automatic Storage Management Administrator's Guide for information about compatible attributes of disk groups

  • There can be multiple Oracle homes for the Oracle database (both single instance and Oracle RAC) in the cluster. The Oracle homes for all nodes of an Oracle RAC database must be the same.

  • You can use different users for the Oracle Clusterware and Oracle database homes if they belong to the same primary group.

  • As of Oracle Clusterware 11g release 2 (11.2), there can only be one installation of Oracle ASM running in a cluster. Oracle ASM is always the same version as Oracle Clusterware, which must be the same (or higher) release than that of the Oracle database.

  • For Oracle RAC running Oracle9i you must run an Oracle9i cluster. For UNIX systems, that is HACMP, Serviceguard, Sun Cluster, or Veritas SF. For Windows and Linux systems, that is the Oracle Cluster Manager. To install Oracle RAC 10g, you must also install Oracle Clusterware.

  • You cannot install Oracle9i RAC on an Oracle Database 10g cluster. If you have an Oracle9i RAC cluster, you can add Oracle RAC 10g to the cluster. However, after you install Oracle Clusterware 10g, you can no longer install any new Oracle9i RAC databases.

  • Oracle recommends that you do not run different cluster software on the same servers unless they are certified to work together. However, if you are adding Oracle RAC to servers that are part of a cluster, either migrate to Oracle Clusterware or ensure that:

    • The clusterware you run is supported to run with Oracle RAC 11g release 2 (11.2).

    • You have installed the correct options for Oracle Clusterware and the other vendor clusterware to work together.

See Also:

Oracle Grid Infrastructure Installation Guide for more version compatibility information
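
The release rules in the list above can be sketched as a small check. This is only an illustration of two of the stated rules (Oracle ASM is the same version as Oracle Clusterware, and Oracle Clusterware is at the same or a higher release than the database). The function names and the dotted release strings are assumptions made for this example; they are not part of any Oracle tool.

```python
from itertools import zip_longest


def parse_release(text):
    """Turn a dotted release string such as '11.2.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def compare_releases(a, b):
    """Return -1, 0, or 1; missing trailing fields are treated as 0."""
    for x, y in zip_longest(parse_release(a), parse_release(b), fillvalue=0):
        if x != y:
            return -1 if x < y else 1
    return 0


def check_stack(clusterware, asm, database):
    """Return a list of violated rules; an empty list means no rule is broken."""
    problems = []
    # Rule from the text: Oracle ASM is always the same version as Oracle Clusterware.
    if compare_releases(asm, clusterware) != 0:
        problems.append("Oracle ASM must be the same version as Oracle Clusterware")
    # Rule from the text: Oracle Clusterware must not be older than the database.
    if compare_releases(clusterware, database) < 0:
        problems.append("Oracle Clusterware must be the same or a higher release than the database")
    return problems


if __name__ == "__main__":
    # Hypothetical release numbers, used only to exercise the checks.
    print(check_stack("11.2.0.1", "11.2.0.1", "10.2.0.3"))
    print(check_stack("10.2.0.3", "10.2.0.3", "11.2.0.1"))
```

The check deliberately ignores the other rules in the list (for example, the shared primary group), which depend on operating system state rather than on release numbers.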

Overview of Upgrading Oracle Clusterware

Oracle supports in-place and out-of-place upgrades. Both strategies facilitate rolling upgrades. For Oracle Clusterware 11g release 2 (11.2), in-place upgrades are supported for patches only. Patch bundles and one-off patches are supported for in-place upgrades, but patch sets and major point releases are supported for out-of-place upgrades only.

An in-place upgrade replaces the Oracle Clusterware software with the newer version in the same Grid home. An out-of-place upgrade has both versions of the same software present on the nodes at the same time, in different Grid homes, but only one version is active.

Rolling upgrades avoid downtime and ensure continuous availability of Oracle Clusterware while the software is upgraded to the new version. When you upgrade to 11g release 2 (11.2), Oracle Clusterware and Oracle ASM binaries are installed as a single binary called the Grid Infrastructure. You can upgrade Oracle Clusterware in a rolling manner from Oracle Clusterware 10g and Oracle Clusterware 11g; however, you can only upgrade Oracle ASM in a rolling manner from Oracle Database 11g release 1 (11.1).

Oracle supports force upgrades in cases where some nodes of the cluster are down.

See Also:

Oracle Grid Infrastructure Installation Guide for more information about upgrading Oracle Clusterware
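
The upgrade rules in this section can be summarized as a small lookup, sketched below. It encodes only what the text states for 11g release 2 (11.2): which kinds of update may be applied in place, and from which source releases a rolling upgrade is possible. The function names and the string labels are hypothetical and exist only for this example.

```python
# Kinds of update that the text says are supported for in-place upgrades.
IN_PLACE_KINDS = {"patch bundle", "one-off patch"}
# Kinds of update that the text says are supported out of place only.
OUT_OF_PLACE_ONLY_KINDS = {"patch set", "major point release"}


def supports_in_place(kind):
    """Return True if this kind of update can be applied in the same Grid home."""
    key = kind.strip().lower()
    if key in IN_PLACE_KINDS:
        return True
    if key in OUT_OF_PLACE_ONLY_KINDS:
        return False
    raise ValueError(f"unknown update kind: {kind!r}")


def supports_rolling_upgrade(component, source_release):
    """Return True if a rolling upgrade to 11.2 is described for this source release."""
    parts = [int(p) for p in source_release.strip().split(".")]
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else 0
    if component == "clusterware":
        # Text: rolling upgrade from Oracle Clusterware 10g and 11g.
        return major in (10, 11)
    if component == "asm":
        # Text: rolling upgrade of Oracle ASM only from release 11.1.
        return (major, minor) == (11, 1)
    raise ValueError(f"unknown component: {component!r}")


if __name__ == "__main__":
    print(supports_in_place("patch set"))
    print(supports_rolling_upgrade("asm", "10.2"))
```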

Overview of Managing Oracle Clusterware Environments

The following list describes the tools and utilities for managing your Oracle Clusterware environment:

  • Oracle Enterprise Manager: Oracle Enterprise Manager has both the Database Control and Grid Control GUI interfaces for managing both single instance and Oracle RAC database environments. It also has GUI interfaces to manage Oracle Clusterware and all components configured in the Oracle Grid Infrastructure installation. Oracle recommends that you use Oracle Enterprise Manager to perform administrative tasks.

    See Also:

    Oracle Database 2 Day + Real Application Clusters Guide, Oracle Real Application Clusters Administration and Deployment Guide, and Oracle Enterprise Manager online documentation for more information about administering Oracle Clusterware with Oracle Enterprise Manager
  • Cluster Verification Utility (CVU): CVU is a command-line utility that you use to verify a range of cluster and Oracle RAC specific components. Use CVU to verify shared storage devices, networking configurations, system requirements, Oracle Clusterware, and operating system groups and users.

    Install and use CVU for both preinstallation and postinstallation checks of your cluster environment. CVU is especially useful during preinstallation and during installation of Oracle Clusterware and Oracle RAC components to ensure that your configuration meets the minimum installation requirements. Also use CVU to verify your configuration after completing administrative tasks, such as node additions and node deletions.

    See Also:

    Your platform-specific Oracle Clusterware and Oracle RAC installation guide for information about how to manually install CVU, and Appendix A, "Cluster Verification Utility Reference" for more information about using CVU
  • Server Control (SRVCTL): SRVCTL is a command-line interface that you can use to manage Oracle resources, such as databases, services, or listeners in the cluster.

    Note:

    You can only manage server pools that have names prefixed with ora.* by using SRVCTL.

    See Also:

    Server Control Utility reference appendix in the Oracle Real Application Clusters Administration and Deployment Guide
  • Oracle Clusterware Control (CRSCTL): CRSCTL is a command-line tool that you can use to manage Oracle Clusterware. Use CRSCTL for general clusterware management and for management of individual resources.

    Oracle Clusterware 11g release 2 (11.2) introduces cluster-aware commands with which you can perform operations from any node in the cluster on another node in the cluster, or on all nodes in the cluster, depending on the operation.

    You can use crsctl commands to monitor cluster resources (crsctl status resource) and to monitor and manage servers and server pools other than server pools that have names prefixed with ora.*, such as crsctl status server, crsctl status serverpool, crsctl modify serverpool, and crsctl relocate server. You can also manage Oracle High Availability Services on the entire cluster (crsctl start | stop | enable | disable | config crs), using the optional node-specific arguments -n or -all. You can also use CRSCTL to manage Oracle Clusterware on individual nodes (crsctl start | stop | enable | disable | config crs).

  • Oracle Interface Configuration Tool (OIFCFG): OIFCFG is a command-line tool for both single-instance Oracle databases and Oracle RAC environments. Use OIFCFG to allocate and deallocate network interfaces to components. You can also use OIFCFG to direct components to use specific network interfaces and to retrieve component configuration information.

  • Oracle Cluster Registry Configuration Tool (OCRCONFIG): OCRCONFIG is a command-line tool for OCR administration. You can also use the OCRCHECK and OCRDUMP utilities to troubleshoot configuration problems that affect OCR.

    See Also:

    Chapter 2, "Administering Oracle Clusterware" for more information about managing OCR
  • Cluster Health Monitor (CHM): CHM detects and analyzes operating system and cluster resource-related degradation and failures to provide more details to users for many Oracle Clusterware and Oracle RAC issues, such as node eviction. The tool continuously tracks the operating system resource consumption at the node, process, and device levels. It collects and analyzes the clusterwide data. In real-time mode, when thresholds are met, the tool shows an alert to the user. For root-cause analysis, historical data can be replayed to understand what was happening at the time of failure.

    See Also:

    "Cluster Health Monitor" for more information about CHM
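
As a rough illustration of the CRSCTL and SRVCTL division of labor described in the list above, the following sketch builds command lines as argument lists without executing them. It uses only the command forms named in the text (crsctl status resource | server | serverpool and crsctl start | stop | enable | disable | config crs). The helper names and the pool name in the usage lines are hypothetical, and the optional name argument appended to the status commands is an assumption of this example.

```python
# Actions and status targets taken from the command forms quoted in the text.
CRS_ACTIONS = ("start", "stop", "enable", "disable", "config")
STATUS_TARGETS = ("resource", "server", "serverpool")


def crs_command(action):
    """Build the argument list for 'crsctl <action> crs' on the local node."""
    if action not in CRS_ACTIONS:
        raise ValueError(f"unsupported crs action: {action!r}")
    return ["crsctl", action, "crs"]


def status_command(target, name=None):
    """Build the argument list for 'crsctl status <target> [name]'."""
    if target not in STATUS_TARGETS:
        raise ValueError(f"unsupported status target: {target!r}")
    argv = ["crsctl", "status", target]
    if name:
        argv.append(name)  # assumed: an optional object name follows the target
    return argv


def tool_for_server_pool(pool_name):
    """Pick the tool named in the text: SRVCTL for ora.* pools, CRSCTL otherwise."""
    return "srvctl" if pool_name.startswith("ora.") else "crsctl"


if __name__ == "__main__":
    # Print the commands instead of running them; nothing here touches a cluster.
    print(" ".join(crs_command("config")))
    print(" ".join(status_command("serverpool", "mypool")))  # 'mypool' is a made-up name
    print(tool_for_server_pool("ora.mypool"))
```

Building argument lists rather than shell strings keeps the sketch safe to pass to a process launcher later, should you choose to run the commands on a real cluster node.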

Overview of Cloning and Extending Oracle Clusterware in Grid Environments

Cloning nodes is the preferred method of creating new clusters. The cloning process copies Oracle Clusterware software images to other nodes that have similar hardware and software. Use cloning to quickly create several clusters of the same configuration. Before using cloning, you must install an Oracle Clusterware home successfully on at least one node using the instructions in your platform-specific Oracle Clusterware installation guide.

For new installations, or if you must install on only one cluster, Oracle recommends that you use the automated and interactive installation methods, such as Oracle Universal Installer or the Provisioning Pack feature of Oracle Enterprise Manager. These methods perform installation checks to ensure a successful installation. To add Oracle Clusterware to, or delete it from, nodes in the cluster, use the addNode.sh and rootcrs.pl scripts.

Overview of the Oracle Clusterware High Availability Framework and APIs

Oracle Clusterware provides many high availability application programming interfaces called CLSCRS APIs that you use to enable Oracle Clusterware to manage applications or processes that run in a cluster. The CLSCRS APIs enable you to provide high availability for all of your applications.

See Also:

Appendix F, "Oracle Clusterware C Application Program Interfaces" for more detailed information about the CLSCRS APIs

You can define a VIP address for an application to enable users to access the application independently of the node in the cluster on which the application is running. This is referred to as the application VIP. You can define multiple application VIPs, with generally one application VIP defined for each application running. The application VIP is related to the application by making it dependent on the application resource defined by Oracle Clusterware.

To maintain high availability, Oracle Clusterware components can respond to status changes to restart applications and processes according to defined high availability rules. You can use the Oracle Clusterware high availability framework by registering your applications with Oracle Clusterware and configuring the clusterware to start, stop, or relocate your application processes. That is, you can make custom applications highly available by using Oracle Clusterware to create profiles that monitor, relocate, and restart your applications.
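
The monitor, restart, and relocate behavior described above can be pictured with a toy decision function. This is a conceptual sketch only and not Oracle Clusterware's actual algorithm: the parameter names, the restart limit, and the returned labels (including "offline") are all assumptions chosen for illustration.

```python
def next_action(check_ok, restarts_so_far, max_restarts, other_nodes_available):
    """Decide what a hypothetical high availability rule does after one health check.

    check_ok              -- result of the latest application health check
    restarts_so_far       -- restarts already attempted on the current node
    max_restarts          -- assumed limit on local restarts before relocating
    other_nodes_available -- whether another cluster node can host the application
    """
    if check_ok:
        return "none"        # application is healthy, nothing to do
    if restarts_so_far < max_restarts:
        return "restart"     # try to bring it back on the same node
    if other_nodes_available:
        return "relocate"    # move the application to another node
    return "offline"         # assumed outcome when no option remains


if __name__ == "__main__":
    # Walk through three consecutive failed checks with a limit of two restarts.
    for attempt in range(3):
        print(attempt, next_action(False, attempt, 2, True))
```

In this model the application VIP would move together with the application on "relocate", which is what lets users keep one address regardless of the hosting node.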

