11gR2 RAC connection management

对于RAC来说,当Oracle Instance失败的,应用有几种高可用技术来保证可用性,这些技术对客户端来说都是透明的,客户端可能感知不到instance failure。
一.Transparent Application Failover (TAF)
二.Fast Connection Failover (FCF).
当一个节点failure的时候,TAF允许数据库session使用OCI libraries进行fail over到其他存活的一个节点。但是如果应用使用JDBC thin driver则不能使用TAF。
Failover Modes:
Session failover
Select failover
None(default)
1.select mode允许query重新在新的节点执行,并且丢弃已经fetch的行,继续返回结果给客户端。
2.none mode显式的声明不使用TAF
3.TAF不能够恢复任何DML事务,事务将在另一个节点回滚。
Failover Methods:
Basic
Preconnect
1.Basic选项意味着客户端在实例失败后重新建立新的连接,这个method可能导致节点失败时,另一节点的性能下降,因为有很多的session将要重新连接到幸存的节点。
2.客户端产生一个preconnected session作为备份,在实例失败时加速failover。
TAF是client-side feature,我们需要配置它在客户端Tnsnames.ora或者server端通过service来实现
1,客户端实现

RAC中在tnsnames.ora里面典型配置如下:

LEO1 =
  (DESCRIPTION_LIST =
     (LOAD_BALANCE = off)
     (FAILOVER = on)
     (DESCRIPTION =
        (ADDRESS_LIST =
           (LOAD_BALANCE=OFF)
           (FAILOVER=ON)
           (ADDRESS = (PROTOCOL = TCP)(HOST = node1.chinamobile.com)(PORT = 1521))
        )
        (CONNECT_DATA =
           (SERVICE_NAME = leo)
           (INSTANCE_NAME = leo1)
           (FAILOVER_MODE=(TYPE=session)(METHOD=basic)(RETRIES=4)(DELAY=1))
         )
      )
     (DESCRIPTION =
        (ADDRESS_LIST =
           (LOAD_BALANCE=OFF)
           (FAILOVER=ON)
           (ADDRESS = (PROTOCOL = TCP)(HOST = node2.chinamobile.com)(PORT = 1521))
         )
        (CONNECT_DATA =
           (SERVICE_NAME = leo)
           (INSTANCE_NAME = leo2)
           (FAILOVER_MODE=(TYPE=session)(METHOD=basic)(RETRIES=4)(DELAY=1))
         )
       )
  )
2,server端实现
srvctl add service -d leo -s cmcc -q TRUE -P BASIC -e SESSION -z 4 -w 5  具体使用参数如下:
Usage: srvctl add service -d <db_unique_name> -s <service_name> {-r "<preferred_list>" [-a "<available_list>"] [-P {BASIC | NONE | PRECONNECT}] 
| -g <pool_name> [-c {UNIFORM | SINGLETON}] } [-k   <net_num>] [-l [PRIMARY][,PHYSICAL_STANDBY][,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]] 
[-y {AUTOMATIC | MANUAL}] [-q {TRUE|FALSE}] [-x {TRUE|FALSE}] [-j {SHORT|LONG}] [-B {NONE|SERVICE_TIME|THROUGHPUT}] 
[-e {NONE|SESSION|SELECT}] [-m {NONE|BASIC}] [-z <failover_retries>] [-w <failover_delay>] [-t <edition>] [-f]
    -d <db_unique_name>      Unique name for the database
    -s <service>             Service name
    -r "<preferred_list>"    Comma separated list of preferred instances
    -a "<available_list>"    Comma separated list of available instances
    -g <pool_name>           Server pool name
    -c {UNIFORM | SINGLETON} Service runs on every active server in the server pool hosting this service (UNIFORM) or just one server (SINGLETON)
    -k <net_num>             network number (default number is 1)
    -P {NONE | BASIC | PRECONNECT}        TAF policy specification
    -l <role>                Role of the service (primary, physical_standby, logical_standby, snapshot_standby)
    -y <policy>              Management policy for the service (AUTOMATIC or MANUAL)
    -e <Failover type>       Failover type (NONE, SESSION, or SELECT)
    -m <Failover method>     Failover method (NONE or BASIC)
    -w <integer>             Failover delay
    -z <integer>             Failover retries
    -t <edition>             Edition (or "" for empty edition value)
    -j <clb_goal>  Connection Load Balancing Goal (SHORT or LONG). Default is LONG.
    -B <Runtime Load Balancing Goal>     Runtime Load Balancing Goal (SERVICE_TIME, THROUGHPUT, or NONE)
    -x <Distributed Transaction Processing>  Distributed Transaction Processing (TRUE or FALSE)
    -q <AQ HA notifications> AQ HA notifications (TRUE or FALSE)
Usage: srvctl add service -d <db_unique_name> -s <service_name> -u {-r "<new_pref_inst>" | -a "<new_avail_inst>"} [-f]
    -d <db_unique_name>      Unique name for the database
    -s <service>             Service name
    -u                       Add a new instance to service configuration
    -r <new_pref_inst>       Name of new preferred instance
    -a <new_avail_inst>      Name of new available instance
    -f                       Force the add operation even though a listener is not configured for a network
    -h                       Print usage
Fast Connection Failover and Fast Application Notification
FAN是一种消息发布机制,在instance up/down发布消息到客户端,来进行群集的重新配置,FCF是通过FAN来透明的进行
instance failure failover。
FCF支持的客户端驱动:
Java Database Connection Driver (JDBC)
Oracle Universal Connection Pool UCP
Clients using the Oracle Call Interface
Oracle Data Providers for .Net
在11R2(11.2.0.1)中当设置了db_domain参数,FCF将会失效,bug:8779597

下面看看11gR2的情况:
11gR2出现了scan,相应的listener配置也有所变化:
$GRID_HOME/network/admin目录下:
-rw-r--r-- 1 grid oinstall 695 May 1 16:43 listener.ora
-rw-r--r-- 1 grid oinstall 369 May 1 16:43 endpoints_listener.ora

[grid@node2 admin]$ cat listener.ora
LISTENER2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER2))))          # line added by Agent
# listener.ora.node2 Network Configuration File: /u01/app/11.2.0.3/grid/network/admin/listener.ora.node2
# Generated by Oracle configuration tools.

ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1 = ON
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER1 = ON
LISTENER =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER))
    )
  )
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER = ON
LISTENER_SCAN1 =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = IPC)(KEY = LISTENER_SCAN1))
    )
  )
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER2=ON             # line added by Agent

[grid@node2 admin]$ cat endpoints_listener.ora
LISTENER2_NODE2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node2-vip)(PORT=1522))
(ADDRESS=(PROTOCOL=TCP)(HOST=10.180.69.103)(PORT=1522)(IP=FIRST))))              # line added by Agent
LISTENER_NODE2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=node2-vip)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=10.180.69.103)(PORT=1521)(IP=FIRST))))               # line added by Agent
查看scan ip的配置和状态:
[grid@node2 admin]$ srvctl config scan
SCAN name: scanip, Network: 1/10.180.69.0/255.255.255.0/eth0
SCAN VIP name: scan1, IP: /scanip/10.180.69.105
[grid@node2 admin]$ srvctl status scan
SCAN VIP scan1 is enabled
SCAN VIP scan1 is running on node node2
[grid@node2 admin]$ srvctl status scan_listener
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node node2
验证实例注册:
grid@node2 admin]$ lsnrctl status listener_scan1
LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 09-JUN-2012 15:04:39
Copyright (c) 1991, 2011, Oracle.  All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_SCAN1
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                07-JUN-2012 19:30:34
Uptime                    1 days 19 hr. 34 min. 5 sec
Trace Level               support
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/11.2.0.3/grid/network/admin/listener.ora
Listener Log File         /u01/app/11.2.0.3/grid/log/diag/tnslsnr/node2/listener_scan1/alert/log.xml
Listener Trace File       /u01/app/11.2.0.3/grid/log/diag/tnslsnr/node2/listener_scan1/trace/ora_6446_3050497728.trc
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=10.180.69.105)(PORT=1521)))
Services Summary...
Service "leo" has 2 instance(s).
  Instance "leo1", status READY, has 2 handler(s) for this service...
  Instance "leo2", status READY, has 2 handler(s) for this service...
Service "leoXDB" has 1 instance(s).
  Instance "leo1", status READY, has 1 handler(s) for this service...
  Instance "leo2", status READY, has 1 handler(s) for this service...
The command completed successfully
监听listener_scan1运行在node2,两个实例是被注册的,群集内的每个节点实例通过remote listener来注册自身。
SQL> show parameter list
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
local_listener                       string      (DESCRIPTION=(ADDRESS_LIST=(AD
                                                 DRESS=(PROTOCOL=TCP)(HOST=10.1
                                                 80.69.104)(PORT=1521))(ADDRESS
                                                 =(PROTOCOL=TCP)(HOST=10.180.69
                                                 .104)(PORT=1522))))
remote_listener                      string      scanip:1521
这里因为我添加另外一个本地监听,所以local_listener有两个。
11gR2中默认配置客户端通过如下:
LEO = 
(DESCRIPTION = 
    (ADDRESS = (PROTOCOL = TCP)(HOST = scanip)(PORT = 1521)) 
  (CONNECT_DATA = 
     (SERVER = DEDICATED) 
     (SERVICE_NAME = leo) 
  ) 
 )
现在我们来看看11gR2的TAF,之前提到TAF可以在客户端的tnsnames.ora里配置,也可以在server端通过添加service来配置。
那么客户端的tnsnames.ora来做如下配置:
LEO_TAF = 
 (DESCRIPTION = 
   (ADDRESS = (PROTOCOL = TCP)(HOST = scanip)(PORT = 1521)) 
 (CONNECT_DATA = 
   (SERVER = DEDICATED) 
   (SERVICE_NAME = leo) 
     (FAILOVER_MODE= 
       (TYPE=session) 
       (METHOD=basic))))      
可以通过如下查询来验证客户端TAF:
SELECT MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER, COUNT(*)
FROM V$SESSION
GROUP BY MACHINE, FAILOVER_TYPE, FAILOVER_METHOD, FAILED_OVER;

MACHINE               FAILOVER_TYPE FAILOVER_M FAI   COUNT(*)
--------------------  ------------- ---------- --- ----------
Thinkpad-leo           SESSION          BASIC       NO     5
如果是Server端的,来看看配置的过程:
1.create service(Use Oracel Account)
 ./srvctl add service -d leo -s taf -q TRUE -r leo2 -P BASIC -e SESSION -z 4 -w 5
2.start service
 srvctl start service -d leo -s taf
3.check service
 srvctl config service -d leo
 
Service name: taf
Service is enabled
Server pool: leo_taf
Cardinality: 1
Disconnect: false
Service role: PRIMARY
Management policy: AUTOMATIC
DTP transaction: false
AQ HA notifications: true
Failover type: SESSION
Failover method: BASIC
TAF failover retries: 4
TAF failover delay: 5
Connection Load Balancing Goal: LONG
Runtime Load Balancing Goal: NONE
TAF policy specification: BASIC
Edition: 
Preferred instances: leo2
Available instances: leo1

srvctl status service -d leo
Service taf is running on instance(s) leo2
4.查看信息
col failover_method format a11 heading 'METHOD' 
col failover_type format a10 heading 'TYPE' 
col failover_retries format 9999999 heading 'RETRIES' 
col goal format a10 
col clb_goal format a8 
col AQ_HA_NOTIFICATIONS format a5 heading 'AQNOT' 
select name, failover_method, failover_type, failover_retries,goal, clb_goal,aq_ha_notifications  
from dba_services where service_id = 3; 

NAME            METHOD      TYPE       RETRIES  GOAL       CLB_GOAL AQNOT 
--------------- ----------- ---------- -------- ---------- -------- ----- 
taf             BASIC       SESSION    4                   LONG      YES
5.查看监听
LSNRCTL> services
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
Services Summary...
Service "leo" has 1 instance(s).
  Instance "leo2", status READY, has 2 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.180.69.104)(PORT=1522)))
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.180.69.104)(PORT=1521)))
Service "leoXDB" has 1 instance(s).
  Instance "leo2", status READY, has 1 handler(s) for this service...
    Handler(s):
      "D000" established:0 refused:0 current:0 max:1022 state:ready
         DISPATCHER <machine: node2, pid: 10911>
         (ADDRESS=(PROTOCOL=tcp)(HOST=node2)(PORT=18689))
Service "taf" has 1 instance(s).
  Instance "leo2", status READY, has 2 handler(s) for this service...
    Handler(s):
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.180.69.104)(PORT=1522)))
      "DEDICATED" established:0 refused:0 state:ready
         REMOTE SERVER
         (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=10.180.69.104)(PORT=1521)))
The command completed successfully
Load Balance:
1.Client Side Load Balance
当客户端通过SCAN连接11gR2 RAC,load balance发生在连接阶段(可以通过sqlnet的trace来确认),Scan name被解析成最多三个不同的IP,
客户端随机的选择一个IP去连接。
对于非11gR2 client,是不能使用SCAN的好处,因为它不能处理DNS解析的SCAN ip,所以,在address_list里,它将
尝试用第一个address去连接,不会使用其他的地址,那么如果这个ip地址不能被scan listener监听或者ip本事存在问题,连接就会失败。
使用静态的SCAN IP(未使用GNS)的tnsnames.ora的条目如下:
LEO_LB  =
    (DESCRIPTION = 
     (ADDRESS_LIST = 
      (LOAD_BALANCE=ON)
      (FAILOVER=ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP1>)(PORT = 1521)) 
      (ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP2>)(PORT = 1521)) 
      (ADDRESS = (PROTOCOL = TCP)(HOST = <scan-VIP3>)(PORT = 1521)) 
     )
    (CONNECT_DATA =
     (SERVER = DEDICATED)
     (SERVICE_NAME = leo)
     (FAILOVER_MODE=
      (TYPE=session)
      (METHOD=basic))))

10g对方式:

LEO =
(DESCRIPTION =
(LOAD_BALANCE = yes)
(ADDRESS = (PROTOCOL = TCP)(HOST = node1)
(PORT = 1522))
(ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)
(PORT = 1522))
(CONNECT_DATA =
(SERVICE_NAME = LEO)))
--JDBC--
url="jdbc:oracle:thin:@(DESCRIPTION=(LOAD_BALANCE=on)
(ADDRESS=(PROTOCOL=TCP)(HOST=node1-vip)(PORT=1521))
(ADDRESS=(PROTOCOL=TCP)(HOST=node2-vip)(PORT=1521))
(CONNECT_DATA=(SERVICE_NAME=LEO)))"


关于rac连接管理的详细文档,崔华写过一篇很详细,可以参阅:

http://www.oracle.com/technetwork/cn/articles/database-performance/oracle-rac-connection-mgmt-1650424-zhs.html
http://www.oracle.com/technetwork/database/clustering/overview/scan-129069.pdf

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值