一、什么是 LoadBalance
LoadBalance 就是把负载平均的分配到集群中的各个节点,从而提高整体的吞吐能力。Oracle RAC 提供了两种不同的方法来分散负载。
一种是通过Connection Balancing,按照某种算法把用户分配到不同的节点。也可认为是纯技术的分散负载。
另一种是通过Service,在应用层上进行分散,也可认为是根据业务的分散负载。
1.1、Connection Balancing:
Connection Balancing 这种负载均衡是在用户连接这个层次进行的,也就是在用户请求建立连接时,根据每个节点的负载决定把连接分配给哪个实例,而一旦连接建立之后,会话的所有操作就都在这个实例上完成,而不会再分派给其他节点了。
Connection Balancing 有客户端和服务端两种实现方法
1.1.1、客户端均衡(Client-Side LB)
主机IP地址是VIP,目的是为了使用VIP的特性:漂移,是Oracle 8i以后 使用的方法,配置方法是在客户端的tnsnames.ora 文件中加入:LOAD_BALANCE=YES 条目。如下:
在节点1的tnsname.ora中添加LOAD_BALANCE = on,而节点2不添加,看看能否实现负载均衡?
[root@or11g2 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
#public IP
192.168.127.87 or11g1 or11g1.oracle.com
192.168.127.89 or11g2 or11g2.oracle.com
#vip
192.168.127.88 or11g1-vip or11g1-vip.oracle.com
192.168.127.90 or11g2-vip or11g2-vip.oracle.com
OR11G1 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.88)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.90)(PORT = 1521))
(LOAD_BALANCE = on)
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = or11g)
)
)
再看看节点2上没有添加:(LOAD_BALANCE = on)
OR11G2 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.88)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.90)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = or11g)
)
)
在节点1和节点2上均以scott用户通过rac1连接字符串进行连接,查看个节点连接情况
节点1开了多个串口进行如下连接:
[oracle@or11g1 admin]$ sqlplus scott/oracle@or11g1
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 09:44:01 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL>
节点1查看数据库scott用户连接如下:
[oracle@or11g1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 09:45:39 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> set linesize 180
SQL> select INST_ID,username,sid,serial# from gv$session where username='SCOTT';
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 134 23
SQL>
这是开了一个窗口,使用scott登陆的。再开几个窗口(一共开了四个窗口),同样使用scott登陆
[oracle@or11g1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 09:45:39 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> set linesize 180
SQL> select INST_ID,username,sid,serial# from gv$session where username='SCOTT';
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 134 23
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 134 23
2 SCOTT 14 17
2 SCOTT 63 25
SQL>
开了四个窗口后,最后查询结果发现连接被均衡的分摊到两个实例上了。
接下来,在节点2上测试
[oracle@or11g2 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 09:53:02 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> set linesize 180
SQL> select INST_ID,username,sid,serial# from gv$session where username='SCOTT';
no rows selected
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 76 579
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 76 579
1 SCOTT 134 31
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 76 579
1 SCOTT 134 31
1 SCOTT 137 27
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 76 579
1 SCOTT 134 31
1 SCOTT 137 27
1 SCOTT 200 3
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 15 17
1 SCOTT 76 579
1 SCOTT 134 31
1 SCOTT 137 27
1 SCOTT 200 3
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 15 17
1 SCOTT 76 579
1 SCOTT 77 25
1 SCOTT 134 31
1 SCOTT 137 27
1 SCOTT 200 3
6 rows selected.
SQL>
这是在节点2上开了6六个窗口,分别查看连接情况,结果显示全部在第一个实例上。
接下来,我们在对节点二的tnsname.ora文件添加LOAD_BALANCE = on参数,再来查看连接情况
OR11G2 =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.88)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.90)(PORT = 1521))
(LOAD_BALANCE = on)
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = or11g)
)
)
将节点二中六个scott用户的连接断开,然后再重新登录
1 SCOTT 137 27
1 SCOTT 200 3
6 rows selected.
SQL> select INST_ID,username,sid,serial# from gv$session where username='SCOTT';
no rows selected
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
1 SCOTT 133 85
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
2 SCOTT 202 513
1 SCOTT 133 85
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
2 SCOTT 202 513
1 SCOTT 76 581
1 SCOTT 133 85
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
2 SCOTT 202 513
2 SCOTT 203 7
1 SCOTT 76 581
1 SCOTT 133 85
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
2 SCOTT 63 45
2 SCOTT 202 513
2 SCOTT 203 7
1 SCOTT 76 581
1 SCOTT 133 85
6 rows selected.
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
2 SCOTT 63 45
2 SCOTT 134 71
2 SCOTT 202 513
2 SCOTT 203 7
1 SCOTT 76 581
1 SCOTT 133 85
7 rows selected.
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 25
2 SCOTT 63 45
2 SCOTT 134 71
2 SCOTT 202 513
2 SCOTT 203 7
1 SCOTT 76 581
1 SCOTT 133 85
1 SCOTT 200 5
8 rows selected.
SQL>
从图中可以看出,连接被分摊到两个实例上了,但是并不是简单的1:1均衡分摊,应该是根据某种算法实现的。
**************************************************************************************************************************************************************************************************************************************************************************************************
1.1.2、服务器端均衡(Server-Side LB)
Server-Side LB 是从Oracle 9i引入的。 它的实现依赖于Listener收集负载信息。 在数据库运行过程中,PMON后台进程会收集系统的负载信息,然后登记到Listener中;最少1分钟,最多10分钟PMON就要做一个信息更新,并且如果节点的负载越高,更新频率就越高,以保证Listener能掌握每个节点准确的负载情况。
如果Listener关闭了,PMON进程会每隔1秒钟检查Listener是否重启。除了这个自动的,定时的更新任务外,用户也可以使用alter system register 命令来手工进行这个过程。
注意,实例启动时PMON进程进行的第一次登记过程叫作Server-register,而后的更新过程叫作service-update。PMON 进程不仅会向本地的Listener注册,还可以向其他节点上的Listener注册。但到底要向何处注册,是由Remote_Listeners 和Local_Listener 两个参数决定;Local_Listener 不用设置,而Remote_Listener 需要设置,参数值是一个tnsnames项。
查看监听注册时间记录:
[grid@or11g1 trace]$ pwd
/u01/app/grid/diag/tnslsnr/or11g1/listener/trace
[grid@or11g1 trace]$ cat listener.log |grep service_register
1.1.2.1、Service-Side loadBalance如何实现负载均衡?
根据PMON进程收集系统负载注册到监听中去(监听就是用来连接用的,当有连接请求时,首先会搜索监听,当发现监听中记录的节点压力比较大时,就去连其他节点)。但是如果PMON进程只是收集本地节点的负载信息并注册到本地,显然是不能实现负载均衡的目的。所以PMON不仅仅是向本地监听注册,同样也会向其他节点的listener注册。
有了PMON的自动注册机制后,集群的每个节点的Listener都掌握所有节点的负载情况,当收到客户端连接请求时,就会把连接转给负载最小的节点,这个节点有可能是自己也有可能是其他节点,也就是Listener 会转发用户的请求。
1.1.2.2、两种LB 的配置方法
1.1.2.2.1、对于Client-Side LB,需要在客户的tnsnames条目中加入LOAD_BALANCE=YES
1.1.2.2.2、对于Server-side LB,需要配置REMOTE_LISTENER这个参数
1.1.2.2.3、client-Side LB和Server-side LB并不是互斥的,可以一起使用。
首先是在客户端通过client-Side LB选择合适的节点然后再由Server-Side LB决定在哪个节点。
1.1.2.2.3.1、配置Service-side LB
在10g里面配置:在Oracle里面配置Service-side LB还需要从各个节点实例的listener.ora文件中删除缺省产生的SID_LIST_LISTENER_NodeName条目,这样才能保证Listener获得的信息是动态注册的,而不是从文件中读取的静态信息。
配置示例:
1.1.2.2.3.2、首先在各节点(也就是服务器端)的tnsname.ora中加入以下内容
SQL> show parameter name
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cell_offloadgroup_name string
db_file_name_convert string
db_name string or11g
db_unique_name string or11g
global_names boolean FALSE
instance_name string or11g2
lock_name_space string
log_file_name_convert string
processor_group_name string
service_names string or11g
SQL>
查看tnsname.ora文件
OR11G_LB =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.88)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.90)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = or11g)
)
)
OR11G-LB:是连接字符串,同样也是remote_listener参数需要指定的对象
SERVICE_NAME:是节点的SERVICE_NAME,不要写成了Service-Side TAF,因为现在是在测试负载均衡而不是故障转移(业务隔离)
1.1.2.2.3.3、然后再来修改节点1的remote_listener参数的值
[oracle@or11g1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 10:42:34 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL>
SQL> show parameter remote_listener
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_listener string scan.oracle.com:1521
SQL>
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
or11g1
SQL>
SQL> alter system set remote_listener='or11g_lb' sid='or11g1';
System altered.
SQL>
节点2的配置如下:
配置tnsname.ora文件
OR11G_LB =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.88)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.127.90)(PORT = 1521))
)
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = or11g)
)
)
[oracle@or11g2 admin]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 11:03:09 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> show parameter remote_listener
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_listener string scan.oracle.com:1521
SQL>
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
or11g2
SQL> alter system set remote_listener='or11g_lb' sid='or11g2';
System altered.
SQL> show parameter remote_listener
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
remote_listener string or11g_lb
SQL>
1.1.2.2.3.4、测试负载均衡状态
按照如下方式连接:
[oracle@or11g1 ~]$ sqlplus scott/oracle@or11g_lb
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 10:52:14 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL>
节点1上以dba身份登录,执行如下语句查看结果:
select INST_ID,username,sid,serial# from gv$session where username='SCOTT';
[oracle@or11g1 ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.4.0 Production on Fri Apr 12 10:54:59 2019
Copyright (c) 1982, 2013, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
SQL> set linesize 120
SQL> select INST_ID,username,sid,serial# from gv$session where username='SCOTT';
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
1 SCOTT 143 47
SQL>
再打开节点1的窗口并以scott用户登录
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 29
2 SCOTT 134 127
2 SCOTT 203 17
1 SCOTT 11 25
1 SCOTT 143 49
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 29
2 SCOTT 134 127
2 SCOTT 203 17
1 SCOTT 11 25
1 SCOTT 143 49
1 SCOTT 144 73
6 rows selected.
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 29
2 SCOTT 134 127
2 SCOTT 203 17
2 SCOTT 204 15
1 SCOTT 11 25
1 SCOTT 143 49
1 SCOTT 144 73
7 rows selected.
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 29
2 SCOTT 134 127
2 SCOTT 203 17
2 SCOTT 204 15
1 SCOTT 11 25
1 SCOTT 143 49
1 SCOTT 144 73
1 SCOTT 201 31
8 rows selected.
SQL> /
INST_ID USERNAME SID SERIAL#
---------- ------------------------------ ---------- ----------
2 SCOTT 14 29
2 SCOTT 134 127
2 SCOTT 139 91
2 SCOTT 203 17
2 SCOTT 204 15
1 SCOTT 11 25
1 SCOTT 143 49
1 SCOTT 144 73
1 SCOTT 201 31
9 rows selected.
SQL>
可以看出,负载被均衡的分摊到两个节点了!
**************************************************************************************************************************************************************************************************************************************************************************************************
1.2、利用Service分散负载
要提高RAC的性能,可以从两方面入手:
一种是提高Cache Fusion的能力,这个可以使用更好的互联设备,比如G级的private network,或者使用Infiniband等DRA技术。
另一种是可以尽量减少Cache Fusion的流量,减少实例间的互相依赖。而Service就是后一种思路基础上发展出来的。
所以可以根据业务来划分,把不同业务分别放到不同的节点上。这样Cache Fusion的工作量就会急剧较少,就能从根本上解决了性能问题。
要实现这种方法,就是使用Service-Side TAF,利用RAC的Failover 功能来实现,因为Failover 可以指定优先连接哪个节点,这样创建2个不同的service,然后不同的业务连接不同的service就可以了。(即:同一业务的表,不要放在不通的节点上,这样就减少了cache Fusion)。