该项目是中国联通xxxx话务系统,我的架构设计+需求设计,+运维保障+数据库开发,全套服务。
在今天开发完毕后,突然有个模块的需求,用户号码为必须选择,感觉有点郁闷,因为1小时有1000w数据,把所有用户号码显示出来,是不是有点画蛇添足呢。
我的开始设想是查询详单,像中移营业厅,需要输入号码,或者省份证查询模糊查询,没有谓词不能查询。(感觉设计合情合理)
1.但是想了解整个系统用户分布情况,必须输入条件,是不是有点不可用。
2.并且没有谓词过滤,查询会慢,非常慢(1-2分钟出结果),目标是3-5秒内出数据。
注:优化难点是把2秒变成1秒, 反之,把2小时变成2分钟非常简单。
第1步:
下面看看语句和执行计划:
SQL> explain plan for SELECT /*+ parallel(8) */
2 starttime starttime,
3 cv.groupid,
4 cs.custmangerid,
5 callercarrier callercarrier,
6 callernum callernum,
7 calledcarrier calledcarrier,
8 callednum callednum,
9 calleenum calleenum,
10 round(duration / 60, 2) CallTimeLen,
11 count(*) over(ORDER BY NULL ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) "@totalrows"
12 FROM CS_xxxx dt, cfg_vipphones cv, cfg_vipusers cs
13 WHERE dt.StartTime >= '2013-05-31 13:00:00'
14 and dt.StartTime < '2013-05-31 14:00:00'
15 AND dt.Callercarrier = 2
16 AND dt.callernum >= cv.beginphone
17 and dt.callernum <= cv.endphone
18 and cv.groupid = cs.groupid;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2172492340
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 478K| 34
|* 1 | PX COORDINATOR | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 478K| 34
| 3 | WINDOW BUFFER | | 478K| 34
|* 4 | FILTER | | |
| 5 | MERGE JOIN | | 478K| 34
| 6 | SORT JOIN | | 11 | 363
| 7 | BUFFER SORT | | |
| 8 | PX RECEIVE | | |
| 9 | PX SEND BROADCAST | :TQ10000 | |
| 10 | NESTED LOOPS | | |
| 11 | NESTED LOOPS | | 11 | 363
| 12 | TABLE ACCESS BY INDEX ROWID| CFG_VIPUSERS | 3 | 18
| 13 | INDEX FULL SCAN | PK_CFG_VIPUSERS | 3 |
|* 14 | INDEX RANGE SCAN | VIPUSERS_FK | 4 |
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
| 15 | TABLE ACCESS BY INDEX ROWID | CFG_VIPPHONES | 4 | 108
|* 16 | FILTER | | |
|* 17 | SORT JOIN | | 516K| 21
| 18 | PX BLOCK ITERATOR | | 516K| 21
|* 19 | TABLE ACCESS FULL | CS_xxxx | 516K| 21
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_DATE('2013-05-31 13:00:00')<TO_DATE('2013-05-31 14:00:00'))
4 - filter(TO_DATE('2013-05-31 13:00:00')<TO_DATE('2013-05-31 14:00:00'))
14 - access("CV"."GROUPID"="CS"."GROUPID")
16 - filter("DT"."CALLERNUM"<="CV"."ENDPHONE")
17 - access("DT"."CALLERNUM">="CV"."BEGINPHONE")
filter("DT"."CALLERNUM">="CV"."BEGINPHONE")
19 - filter("DT"."CALLERCARRIER"=2 AND "DT"."STARTTIME">='2013-05-31 13:00:00'
Note
-----
- Degree of Parallelism is 8 because of hint
41 rows selected
SQL>
大家看出问题了吗,说实话,执行计划只是一个参考,看看index是否生效,是不是全表scan,nloop,hash,是不是可以增加use_nl, 等hint
OLAP和OLTP 又有很大区别了,包含数据库参数设定,sql写法,hint是否启用等
第2步:
我怀疑是3张表关联,谓词出了问题。
注意看filter,看看是否是分区表搞的鬼。 查看后一切正常,因为是我写的,我最清楚。哈哈。。。
在多表关联时,如果有视图,可以考虑视图的合并,关联的优先选择,再hash。 都试过了,不行。
第3 步:
怀疑是并行出错了,看看表的并且度,索引并行,
或者我不要并行试试。果然,8-10秒出结果
SQL> explain plan for SELECT
2 starttime starttime,
3 cv.groupid,
4 cs.custmangerid,
5 callercarrier callercarrier,
6 callernum callernum,
7 calledcarrier calledcarrier,
8 callednum callednum,
9 calleenum calleenum,
10 round(duration / 60, 2) CallTimeLen,
11 count(*) over(ORDER BY NULL ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) "@totalrows"
12 FROM CS——xx dt, cxg_vippxxx cv, cxg_vipxxx cs
13 WHERE dt.StartTime >= '2013-05-31 13:00:00'
14 and dt.StartTime < '2013-05-31 14:00:00'
15 AND dt.Callercarrier = 2
16 AND dt.callernum >= cv.beginphone
17 and dt.callernum <= cv.endphone
18 and cv.groupid = cs.groupid;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1705527799
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |Tem
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 478K| 34M|
| 1 | WINDOW BUFFER | | 478K| 34M|
|* 2 | FILTER | | | |
| 3 | MERGE JOIN | | 478K| 34M|
| 4 | SORT JOIN | | 11 | 363 |
| 5 | NESTED LOOPS | | | |
| 6 | NESTED LOOPS | | 11 | 363 |
| 7 | TABLE ACCESS BY INDEX ROWID| CFGxxUSERS | 3 | 18 |
| 8 | INDEX FULL SCAN | PK_CFG_VIPUSERS | 3 | |
|* 9 | INDEX RANGE SCAN | VIPUSERS_FK | 4 | |
| 10 | TABLE ACCESS BY INDEX ROWID | CFG_xxNES | 4 | 108 |
|* 11 | FILTER | | | |
|* 12 | SORT JOIN | | 516K| 21M|
| 13 | PARTITION RANGE ITERATOR | | 516K| 21M|
|* 14 | TABLE ACCESS FULL | CS_xxx | 516K| 21M|
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(TO_DATE('2013-05-31 13:00:00')<TO_DATE('2013-05-31 14:00:00'))
9 - access("CV"."GROUPID"="CS"."GROUPID")
11 - filter("DT"."CALLERNUM"<="CV"."ENDPHONE")
12 - access("DT"."CALLERNUM">="CV"."BEGINPHONE")
filter("DT"."CALLERNUM">="CV"."BEGINPHONE")
14 - filter("DT"."CALLERCARRIER"=2 AND "DT"."STARTTIME">='2013-05-31 13:00:00'
14:00:00')
32 rows selected
SQL>
第4步:
看看并行设置,这个也有很大关系,因为并行的模块太多,造成排队拥塞的情况
<1>如果有并行度低于系统最大并行数的查询在跑,那接下来的并行查询会怎么跑呢?
When you specify parallel degree 4 oracle tries to allocate 4 producer slaves and 4 consumer slaves. The producers can feed any of the consumers.
If there are only 2 slaves available then we use these.
If there is only 1 slave available then we go serial
If there are none available then we use serial.
If parallel_min_percent is set then we error ora 12827 instead of using a lower number of slaves or going serial
<2>设定parallel_max_servers 多大为好?
在多CPU的环境中,一般把CPU-1或CPU的数量做个最大并行数,因为并行查询运行时还需要一个进程协调各并行进程.对于单CPU没什么好说的.
<3>并行查询能提高系统的性能吗?
并行查询运行时,很容易会使机器运行在高负荷下,令系统对其它事务的处理时间大大加长.并行查询一般适合在非业务高峰值人工执行,并不适合在程序中指定运行并行查询.
PINNER:
并行不等于快速,仅仅是适合在数据仓库环境,低业务请求与低并发操作的时候
典型的OLTP系统,如果我们的系统,是绝对不允许并行查询出现的。
(引荐哈)
第5步:
问题解决,注意看看问题,paralle的写法,当一个表时,用parallel(8) , 表示当前表并行8个进程
当有多个表是,请指定某一个表,否则会默认3个表,当然执行计划上看不出来,可以trace一把 看看
SQL> explain plan for SELECT /*+ parallel(dt,8) */
2 starttime starttime,
3 cv.groupid,
4 cs.custmangerid,
5 callercarrier callercarrier,
6 callernum callernum,
7 calledcarrier calledcarrier,
8 callednum callednum,
9 calleenum calleenum,
10 round(duration / 60, 2) CallTimeLen,
11 count(*) over(ORDER BY NULL ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) "@totalrows"
12 FROM CS_CDR dt, cfg_vipphones cv, cfg_vipusers cs
13 WHERE dt.StartTime >= '2013-05-31 13:00:00'
14 and dt.StartTime < '2013-05-31 14:00:00'
15 AND dt.Callercarrier = 2
16 AND dt.callernum >= cv.beginphone
17 and dt.callernum <= cv.endphone
18 and cv.groupid = cs.groupid;
Explained
SQL> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 2172492340
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 478K| 34
|* 1 | PX COORDINATOR | | |
| 2 | PX SEND QC (RANDOM) | :TQ10001 | 478K| 34
| 3 | WINDOW BUFFER | | 478K| 34
|* 4 | FILTER | | |
| 5 | MERGE JOIN | | 478K| 34
| 6 | SORT JOIN | | 11 | 363
| 7 | BUFFER SORT | | |
| 8 | PX RECEIVE | | |
| 9 | PX SEND BROADCAST | :TQ10000 | |
| 10 | NESTED LOOPS | | |
| 11 | NESTED LOOPS | | 11 | 363
| 12 | TABLE ACCESS BY INDEX ROWID| CFG_VIPUSERS | 3 | 18
| 13 | INDEX FULL SCAN | PK_CFG_VIPUSERS | 3 |
|* 14 | INDEX RANGE SCAN | VIPUSERS_FK | 4 |
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
| 15 | TABLE ACCESS BY INDEX ROWID | CFG_VIPPHONES | 4 | 108
|* 16 | FILTER | | |
|* 17 | SORT JOIN | | 516K| 21
| 18 | PX BLOCK ITERATOR | | 516K| 21
|* 19 | TABLE ACCESS FULL | CS_xxxx | 516K| 21
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_DATE('2013-05-31 13:00:00')<TO_DATE('2013-05-31 14:00:00'))
4 - filter(TO_DATE('2013-05-31 13:00:00')<TO_DATE('2013-05-31 14:00:00'))
14 - access("CV"."GROUPID"="CS"."GROUPID")
16 - filter("DT"."CALLERNUM"<="CV"."ENDPHONE")
17 - access("DT"."CALLERNUM">="CV"."BEGINPHONE")
filter("DT"."CALLERNUM">="CV"."BEGINPHONE")
19 - filter("DT"."CALLERCARRIER"=2 AND "DT"."STARTTIME">='2013-05-31 13:00:00'
37 rows selected
SQL>
目前是3秒出结果,已经达到预期,当然谓词为1小时,或者有号码过滤绝对是1秒内响应速度。