1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
|
16
:
13
:
07
SCOTT@ test1 >create table emp2
as
select *
from
emp;
Table created.
16
:
13
:
26
SCOTT@ test1 >alter table emp2 parallel
2
;
Table altered.
16
:
13
:
37
SCOTT@ test1 >set autotrace trace
16
:
14
:
20
SCOTT@ test1 >select sum(sal)
from
emp2 group by deptno
Elapsed:
00
:
00
:
00.01
Execution Plan
----------------------------------------------------------
Plan hash value:
3939201228
------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------
|
0
| SELECT STATEMENT | |
14
|
364
|
3
(
34
)|
00
:
00
:
01
| | | |
|
1
| PX COORDINATOR | | | | | | | | |
|
2
| PX SEND QC (RANDOM) | :TQ10001 |
14
|
364
|
3
(
34
)|
00
:
00
:
01
| Q1,
01
| P->S | QC (RAND) |
|
3
| HASH GROUP BY | |
14
|
364
|
3
(
34
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
|
4
| PX RECEIVE | |
14
|
364
|
3
(
34
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
|
5
| PX SEND HASH | :TQ10000 |
14
|
364
|
3
(
34
)|
00
:
00
:
01
| Q1,
00
| P->P | HASH |
|
6
| HASH GROUP BY | |
14
|
364
|
3
(
34
)|
00
:
00
:
01
| Q1,
00
| PCWP | |
|
7
| PX BLOCK ITERATOR | |
14
|
364
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWC | |
|
8
| TABLE ACCESS FULL| EMP2 |
14
|
364
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWP | |
------------------------------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used
for
this
statement (level=
2
)
Statistics
----------------------------------------------------------
12
recursive calls
0
db block gets
6
consistent gets
0
physical reads
0
redo size
471
bytes sent via SQL*Net to client
415
bytes received via SQL*Net
from
client
2
SQL*Net roundtrips to/
from
client
0
sorts (memory)
0
sorts (disk)
3
rows processed
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
|
17
:
07
:
28
SCOTT@ test1 >alter session set events
'10046 trace name context forever,level 12'
;
Session altered.
17
:
07
:
32
SCOTT@ test1 >create table emp3 parallel
4
as
select
/*+ parallel(4) */
*
from
emp;
Table created.
17
:
08
:
13
SCOTT@ test1 >alter session set events
'10046 trace name context off'
;
Session altered.
截取trace文件内容:
SQL ID: 4mq0vusuv4pf9
Plan Hash:
4200853112
create table emp3 parallel
4
as
select
/*+ parallel(4) */
*
from
emp
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse
1
0.00
0.02
0
0
1
0
Execute
1
0.01
0.76
0
5
9
14
Fetch
0
0.00
0.00
0
0
0
0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total
2
0.02
0.78
0
5
10
14
Misses
in
library cache during parse:
1
Optimizer mode: ALL_ROWS
Parsing user id:
72
Rows Row Source Operation
------- ---------------------------------------------------
1
PX COORDINATOR (cr=
3
pr=
0
pw=
0
time=
0
us)
0
PX SEND QC (RANDOM) :TQ10000 (cr=
0
pr=
0
pw=
0
time=
0
us cost=
2
size=
532
card=
14
)
0
LOAD AS SELECT (cr=
0
pr=
0
pw=
0
time=
0
us)
0
PX BLOCK ITERATOR (cr=
0
pr=
0
pw=
0
time=
0
us cost=
2
size=
532
card=
14
)
0
TABLE ACCESS FULL EMP (cr=
0
pr=
0
pw=
0
time=
0
us cost=
2
size=
532
card=
14
)
Elapsed times include waiting
on
following events:
Event waited
on
Times Max. Wait Total Waited
---------------------------------------- Waited ---------- ------------
os thread startup
4
0.01
0.06
PX Deq: Join ACK
3
0.00
0.00
PX Deq: Parse Reply
4
0.07
0.08
PX Deq: Execute Reply
13
0.29
0.40
log file sync
1
0.03
0.03
PX Deq: Signal ACK EXT
4
0.07
0.14
SQL*Net message to client
1
0.00
0.00
SQL*Net message
from
client
1
11.30
11.30
********************************************************************************
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
|
17
:
21
:
53
SYS@ test1 >conn scott/tiger
Connected.
17
:
22
:
03
SCOTT@ test1 >alter session enable parallel dml;
Session altered.
17
:
35
:
21
SCOTT@ test1 >explain plan
for
delete
/*+ parallel(2) */
from
emp3 ;
Explained.
17
:
36
:
07
SCOTT@ test1 >select *
from
table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value:
312603143
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-------------------------------------------------------------------------------------------------------
|
0
| DELETE STATEMENT | |
11
|
2
(
0
)|
00
:
00
:
01
| | | |
|
1
| PX COORDINATOR | | | | | | | |
|
2
| PX SEND QC (RANDOM) | :TQ10000 |
11
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| P->S | QC (RAND) |
|
3
| DELETE | EMP3 | | | | Q1,
00
| PCWP | |
|
4
| PX BLOCK ITERATOR | |
11
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWC | |
|
5
| TABLE ACCESS FULL| EMP3 |
11
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWP | |
-------------------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used
for
this
statement (level=
2
)
- Degree of Parallelism is
2
because of hint
17
rows selected.
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
|
17
:
38
:
48
SCOTT@ test1 >explain plan
for
update
/*+ paralle(2) */
emp3 set sal=
8000
where
empno=
7788
;
Explained.
Elapsed:
00
:
00
:
00.01
17
:
39
:
27
SCOTT@ test1 >select *
from
table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value:
1997329255
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------
|
0
| UPDATE STATEMENT | |
1
|
26
|
2
(
0
)|
00
:
00
:
01
| | | |
|
1
| PX COORDINATOR | | | | | | | | |
|
2
| PX SEND QC (RANDOM) | :TQ10000 |
1
|
26
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| P->S | QC (RAND) |
|
3
| UPDATE | EMP3 | | | | | Q1,
00
| PCWP | |
|
4
| PX BLOCK ITERATOR | |
1
|
26
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWC | |
|*
5
| TABLE ACCESS FULL| EMP3 |
1
|
26
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWP | |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5
- filter(
"EMPNO"
=
7788
)
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Note
-----
- dynamic sampling used
for
this
statement (level=
2
)
21
rows selected.
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
|
17
:
41
:
21
SCOTT@ test1 >select *
from
table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value:
4200853112
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
---------------------------------------------------------------------------------------------------------------
|
0
| INSERT STATEMENT | |
14
|
532
|
2
(
0
)|
00
:
00
:
01
| | | |
|
1
| PX COORDINATOR | | | | | | | | |
|
2
| PX SEND QC (RANDOM) | :TQ10000 |
14
|
532
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| P->S | QC (RAND) |
|
3
| LOAD AS SELECT | EMP3 | | | | | Q1,
00
| PCWP | |
|
4
| PX BLOCK ITERATOR | |
14
|
532
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWC | |
|
5
| TABLE ACCESS FULL| EMP |
14
|
532
|
2
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWP | |
---------------------------------------------------------------------------------------------------------------
Note
-----
- Degree of Parallelism is
2
because of hint
16
rows selected.
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
17
:
43
:
52
SCOTT@ test1 >select *
from
table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
Plan hash value:
883381916
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
|
0
| INSERT STATEMENT | |
14
|
532
|
3
(
0
)|
00
:
00
:
01
| | | |
|
1
| PX COORDINATOR | | | | | | | | |
|
2
| PX SEND QC (RANDOM) | :TQ10001 |
14
|
532
|
3
(
0
)|
00
:
00
:
01
| Q1,
01
| P->S | QC (RAND) |
|
3
| LOAD AS SELECT | EMP3 | | | | | Q1,
01
| PCWP | |
|
4
| PX RECEIVE | |
14
|
532
|
3
(
0
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
|
5
| PX SEND ROUND-ROBIN| :TQ10000 |
14
|
532
|
3
(
0
)|
00
:
00
:
01
| | S->P | RND-ROBIN |
|
6
| TABLE ACCESS FULL | EMP |
14
|
532
|
3
(
0
)|
00
:
00
:
01
| | | |
-----------------------------------------------------------------------------------------------------------------
13
rows selected.
|
-
MANUAL: Disables automatic degree of parallelism, statement queuing, and in-memory parallel execution. This reverts the behavior of parallel execution to what it was prior to Oracle Database 11g Release 2 (11.2). This is the default.
-
LIMITED: Enables automatic degree of parallelism for some statements but statement queuing and in-memory Parallel Execution are disabled. Automatic degree of parallelism is only applied to those statements that access tables or indexes decorated explicitly with the DEFAULT degree of parallelism using the PARALLEL clause. Statements that do not access any tables or indexes decorated with the DEFAULT degree of parallelism will retain the MANUAL behavior.
-
AUTO: Enables automatic degree of parallelism, statement queuing, and in-memory parallel execution.
-
CPU: The maximum degree of parallelism is limited by the number of CPUs in the system. The formula used to calculate the limit is PARALLEL_THREADS_PER_CPU * CPU_COUNT * the number of instances available (by default, all the opened instances on the cluster but can be constrained using PARALLEL_INSTANCE_GROUP or service specification). This is the default.
-
IO: The maximum degree of parallelism the optimizer can use is limited by the I/O capacity of the system. The value is calculated by dividing the total system throughput by the maximum I/O bandwidth per process. You must run the DBMS_RESOURCE_MANAGER.CALIBRATE_IO procedure on the system in order to use the IO setting. This procedure will calculate the total system throughput and the maximum I/O bandwidth per process.
-
integer: A numeric value for this parameter specifies the maximum degree of parallelism the optimizer can choose for a SQL statement when automatic degree of parallelism is active. Automatic degree of parallelism is only enabled if PARALLEL_DEGREE_POLICY is set to AUTO or LIMITED.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
|
SQL> ALTER TABLE customers PARALLEL(DEGREE
4
);
SQL> ALTER TABLE sales NOPARALLEL ;
SQL> explain plan
for
select
/*+ ordered use_hash(t1) */
t.name,sum(t.id)
from
t,t1
where
t.id=t1.object_id group by t.name;
SQL> select *
from
table(dbms_xplan.display)
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value:
2575143521
----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
----------------------------------------------------------------------------------------------------------------------
|
0
| SELECT STATEMENT | |
61
|
4758
|
22
(
10
)|
00
:
00
:
01
| | | |
|
1
| PX COORDINATOR | | | | | | | | |
|
2
| PX SEND QC (RANDOM) | :TQ10002 |
61
|
4758
|
22
(
10
)|
00
:
00
:
01
| Q1,
02
| P->S | QC (RAND) |
|
3
| HASH GROUP BY | |
61
|
4758
|
22
(
10
)|
00
:
00
:
01
| Q1,
02
| PCWP | |
|
4
| PX RECEIVE | |
61
|
4758
|
22
(
10
)|
00
:
00
:
01
| Q1,
02
| PCWP | |
|
5
| PX SEND HASH | :TQ10001 |
61
|
4758
|
22
(
10
)|
00
:
00
:
01
| Q1,
01
| P->P | HASH |
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
|
6
| HASH GROUP BY | |
61
|
4758
|
22
(
10
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
|*
7
| HASH JOIN | |
61
|
4758
|
21
(
5
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
|
8
| BUFFER SORT | | | | | | Q1,
01
| PCWC | |
|
9
| PX RECEIVE | |
61
|
3965
|
12
(
0
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
|
10
| PX SEND BROADCAST | :TQ10000 |
61
|
3965
|
12
(
0
)|
00
:
00
:
01
| | S->P | BROADCAST |
|
11
| TABLE ACCESS FULL | T |
61
|
3965
|
12
(
0
)|
00
:
00
:
01
| | | |
|
12
| PX BLOCK ITERATOR | |
52078
| 661K|
8
(
0
)|
00
:
00
:
01
| Q1,
01
| PCWC | |
|
13
| INDEX FAST FULL SCAN| T1_IDX |
52078
| 661K|
8
(
0
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
----------------------------------------------------------------------------------------------------------------------
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
|
Monitor parallel execution
SQL> select *
from
v$pq_sesstat;
STATISTIC LAST_QUERY SESSION_TOTAL
------------------------------ ---------- -------------
Queries Parallelized
1
13
DML Parallelized
0
0
DDL Parallelized
0
0
DFO Trees
1
13
Server Threads
6
0
Allocation Height
3
0
Allocation Width
1
0
Local Msgs Sent
362
303740
Distr Msgs Sent
0
0
Local Msgs Recv'd
368
303782
Distr Msgs Recv'd
0
0
11
rows selected.
SQL> select DFO_NUMBER, TQ_ID, SERVER_TYPE, NUM_ROWS ,BYTES,process
from
v$pq_tqstat order by dfo_number , tq_id , server_type;
DFO_NUMBER TQ_ID SERVER_TYPE NUM_ROWS BYTES PROCESS
---------- ---------- --------------- ---------- ---------- --------
1
0
Consumer
17151
100454
P001
1
0
Consumer
17242
100969
P002
1
0
Consumer
17257
101058
P000
1
0
Producer
1971
9955
P004
1
0
Producer
29565
174989
P005
1
0
Producer
20114
117537
P003
1
1
Consumer
893
8107
P002
1
1
Consumer
2914
26341
P001
1
1
Consumer
0
60
P000
1
1
Producer
611
5494
P003
1
1
Producer
2593
23493
P005
1
1
Producer
603
5521
P004
1
2
Consumer
3807
19040
QC
1
2
Producer
2914
14545
P001
1
2
Producer
893
4475
P002
1
2
Producer
0
20
P000
16
rows selected.
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
|
SQL> @utlxplp
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------
Plan hash value:
1177066807
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------
|
0
| SELECT STATEMENT | |
3867
| 147K|
13
(
8
)|
00
:
00
:
01
| | | |
|
1
| PX COORDINATOR | | | | | | | | |
|
2
| PX SEND QC (RANDOM) | :TQ10002 |
3867
| 147K|
13
(
8
)|
00
:
00
:
01
| Q1,
02
| P->S | QC (RAND) |
|*
3
| HASH JOIN BUFFERED | |
3867
| 147K|
13
(
8
)|
00
:
00
:
01
| Q1,
02
| PCWP | |
|
4
| PX RECEIVE | |
53332
| 677K|
7
(
0
)|
00
:
00
:
01
| Q1,
02
| PCWP | |
|
5
| PX SEND HASH | :TQ10000 |
53332
| 677K|
7
(
0
)|
00
:
00
:
01
| Q1,
00
| P->P | HASH |
|
6
| PX BLOCK ITERATOR | |
53332
| 677K|
7
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWC | |
|
7
| TABLE ACCESS FULL| PRODUCTS |
53332
| 677K|
7
(
0
)|
00
:
00
:
01
| Q1,
00
| PCWP | |
|
8
| PX RECEIVE | |
3867
| 98K|
5
(
0
)|
00
:
00
:
01
| Q1,
02
| PCWP | |
|
9
| PX SEND HASH | :TQ10001 |
3867
| 98K|
5
(
0
)|
00
:
00
:
01
| Q1,
01
| P->P | HASH |
|
10
| PX BLOCK ITERATOR | |
3867
| 98K|
5
(
0
)|
00
:
00
:
01
| Q1,
01
| PCWC | |
|
11
| TABLE ACCESS FULL| COSTS |
3867
| 98K|
5
(
0
)|
00
:
00
:
01
| Q1,
01
| PCWP | |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3
- access(
"C"
.
"PROD_ID"
=
"P"
.
"PROD_ID"
)
|
-
One of my favourite descriptions of performance tuning, althou gh I can’t remember where I first heard it, is that it is based on ‘informed common sense’. That really captures my ow n experiences of performance tu ning. Yes, you need to use proper analysis techniques and often a great deal of technical knowledge, but that’s all devalued if you’re completely missing the point . So let’s take a step away from the technical and consider the big picture.
-
Don’t even think about implementing Parallel Execution un less you are prepared to invest some time in initial testing, followed by ongoing performance monitoring. If you don’t, you might one day hit performance problems either server-wide or on an individual user session that you’d never believe (until it happens to you).
-
Parallel Execution is designed to utilise hardware as heavily as possible. If you are running on a single-CPU server with two hard disk drives and 512Mb RAM, don’t expect significant perfor mance improvements just because you switch PX on. The more CPUs , disk drives, controllers and RAM you have installed on your server, the better the results are going to be.
-
Although you may be able to use Parallel Execution to make an inefficient SQL stat ement run many times faster, that would be incredibly stupid. It’s essential that you tune the SQL first . In the end, doing more work than you should be, but more quickly, is still doing more work than you should be! To put it another way, don’t use PX as a dressing for a poorly designed application. Reduce the wo rkload to the minimum needed to achieve the task and then start using the server facilities to make it run as quickly as possible. Seems obvious, doesn’t it?
-
If you try to use PX to benefit a large number of user s performing online queries yo u may eventually bring the server to its knees. Well, maybe not if you use th e Adaptive Multi-User algorithm, but then it’s essential that both you and, more important, your users unders tand that response time is going to be very variable when the machine gets busy.
-
Using PX for a query that runs in a few seconds is pointless. You’re just going to use more resources on the server for very little improvemen t in the run time of the query. It might well run more slowly!
-
Sometimes when faced with a slow i/o subsystem you migh t find that higher degrees of parallelism are useful because the CPUs are spending more ti me waiting for i/o to complete. Ther efore they are more likely to be available for another PX slave (that isn’t waiting on i/o) to use. This was certainly my experience at one site. However, it’s also true that using PX will usually lead to a busier i/o subsystem because the server is likely to favour full scans over indexed retrieva l. There are no easy answers here - you really need to carry out some analysis of overall system resource usage to identify where the bottlenecks are an d adjust the configuration accordingly.
-
Consider whether PX is the correct parallel solution for overnight batch operations. It may be that you can achieve better performance using multip le streams of jobs, each single-threa ded, or maybe you would be better with one stream of jobs which uses PX. It depends on your application so the only sure way to find out is to try the different approaches .