Parallel Processing
当执行一个 sql 语句时 , 缺省是用一个 cpu 来连续执行 , 不管有多少个 cpu. 并行执行目的是分发单个 SQL 语句到多个 CPU 核上执行 .
When you submit a SQL statement to a database engine, by default it’s executed serially by a single server process.Therefore, even if the server running the database engine has several CPU cores, your SQL statement runs on a single CPU core. The purpose of parallel processing is to distribute the execution of a single SQL statement over several CPU cores.
v$sql_monitor视图会自动监控任何并行执行的语句!
和会话相关联的处理程序行使 coordinator角色.
The coordination of the slave processes is under the control of the server process associated to the session that submits the SQL statement. Because of this role, it’s commonly called the query coordinator . The query coordinator is responsible for acquiring the slave processes, assigning a subtask to each of them, collecting and combining the partial result sets they deliver, and returning the final result set to the client.
Basic Configuration
Slave Processes Pool
最大可用并行处理在处理池里定义 . Coordinator使用完就放到池里去.
The maximum number of slave processes per database instance is limited and maintained by a database instance as a pool of slave processes. A query coordinator requests slave processes from the pool, uses them to execute one SQL statement, and finally, when the execution is complete, returns them to the pool. The following initialization parameters are set to configure the pool:
parallel_min_servers
实例启动时初始化的并行处理个数 .
parallel_max_servers
池里最大可用并行处理器个数 . 10-20倍CPU核个数是通常做法. 下面是一个32 CPU核的并行处理设置.
It’s difficult to give advice on how to set this parameter. Nevertheless, a value of 10–20 times the number of CPU cores is a good starting point.
SQL> show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cpu_count integer 32
SQL> show parameter parallel_min_servers
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
parallel_min_servers integer 64
SQL> show parameter parallel_max
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers integer 128
通过下面的查询查看池的状态 :
SQL> SELECT * FROM v$px_process_sysstat WHERE statistic LIKE 'Servers%';
STATISTIC VALUE CON_ID
------------------------------ ---------- ----------
Servers In Use 0 0
Servers Available 66 0
Servers Started 149 0
Servers Shutdown 83 0
Servers Highwater 69 0
Servers Cleaned Up 0 0
RAC environment
每个实例有各自的并行处理器池 .
1. parallel_instance_group and instance_groups 用来限制并行处理器到指定的实例. instance_groups 从 11g 后开始不用了 . With the parallel_instance_group initialization parameter, you specify which group the slave processes are allocated from. 如orcl1.parallel_instance_group= ’ service_name ’ (指定和orcl1实例相同service的的实例上分配PX处理.
2. 从11.1开始,PX服务进程基于服务(services)是控制PX服务进程位置的首选方法. 例如,假设连接到po服务,且po服务在orcl1和orcl2 实例上都是活跃的状态 ,那么PX服务进程就能从orcl1和orcl2实例进行分配.如下例子,如果你连接到finance服务并执行并行操作,那么PX服务进程只能从ORCL3实例进行分配.
3. parallel_force_local 指定只能在当前实例上分配 PX 进程 .
Memory Utilization
…
Degree of Parallelism(DOP)
当并行处理 SQL语句时,数据库需要给出并行级别(DOP),有两种并行级别, manual 和 auto.
1. Manual degree of parallelism : In this mode, you can control the degree of parallelism either at the session, object, or SQL statement level.
2. Automatic degree of parallelism : In this mode, the database engine automatically selects the
optimal degree of parallelism for every SQL statement.
Default Degree of Parallelism
缺省的并行级别基本上就是最大的并行级别 ,如下例,两节点的RAC, DOP就是32 * 1 * 2 = 64
The default degree of parallelism is the maximum degree of parallelism you might want to use for any parallel SQL statement
default_dop = cpu_count × parallel_threads_per_cpu × number_of_instances
SQL> show parameter cpu_count
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cpu_count integer 32
SQL> show parameter parallel_threads_per_cpu
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
parallel_threads_per_cpu integer 1
Manual DOP
下面是 manual dop 例子
CREATE TABLE t (id NUMBER, pad VARCHAR2(1000)) PARALLEL 4
ALTER TABLE t PARALLEL 2
CREATE INDEX i ON t (id) PARALLEL 4
ALTER INDEX i PARALLEL 2
可能在用大批量数据装载时创建了并行的表或索引 , 但以后不需要就用下面的语句取消 .
ALTER TABLE t PARALLEL 1
ALTER INDEX i NOPARALLEL
如果用了 hint, 则重置对象上的 parallel 级别 .
SELECT /*+ parallel(t 16) */ * FROM t
SELECT /*+ parallel_index(t i 16) */ * FROM t
With the parallel hint, it ’ s also possible to explicitly call for the default degree of parallelism:
SELECT /*+ parallel(t default) */ * FROM t
Auto DOP
Auto DOP(自动并行级别) , 很难有个参数去判断什么级别是合适的,应用自动DOP, oracle去判断 什么时候用且级别是多少,基本上任何大于10秒的串行查询(query serially) 将被考虑做并行查询.
以下图表是当做表的全表扫描时 , 优化器根据表的数据量大小选择的不同的并行级别 ,
两个重要参数 :
PARALLEL_DEGREE_POLICY:
Auto DOP is enabled by setting the PARALLEL_DEGREE_POLICY parameter to a value of AUTO , LIMITED ,or ADAPTIVE (12c). The default setting for this parameter is MANUAL , which disables all three of the new 11gR2 parallel features (Auto DOP, Parallel Statement Queueing, In-Memory Parallel Execution).
PARALLEL_MIN_TIME_THRESHOLD:
如果 PARALLEL_DEGREE_POLICY 值是 AUTO, 任何大于 10 秒的串行查询 (query serially) 将被考虑做并行查询.
When Auto DOP is enabled, Oracle evaluates each statement to determine whether it should be run in parallel and, if so, what DOP should be used. Basically, any statement that the optimizer concludes will take longer than 10 seconds to run serially will be a candidate to run in parallel. The 10-second threshold can be controlled by setting the PARALLEL_MIN_TIME_THRESHOLD parameter. This decision is made regardless of whether any of the objects involved in the statement have been decorated with a parallel degree setting or not.
Operation and Configuration
When Auto DOP is enabled, Oracle evaluates each statement to determine whether it should be run in parallel and, if so, what DOP should be used. Basically, any statement that the optimizer concludes will take longer than 10 seconds to run serially will be a candidate to run in parallel. The 10-second threshold can be controlled by setting the PARALLEL_MIN_TIME_THRESHOLD parameter. This decision is made regardless of whether any of the objects involved in the statement have been decorated with a parallel degree setting or not.
Auto DOP is enabled by setting the PARALLEL_DEGREE_POLICY parameter to a value of AUTO , LIMITED ,
or ADAPTIVE (12c). The default setting for this parameter is MANUAL , which disables all three of the new 11gR2 parallel features (Auto DOP, Parallel Statement Queueing, In-Memory Parallel Execution). Unfortunately, PARALLEL_DEGREE_POLICY is one of those parameters that control more than one thing. The following list shows the effects of the various settings for this parameter:
MANUAL : If PARALLEL_DEGREE_POLICY is set to MANUAL , none of the new 11gR2
parallel features will be enabled. Parallel processing will work as it did in
previous versions. That is to say, statements will only be parallelized if a hint is
used or an object is decorated with a parallel setting .
LIMITED : If PARALLEL_DEGREE_POLICY is set to LIMITED , only Auto DOP is
enabled while Parallel Statement Queueing and In-Memory Parallel Execution
remain disabled. In addition, only statements accessing objects that have been
decorated with the default parallel setting will be considered for Auto DOP
calculation.
AUTO : If PARALLEL_DEGREE_POLICY is set to AUTO , all three of the new features are
enabled. Statements will be evaluated for parallel execution regardless of any
parallel decoration at the object level.
ADAPTIVE (12c) : This new 12c parameter enables the same functionality as the
previously discussed AUTO value. In addition to these, Oracle may re-evaluate
the statement in order to provide a better degree of parallelism for subsequent
executions based on feedback gathered during statement execution.
Although the only documented way to enable Parallel Statement Queueing and In-Memory Parallel
Execution is via the all-or-nothing setting of AUTO or ADAPTIVE , the developers have thoughtfully provided hidden parameters that provide independent control of these features. Table 6-3 shows the parameters and how the settings of PARALLEL_DEGREE_POLICY alter the hidden parameters.
When to Use It
仅仅在两种情况下考虑用并行
1. First, you can use it when plenty of free resources (CPU, memory, and disk I/O bandwidth) are available.
2. only SQL statements that take more than a few minutes, or even longer, are good candidates for being executed in parallel.
Parallel processing should be used only when two conditions are met. First, you can use it when plenty of free resources (CPU, memory, and disk I/O bandwidth) are available. Remember, the aim of parallel processing is to reduce the response time by distributing the work usually done by a single process (and hence a single CPU core) to several processes (and hence several CPU cores). Second, you can use it for SQL statements that take more than a dozen seconds to execute serially; otherwise, the time and resources needed to initialize, coordinate and terminate
the parallel environment (mainly, the slave processes and the table queues) might be higher than the time gained by the parallelization itself. The actual limit depends on the amount of resources that are available. Therefore, in some situations, only SQL statements that take more than a few minutes, or even longer, are good candidates for being executed in parallel. It’s important to stress that if these two conditions aren’t met, performance could decrease instead of increase.
If parallel processing is commonly used for many SQL statements, either automatic degree of parallelism is enabled at the system level or manual degree of parallelism is enabled at the segment levels. Otherwise, if it’s used only for specific batches or reports, it’s usually better to enable it at the session level or through hints.
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/14377/viewspace-2305558/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/14377/viewspace-2305558/