并行处理 Parallel Processing

最新推荐文章于 2022-10-12 14:33:30 发布

congfen4131

最新推荐文章于 2022-10-12 14:33:30 发布

阅读量1.2k

点赞数

Parallel Processing

当执行一个 sql 语句时 , 缺省是用一个 cpu 来连续执行 , 不管有多少个 cpu. 并行执行目的是分发单个 SQL 语句到多个 CPU 核上执行 .

When you submit a SQL statement to a database engine, by default it’s executed serially by a single server process.Therefore, even if the server running the database engine has several CPU cores, your SQL statement runs on a single CPU core. The purpose of parallel processing is to distribute the execution of a single SQL statement over several CPU cores.

v$sql_monitor视图会自动监控任何并行执行的语句！

和会话相关联的处理程序行使 coordinator角色.

The coordination of the slave processes is under the control of the server process associated to the session that submits the SQL statement. Because of this role, it’s commonly called the query coordinator . The query coordinator is responsible for acquiring the slave processes, assigning a subtask to each of them, collecting and combining the partial result sets they deliver, and returning the final result set to the client.

Basic Configuration

Slave Processes Pool

最大可用并行处理在处理池里定义 . Coordinator使用完就放到池里去.

The maximum number of slave processes per database instance is limited and maintained by a database instance as a pool of slave processes. A query coordinator requests slave processes from the pool, uses them to execute one SQL statement, and finally, when the execution is complete, returns them to the pool. The following initialization parameters are set to configure the pool:

parallel_min_servers

实例启动时初始化的并行处理个数 .

parallel_max_servers

池里最大可用并行处理器个数 . 10-20倍CPU核个数是通常做法. 下面是一个32 CPU核的并行处理设置.

It’s difficult to give advice on how to set this parameter. Nevertheless, a value of 10–20 times the number of CPU cores is a good starting point.

SQL> show parameter cpu_count

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

cpu_count integer 32

SQL> show parameter parallel_min_servers

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

parallel_min_servers integer 64

SQL> show parameter parallel_max

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

parallel_max_servers integer 128

通过下面的查询查看池的状态 :

SQL> SELECT * FROM v$px_process_sysstat WHERE statistic LIKE 'Servers%';

STATISTIC VALUE CON_ID

------------------------------ ---------- ----------

Servers In Use 0 0

Servers Available 66 0

Servers Started 149 0

Servers Shutdown 83 0

Servers Highwater 69 0

Servers Cleaned Up 0 0

RAC environment

每个实例有各自的并行处理器池 .

1. parallel_instance_group and instance_groups 用来限制并行处理器到指定的实例. instance_groups 从 11g 后开始不用了 . With the parallel_instance_group initialization parameter, you specify which group the slave processes are allocated from. 如orcl1.parallel_instance_group= ’ service_name ’ (指定和orcl1实例相同service的的实例上分配PX处理.

2. 从11.1开始,PX服务进程基于服务(services)是控制PX服务进程位置的首选方法. 例如,假设连接到po服务,且po服务在orcl1和orcl2 实例上都是活跃的状态 ,那么PX服务进程就能从orcl1和orcl2实例进行分配.如下例子,如果你连接到finance服务并执行并行操作,那么PX服务进程只能从ORCL3实例进行分配.

3. parallel_force_local 指定只能在当前实例上分配 PX 进程 .

Memory Utilization

…

Degree of Parallelism(DOP)

当并行处理 SQL语句时,数据库需要给出并行级别(DOP),有两种并行级别, manual 和 auto.

1. Manual degree of parallelism : In this mode, you can control the degree of parallelism either at the session, object, or SQL statement level.

2. Automatic degree of parallelism : In this mode, the database engine automatically selects the

optimal degree of parallelism for every SQL statement.

Default Degree of Parallelism

缺省的并行级别基本上就是最大的并行级别 ,如下例,两节点的RAC, DOP就是32 * 1 * 2 = 64

The default degree of parallelism is the maximum degree of parallelism you might want to use for any parallel SQL statement

default_dop = cpu_count × parallel_threads_per_cpu × number_of_instances

SQL> show parameter cpu_count

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

cpu_count integer 32

SQL> show parameter parallel_threads_per_cpu

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

parallel_threads_per_cpu integer 1

Manual DOP

下面是 manual dop 例子

CREATE TABLE t (id NUMBER, pad VARCHAR2(1000)) PARALLEL 4

ALTER TABLE t PARALLEL 2

CREATE INDEX i ON t (id) PARALLEL 4

ALTER INDEX i PARALLEL 2

可能在用大批量数据装载时创建了并行的表或索引 , 但以后不需要就用下面的语句取消 .

ALTER TABLE t PARALLEL 1

ALTER INDEX i NOPARALLEL

如果用了 hint, 则重置对象上的 parallel 级别 .

SELECT /*+ parallel(t 16) */ * FROM t

SELECT /*+ parallel_index(t i 16) */ * FROM t

With the parallel hint, it ’ s also possible to explicitly call for the default degree of parallelism:

SELECT /*+ parallel(t default) */ * FROM t

Auto DOP

Auto DOP（自动并行级别），很难有个参数去判断什么级别是合适的，应用自动DOP， oracle去判断什么时候用且级别是多少，基本上任何大于10秒的串行查询(query serially) 将被考虑做并行查询.

以下图表是当做表的全表扫描时 , 优化器根据表的数据量大小选择的不同的并行级别 ,

两个重要参数 :

PARALLEL_DEGREE_POLICY:

Auto DOP is enabled by setting the PARALLEL_DEGREE_POLICY parameter to a value of AUTO , LIMITED ,or ADAPTIVE (12c). The default setting for this parameter is MANUAL , which disables all three of the new 11gR2 parallel features (Auto DOP, Parallel Statement Queueing, In-Memory Parallel Execution).

PARALLEL_MIN_TIME_THRESHOLD:

如果 PARALLEL_DEGREE_POLICY 值是 AUTO, 任何大于 10 秒的串行查询 (query serially) 将被考虑做并行查询.

When Auto DOP is enabled, Oracle evaluates each statement to determine whether it should be run in parallel and, if so, what DOP should be used. Basically, any statement that the optimizer concludes will take longer than 10 seconds to run serially will be a candidate to run in parallel. The 10-second threshold can be controlled by setting the PARALLEL_MIN_TIME_THRESHOLD parameter. This decision is made regardless of whether any of the objects involved in the statement have been decorated with a parallel degree setting or not.

Operation and Configuration

When Auto DOP is enabled, Oracle evaluates each statement to determine whether it should be run in parallel and, if so, what DOP should be used. Basically, any statement that the optimizer concludes will take longer than 10 seconds to run serially will be a candidate to run in parallel. The 10-second threshold can be controlled by setting the PARALLEL_MIN_TIME_THRESHOLD parameter. This decision is made regardless of whether any of the objects involved in the statement have been decorated with a parallel degree setting or not.

Auto DOP is enabled by setting the PARALLEL_DEGREE_POLICY parameter to a value of AUTO , LIMITED ,

or ADAPTIVE (12c). The default setting for this parameter is MANUAL , which disables all three of the new 11gR2 parallel features (Auto DOP, Parallel Statement Queueing, In-Memory Parallel Execution). Unfortunately, PARALLEL_DEGREE_POLICY is one of those parameters that control more than one thing. The following list shows the effects of the various settings for this parameter:

MANUAL : If PARALLEL_DEGREE_POLICY is set to MANUAL , none of the new 11gR2

parallel features will be enabled. Parallel processing will work as it did in

previous versions. That is to say, statements will only be parallelized if a hint is

used or an object is decorated with a parallel setting .

LIMITED : If PARALLEL_DEGREE_POLICY is set to LIMITED , only Auto DOP is

enabled while Parallel Statement Queueing and In-Memory Parallel Execution

remain disabled. In addition, only statements accessing objects that have been

decorated with the default parallel setting will be considered for Auto DOP

calculation.

AUTO : If PARALLEL_DEGREE_POLICY is set to AUTO , all three of the new features are

enabled. Statements will be evaluated for parallel execution regardless of any

parallel decoration at the object level.

ADAPTIVE (12c) : This new 12c parameter enables the same functionality as the

previously discussed AUTO value. In addition to these, Oracle may re-evaluate

the statement in order to provide a better degree of parallelism for subsequent

executions based on feedback gathered during statement execution.

Although the only documented way to enable Parallel Statement Queueing and In-Memory Parallel

Execution is via the all-or-nothing setting of AUTO or ADAPTIVE , the developers have thoughtfully provided hidden parameters that provide independent control of these features. Table 6-3 shows the parameters and how the settings of PARALLEL_DEGREE_POLICY alter the hidden parameters.

When to Use It

仅仅在两种情况下考虑用并行

1. First, you can use it when plenty of free resources (CPU, memory, and disk I/O bandwidth) are available.

2. only SQL statements that take more than a few minutes, or even longer, are good candidates for being executed in parallel.

Parallel processing should be used only when two conditions are met. First, you can use it when plenty of free resources (CPU, memory, and disk I/O bandwidth) are available. Remember, the aim of parallel processing is to reduce the response time by distributing the work usually done by a single process (and hence a single CPU core) to several processes (and hence several CPU cores). Second, you can use it for SQL statements that take more than a dozen seconds to execute serially; otherwise, the time and resources needed to initialize, coordinate and terminate

the parallel environment (mainly, the slave processes and the table queues) might be higher than the time gained by the parallelization itself. The actual limit depends on the amount of resources that are available. Therefore, in some situations, only SQL statements that take more than a few minutes, or even longer, are good candidates for being executed in parallel. It’s important to stress that if these two conditions aren’t met, performance could decrease instead of increase.

If parallel processing is commonly used for many SQL statements, either automatic degree of parallelism is enabled at the system level or manual degree of parallelism is enabled at the segment levels. Otherwise, if it’s used only for specific batches or reports, it’s usually better to enable it at the session level or through hints.

来自 “ ITPUB博客 ” ，链接：http://blog.itpub.net/14377/viewspace-2305558/，如需转载，请注明出处，否则将追究法律责任。

转载于:http://blog.itpub.net/14377/viewspace-2305558/