针对批量插入如何提高性能的问题进行的测试过程

最新推荐文章于 2022-01-20 16:37:38 发布

daimin1983

最新推荐文章于 2022-01-20 16:37:38 发布

阅读量1.5k

点赞数

分类专栏： Oracle 文章标签：测试 parallel table sql insert 存储

本文链接：https://blog.csdn.net/daimin1983/article/details/3262342

版权

Oracle 专栏收录该内容

61 篇文章 1 订阅

订阅专栏

批量插入如何提高性能？

原条件要求如下：
1）原表：100万条记录；
2）一次插入10万条记录；
3）原表中并且有索引存在；
问题：
1）此时要求提高插入的性能，如何快速完成一次性的插入？
2）如果一次性插入50万条记录情况下，如何快速完成一次性的插入？

1、为了使测试就会更有针对性，首先思考影响批量插入性能的可能因素有哪些？
然后分别针对这些可能的因素，一一证明这些因素是否真的影响到批量插入的性能，
并分析哪些因素是影响批量插入的主要因素？
/* wwj：如果你一开始就能猜到测试想证明什么问题，测试就会更有针对性*/

讨论出来影响批量插入性能的可能因素有：
记录数据量：原表：100万条记录，一次插入10万条记录
1)、有无append
2)、nologging（对表以及对索引）
3)、索引的个数
4)、parallel(aa,10)（对表以及对索引）
5)、实时更新索引和重建索引因素
6)、分区表和分区索引
7)、oracle的版本
--ADO往SP中传数组的方法
8) 一次插入50万条记录的话，以上情况又是如何？

2、测试过程：
测试的环境：
数据库版本：oracle 10g
虚拟机内存大小：512M 一个内存

测试步骤：
记录数据量：原表：100万条记录，一次插入10万条记录
a) 创建表：
drop table aa;
create table aa as
select * from all_col_comments;

b) 导入100万条记录
begin
for i in 1..24 loop
insert into aa
select * from all_col_comments;
end loop;
end ;

SQL> select count(*) from aa;
COUNT(*)
----------
   1037746

c)、给新建的表的前三个字段创建复合索引
-- Create/Recreate indexes
--drop index aa_index
create index aa_index on AA (owner, table_name, column_name);

d)、建立大约10万条记录的表
drop table temp
create table temp
as
select * from all_col_comments；

begin
for i in 1..2 loop
insert into temp
select * from all_col_comments;
end loop;
end;

SQL> select count(*) from temp;
COUNT(*)
----------
124538

e)、创建一个一次插入10万记录的insert存储过程
create or replace procedure insertaa is
begin
insert into aa
select * from temp;
end insertaa;

f)、调用存储过程
SQL> begin
insertaa();
end; 2 3
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:02:30.86

---------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------------------------------------------
测试影响因素：
1)、比较添加了APPEND注释的insert存储过程
create or replace procedure insertaa1 is
begin
insert /*+ APPEND */ into aa
select * from temp;
end insertaa1;

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:57.13

--------------------------------------------------------------------------------------------------------------------------------------------

2）nologging（对表以及对索引）
修改aa表的属性，使得nologging对表aa
alter table aa nologging;

SQL> begin
2 insertaa();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:23.69
分析比较：没有添加APPEND的存储过程时间：Elapsed: 00:02:30.86
没有添加APPEND的存储过程在执行alter table aa nologging之后需要的时间：Elapsed: 00:00:23.69

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.33
分析比较：没有添加APPEND的存储过程时间：Elapsed: 00:02:30.86
执行添加APPEND的存储过程时间：Elapsed: 00:00:57.13
添加APPEND的存储过程在执行alter table aa nologging之后需要的时间Elapsed: 00:00:08.33

修改aa表中索引的属性，使得nologging对索引aa_index
alter index aa_index nologging;

SQL> begin
2 insertaa();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:17.83

分析比较：没有添加APPEND的存储过程时间：Elapsed: 00:02:30.86
没有添加APPEND的存储过程在执行alter table aa nologging之后需要的时间：Elapsed: 00:00:23.69
没有添加APPEND的存储过程在继续执行alter index aa_index nologging之后需要的时间：Elapsed: 00:00:17.83

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.32

分析比较：没有添加APPEND的存储过程时间：Elapsed: 00:02:30.86
          执行添加APPEND的存储过程时间：Elapsed: 00:00:57.13
          添加APPEND的存储过程在执行alter table aa nologging之后需要的时间Elapsed: 00:00:08.33
          添加APPEND的存储过程在继续执行alter index aa_index nologging之后需要的时间：Elapsed: 00:00:08.32



建立存储过程如下：
create or replace procedure insertaa2 is
begin
insert /*+ APPEND nologging*/ into aa
select * from temp;
end insertaa1;

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.13

SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:07.04

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.75

SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:05.43

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:05.07

SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:05.99

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.11

APPEND hint的学习：
APPEND When the APPEND hint is used with the INSERT statement, data is appended to the table. Existing free space in the block is not used.
If a table or an index is specified with nologging, this hint applied with an insert statement produces a direct path insert which reduces
generation of redo.
分析:由于测试中aa表以及aa_index索引都设置为nologging状态，所以使用了APPEND hint之后，是不会产生redo的，存储过程加与不加nologging提示无所谓的，
     并且在oracle提示中nologging提示是不存在的。

--------------------------------------------------------------------------------------------------------------------------------------------------

3)、索引的个数
SQL> alter table aa nologging;
Table altered.

SQL> alter index aa_index nologging;
Index altered.

在nologging table以及nologging index条件下，并且使用APPEND提示，不多加索引时所测试的情况如下：
SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:51.60

在nologging table以及nologging index条件下，并且使用APPEND提示，多创建两个索引的测试情况如下：
create index aa_index1 on AA (table_name, column_name);
create index aa_index2 on AA (column_name);

SQL> begin
2 insertaa1();
3 end;
4 /

PL/SQL procedure successfully completed.
Elapsed: 00:02:29.57

分析：条件都是在nologging table以及nologging index条件下，并且使用APPEND提示，在相同的条件下，增加索引的个数会影响的批量插入的性能，
      使批量插入性能下降；
      如何才能既能在多个索引情况下，又能够提高性能呢？
      在后面会测试下实时更新索引和重建索引之间的性能比较；

---------------------------------------------------------------------------------------------------------------------------------------------------
4)、parallel(aa,10)（对表以及对索引）
要使parallel并行提示有效，必须要执行ALTER SESSION { ENABLE | FORCE } PARALLEL DML，此时加PARALLEL提示才有作用；
新建存储过程，加APPEND以及parallel提示；
create or replace procedure insertaa2 is
begin
insert /*+ APPEND parallel(aa,10)*/ into aa
select * from temp;
end insertaa2;

不加parallel提示的情况下测试如下：（条件都是在nologging table以及nologging index条件下，并且使用APPEND提示，并且还是在只有一个索引的情况下）
SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:01:06.19

加parallel提示的情况下测试如下：（条件都是在nologging table以及nologging index条件下，并且使用APPEND提示，并且还是在只有一个索引的情况下）
SQL> ALTER SESSION ENABLE PARALLEL DML;
Session altered.

SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:44.53

再执行一次不加parallel提示的情况：
SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.59

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:16.19

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:06.80

再加parallel提示的情况：
SQL> ALTER SESSION ENABLE PARALLEL DML;
Session altered.
SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:19.94

SQL> ALTER SESSION ENABLE PARALLEL DML;
Session altered.
SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:13.69

SQL> ALTER SESSION ENABLE PARALLEL DML;
Session altered.
SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:19.45

分析：parallel(aa,10)提示并没有完全提高性能

将parallel(aa,10)改为parallel(aa,3)又会如何？
不加parallel提示的情况下测试如下：
SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:05.30

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:05.53

加parallel提示的情况下测试如下：
SQL> ALTER SESSION ENABLE PARALLEL DML;
Session altered.
SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:15.87

SQL> ALTER SESSION ENABLE PARALLEL DML;
Session altered.
SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:10.55

分析：parallel(aa,5)提示并没有完全提高性能
因为parallel是多线程实现，线程切换和管理需要时间，而这些线程在单CPU下事实上是串行的，
对不分区的表进行parallel插入的话，需要分配多个临时表，先同时对多个临时表进行插入，然后再合并，
对多个临时表的插入也是使用多线程实现，是串行执行的。

5)、实时更新索引和重建索引因素
对表多建几个索引，有什么变化？
多创建两个索引
create index aa_index1 on AA (table_name, column_name);
create index aa_index2 on AA (column_name);

比较rebuild＋insert（无索引或索引实效）和insert（有索引）测试(多个索引)的几种情况的测试：
1、rebuild＋insert（无索引）情况下：
--做插入操作的存储过程定义,包含新建索引
create or replace procedure insertaa2 is
begin
insert /*+ APPEND nologging parallel(aa,10)*/ into aa
select * from temp;
EXECUTE IMMEDIATE 'create index aa_index on aa(owner, table_name, column_name)';   --新建索引
EXECUTE IMMEDIATE 'create index aa_index1 on AA (table_name, column_name)';   --新建索引
EXECUTE IMMEDIATE 'create index aa_index2 on AA (column_name)';   --新建索引
end insertaa2;

无索引测试步骤：
1)、建表aa，数据量115万条左右；
2)、调用存储过程插入10万条记录，并且即时创建索引；
SQL> begin
2 insertaa2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:03:36.41

比较：只有aa_index一个索引的情况下所需时间为：Elapsed: 00:01:30.47
分析：索引个数的增加使得所消耗的时间近三倍，时间的增长幅度比较的大

2、rebuild＋insert（索引实效）情况下：

--做插入操作的存储过程定义,包含rebuild索引
create or replace procedure insertaa is
begin
insert /*+ APPEND nologging parallel(aa,10)*/ into aa
select * from temp;
EXECUTE IMMEDIATE 'alter index aa_index rebuild nologging';   --rebuild索引
EXECUTE IMMEDIATE 'alter index aa_index1 rebuild nologging';   --rebuild索引
EXECUTE IMMEDIATE 'alter index aa_index2 rebuild nologging';   --rebuild索引
end insertaa;

索引实效的测试步骤：
1)、建表aa，数据量115万条左右；
2)、创建索引
   create index aa_index on AA (owner, table_name, column_name);
   create index aa_index1 on AA (table_name, column_name);
   create index aa_index2 on AA (column_name);

3)、使索引失效
   alter index aa_index unusable;
   alter index aa_index1 unusable;
   alter index aa_index2 unusable;

SQL> select status
2 from all_indexes a
3 where a.owner=upper('daimin')
4 and a.index_name=upper('aa_index');
STATUS
--------
UNUSABLE --aa_index索引失效

SQL> select status
2 from all_indexes a
3 where a.owner=upper('daimin')
4 and a.index_name=upper('aa_index1');
STATUS
--------
UNUSABLE --aa_index1索引失效

SQL> select status
2 from all_indexes a
3 where a.owner=upper('daimin')
4 and a.index_name=upper('aa_index2');
STATUS
--------
UNUSABLE --aa_index2索引失效

4)、调用存储过程插入10万条记录，并且即时rebuild索引；
SQL> begin
2 insertaa();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:41.00

比较：只有aa_index一个索引的情况下所需时间为： Elapsed: 00:00:26.15
分析：索引个数的使得所消耗的时间增加幅度不是很大

2、insert（有索引）情况下：
--做插入操作的存储过程定义
create or replace procedure insertaa1 is
begin
insert /*+ APPEND nologging parallel(aa,10)*/ into aa
select * from temp;
end insertaa1;

有索引的测试步骤：
1)、建表aa，数据量115万条左右；
2)、创建索引
   create index aa_index on AA (owner, table_name, column_name);
   create index aa_index1 on AA (table_name, column_name);
   create index aa_index2 on AA (column_name);

3)、调用存储过程插入10万条记录
SQL> begin
2 insertaa1();
3 end; 4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:28.49

比较：只有aa_index一个索引的情况下所需时间为：Elapsed: 00:00:15.94
分析：索引个数的使得所消耗的时间增加幅度不是很大

结论：还是insert（有索引）情况下快

6)、分区表和分区索引
wwj 说:
insert其他分区，也可以比对一下rebuild index
Joy Dai 说:
你是说让数据分散，插入到不同的分区？
wwj 说:
相同分区
wwj 说:
是有数据的
Joy Dai 说:
哦，你的意思是指不要插入到最后一个分区？
Joy Dai 说:
插入到中间的分区？
wwj 说:
是

1)、分区表和分区索引，插入到相同分区测试
create table temp
as
select * from all_col_comments
where rownum<1;

begin
for i in 1..3 loop
insert into temp
select * from all_col_comments;
end loop;
end;

SQL> select count(*) from temp;
COUNT(*)
----------
124538

SQL> drop table aa;
Table dropped.

SQL> create table aa as
2 select rownum as r,a.* from all_col_comments a;
Table created..

SQL> begin
2 for i in 1..24 loop
3 insert into aa
4 select 41554*i+rownum as r,a.* from all_col_comments a;
5 end loop;
6 end ;
7 /
PL/SQL procedure successfully completed.

SQL> select count(*) from aa;
COUNT(*)
----------
1039775

SQL> select max(r) from aa;
MAX(R)
----------
1038886

设置aa表nologging属性
SQL> alter table aa nologging;
Table altered.

新建索引
SQL> create index aa_index on aa(owner, table_name, column_name);
Index created.

创建分区表
-- Create table
drop table BBp;

SQL> create table BBp
2 PARTITION BY RANGE ( r )
3 (
4 PARTITION quarter_one VALUES LESS THAN ( 200000),
5 PARTITION quarter_two VALUES LESS THAN ( 400000),
6 PARTITION quarter_three VALUES LESS THAN ( 600000),
7 PARTITION quarter_four VALUES LESS THAN ( 800000),
8 PARTITION quarter_five VALUES LESS THAN ( 1000000),
9 PARTITION quarter_six VALUES LESS THAN (MAXVALUE)
10 )
11 as
12 select * from aa;
Table created.

创建分区索引
SQL> create index BBp_index on BBp (owner, table_name, column_name) LOCAL;
Index created.

设置BBp表nologging属性
SQL> alter table BBp nologging;
Table altered.

SQL> select count(*) from BBp;
COUNT(*)
----------
1039775

针对不同的想法，创建不同的存储过程进行测试：
--将10多条记录插入到不是分区表的表中去的存储过程
create or replace procedure insertaa1 is
begin
insert /*+ APPEND */ into aa
select 1039775+rownum as r ,t.* from temp t;
end insertaa1;

--因为124538<200000,该插入操作是插入到第一个分区quarter_1中（插入到带有数据的分区中去）
create or replace procedure insertBB1 is
begin
insert /*+ APPEND */ into BBp
select rownum as r ,t.* from temp t;
end insertBB1;

--因为1000000+124538>1000000,该插入操作是将数据插入到最后一个几乎没有数据的分区quarter_13中去
create or replace procedure insertBB2 is
begin
insert /*+ APPEND */ into BBp
select 1000000+rownum as r ,t.* from temp t;
end insertBB2;

-- temp表中的数据被分散插入到不同分区
create or replace procedure insertBB3 is
begin
insert /*+ APPEND */ into BBp
select trunc(dbms_random.value(1, 1039775))+rownum as r ,t.* from temp t;
end insertBB3;

调用存储过程：
SQL> begin
2 insertaa1(); --将10多条记录插入到不是分区表的表中
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:02:36.77

SQL> begin
2 insertbb1(); --所有数据插入有数据的相同分区
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:58.61

SQL> begin
2 insertbb2(); --所有数据插入到一个接近空的分区
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:48.33

SQL> begin
2 insertbb3(); --所有数据分散插入到不同分区
3 end;
4 /
PL/SQL procedure successfully completed.

Elapsed: 00:03:11.01

比较：将数据批量插入到一个接近空的分区速度最快，将数据批量分散插入到不同分区速度最慢

2)对比先使索引失效，然后rebuild index的方法

针对不同的想法，创建不同的存储过程进行测试：
使索引失效
alter index aa_index unusable;

SQL> select status
2 from all_indexes a
3 where a.owner=upper('daimin')
4 and a.index_name=upper('aa_index');
STATUS
----------------
UNUSABLE

--将10多条记录插入到不是分区表的表中去的存储过程
create or replace procedure insertaa1 is
begin
insert /*+ APPEND */ into aa
select 1039775+rownum as r ,t.* from temp t;
EXECUTE IMMEDIATE 'alter index aa_index rebuild nologging'; --rebuild索引aa_index
end insertaa1;

SQL> begin
2 insertaa1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:01:17.25

使分区索引失效
ALTER INDEX BBp_index MODIFY PARTITION quarter_one UNUSABLE;

--因为124538<200000,该插入操作是插入到第一个分区quarter_one中（插入到带有数据的分区中去）
create or replace procedure insertBB1 is
begin
insert /*+ APPEND */ into BBp
select rownum as r ,t.* from temp t;
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_one nologging'; --rebuild quarter_1分区的索引BBp_index
end insertBB1;

SQL> begin
2 insertbb1();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:26.15

使分区索引失效
ALTER INDEX BBp_index MODIFY PARTITION quarter_six UNUSABLE;

--因为1000000+124538>1000000,该插入操作是将数据插入到最后一个几乎没有数据的分区quarter_six中去
create or replace procedure insertBB2 is
begin
insert /*+ APPEND */ into BBp
select 1000000+rownum as r ,t.* from temp t;
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_six nologging'; --rebuild quarter_six分区的索引BBp_index
end insertBB2;

SQL> begin
2 insertbb2();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:00:08.78

使索引失效
alter index BBp_index unusable;

SQL> begin
2 insertbb3();
3 end;
4 /
begin
*
ERROR at line 1:
ORA-14086: a partitioned index may not be rebuilt as a whole
ORA-06512: at "DAIMIN.INSERTBB3", line 5
ORA-06512: at line 2

然后修改存储过程如下：
-- temp表中的数据被分散插入到不同分区
create or replace procedure insertBB3 is
begin
insert /*+ APPEND */ into BBp
select trunc(dbms_random.value(1, 1039775))+rownum as r ,t.* from temp t;
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_one nologging';
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_two nologging';
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_three nologging';
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_four nologging';
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_five nologging';
EXECUTE IMMEDIATE 'alter index BBp_index rebuild partition quarter_six nologging';
end insertBB3;

SQL> begin
2 insertBB3();
3 end;
4 /
PL/SQL procedure successfully completed.
Elapsed: 00:01:28.62

比较：使用先使索引失效，然后rebuild index的方法都比实时更新索引的批量插入的性能好，并且在使用rebuild index的方法中仍然是
将数据批量插入到一个接近空的分区速度最快，将数据批量分散插入到不同分区速度最慢

daimin1983

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
针对批量插入如何提高性能的问题进行的测试过程

批量插入如何提高性能？原条件要求如下：1）原表：100万条记录；2）一次插入10万条记录；3）原表中并且有索引存在；问题：1）此时要求提高插入的性能，如何快速完成一次性的插入？2）如果一次性插入50万条记录情况下，如何快速完成一次性的插入？1、为了使测试就会更有针对性，首先思考影响批量插入性能的可能因素有哪些？然后分别针对这些可能的因素，一一证明这些因素是否真的影响到批量插入的性能
复制链接

扫一扫