Oracle merge into 基于大数据量的优化方向

最新推荐文章于 2024-01-15 11:20:46 发布

大树的困惑

最新推荐文章于 2024-01-15 11:20:46 发布

阅读量4.5k

点赞数 5

分类专栏： Oracle 文章标签： oracle sql 数据库

本文链接：https://blog.csdn.net/weixin_44445168/article/details/115112429

版权

Oracle 专栏收录该内容

7 篇文章 0 订阅

订阅专栏

Oracle merge into 基于大数据量的优化方向

针对Oracle数据库的更新操作我们会使用

update tableName set col1=col2 where condition...

这样子的事务提交效率比较低下，所以大部分情况下需要批量更新的话我们会使用merge into 进行替换

merge into target_tableName t
using (
	select id , col1, col2 ... from source_tableName
) a
on (t.id= a.id)
when matched then 
update set
t.col1=a.col1 ,
t.col2=a.col2 ,
...
when not matched then 
 insert
        (t.col1,
         t.col2,
         t.col3,
         ...
         )
      values
        (a.col1,
         a.col2,
         a.col3,
         ...);

常规的merge into 的应用如上，对于日常的数据量是基本符合要求的

这里要讨论面对大数据量时的优化方向

1.存储优化

merge into 的执行过程会load 两部分数据，一部分是需要更新的目标表，还有用于比对数据的using中的临时表，而using 中载入的表数据集需要等到merge into 执行完之后才会释放，而如果你using中包含了多张表的关联逻辑

如：

merge into target_TableName
using(
	select a.col1 from tablea a 
    	left join tableb b on a.id=b.id
   		left join tablec c on a.id=c.id
)....

如上逻辑，会导致using过程需要加载tablea,tableb,tablec 三张表的数据到一块临时表空间中，这样会导致内存的压力上升（针对大数据量下，小数据量可以忽略）

针对这种情况可以使用oracle提供的临时表进行优化

如：

drop table TMP_table_a;
    create global temporary table TMP_table_a (
    	col1 varchar2(64),
        col2 varchar2(64),
        col3 varchar2(64),
        col4 varchar2(64),
        col5 varchar2(64)
)
on commit delete rows;
--此临时表的功效是在事务进行提交之后，会删除掉表中所有的行

借助临时表的功能，可以提前将关联逻辑后的结果插入到临时表，merge into 之取临时表就可以了

insert into TMP_table_a values(col1 , col2 ...)
select col1,col2 .. from tablea a left join tableb b on a.id=b.id
....conditions

--在进行merge into
merge into target_TableName t
using(
	select col1... from TMP_table_a ..
) a
on (a.col1=t.col1)
when matched then 
update set 
t.col1 = a.col1 
when not matched then 
 insert
        (t.col1,
         t.col2,
         t.col3,
         ...
         )
      values
        (a.col1,
         a.col2,
         a.col3,
         ...);

2.减少对比数据量

减少对比数据量对于业务需求的理解能力要求相对高点，需要找到能够过过滤的条件，

比如对于每天增量的表就可以通过控制日期

（结合上面的存储优化，可以选择在插入临时表的时候就进行优化）

insert into TMP_table_a values(col1 , col2 ...)
select col1,col2 .. from tablea a left join tableb b on a.id=b.id
....where lastupdatedate>=end_date and conditions..
--lastupdatedate>=end_date

3.优化执行顺序

正常的使用merge into 的话，通常会使用

when matched then ...
when not matched then ...
--用于区分采取insert into 操作还是 update 操作

这样做有个比较大的弊端，merge into 它是通过对比数据来选择操作的，会从目标表中每一条数据和你的逻辑结果集（using 中的结果）中的数据进行比对，如果你的结果集过大，那么就会在

on (…)

这块数据比对逻辑这里耗费比较多的时间，而update set 和 insert into 是分开的，也就是说它会比对两次数据，所以在实际中，为了提高性能支持，会将insert into 从merge into 中抽离出来，merge into 只做更新操作

--merge into 只做更新
merge into target_tableName 
using(select col1... from TMP_table_a ..
) a
on (a.col1=t.col1)
when matched then 
update set 
t.col1 = a.col1  ;

--后续插入操作
insert into target_tableName (col1..) 
select col1 from TMP_table_a 
where conditions ...

补充：

在将insert into 从merge into 中抽离出来后，还涉及到一个执行顺序的问题

也就是表的初始化，一个新表进行初始化的时候，表是空的，所以优先执行merge into 是非常快的，因为里面没有任何数据需要更新，然后就直接走插入数据的流程

如果先执行insert into 的话，导致表内存在数据了，进行merge into 进行update的话，有需要比对数据

以上是对Oracle Merge into 操作进行优化的一些方向和建议

大树的困惑

关注

5
点赞
踩
38

收藏

觉得还不错? 一键收藏
2
评论
Oracle merge into 基于大数据量的优化方向

Oracle merge into 基于大数据量的优化方向针对Oracle数据库的更新操作我们会使用update tableName set col1=col2 where condition...这样子的事务提交效率比较低下，所以大部分情况下需要批量更新的话我们会使用merge into 进行替换merge into target_tableName tusing ( select id , col1, col2 ... from source_tableName) aon (t.id=
复制链接

扫一扫