问题:
如何在尽量短的时间内完成大批量数据的更新?如漂白数据时,需要将客户表(约一亿行)中的手机号设置为常量。
比如:投产时数据迁移
可以考虑的方法:
1,并行dml
一个事务,需要较大的回滚表空间
进度难以预测
并行dml bug较多
SQL> alter session enable parallel dml;
Session altered
Executed in 0.047 seconds
SQL> update /*+ parallel(e 20) */ CUST e SET e.mobile = '123';
107007800 rows updated
Executed in 183.285 seconds
SQL> commit;
Commit complete
Executed in 5.179 seconds
SQL> alter session disable parallel dml;
Session altered
Executed in 0.063 seconds
SID TYPE ID1 LMODE REQUEST BLOCK
---------- ---- ---------- ---------- ---------- ----------
109 TM 99947 1 0 0
109 TM 99947 4 0 0
154 TM 99947 1 0 0
154 TM 99947 4 0 0
166 TM 99947 6 0 1 --发起并行更新的会话
169 TM 99947 0 3 0 --对该表dml操作,被阻塞
181 TM 99947 1 0 0
181 TM 99947 4 0 0
184 TM 99947 1 0 0
184 TM 99947 4 0 0
187 TM 99947 1 0 0
187 TM 99947 4 0 0
190 TM 99947 1 0 0
190 TM 99947 4 0 0
193 TM 99947 1 0 0
193 TM 99947 4 0 0
196 TM 99947 1 0 0
196 TM 99947 4 0 0
199 TM 99947 1 0 0
199 TM 99947 4 0 0
202 TM 99947 1 0 0
202 TM 99947 4 0 0
205 TM 99947 1 0 0
205 TM 99947 4 0 0
208 TM 99947 1 0 0
208 TM 99947 4 0 0
211 TM 99947 1 0 0
211 TM 99947 4 0 0
214 TM 99947 1 0 0
214 TM 99947 4 0 0
217 TM 99947 1 0 0
217 TM 99947 4 0 0
220 TM 99947 1 0 0
220 TM 99947 4 0 0
223 TM 99947 1 0 0
223 TM 99947 4 0 0
226 TM 99947 1 0 0
226 TM 99947 4 0 0
229 TM 99947 1 0 0
229 TM 99947 4 0 0
232 TM 99947 1 0 0
232 TM 99947 4 0 0
42 rows selected
2,应用并行
灵活
步骤复杂
主要步骤:
*分片(chunk)
*多个分片并行更新
版本:
select * from v$version;
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
"CORE 11.2.0.3.0 Production"
TNS for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
软硬件环境:
SQL> select stat_name, value
2 from v$osstat
3 where stat_name in ('NUM_CPU_CORES', 'NUM_CPUS', 'PHYSICAL_MEMORY_BYTES');
STAT_NAME VALUE
---------------------------------------------------------------- ----------
NUM_CPUS 96
NUM_CPU_CORES 24
PHYSICAL_MEMORY_BYTES 7730941132
SQL> show parameter sga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
lock_sga boolean FALSE
pre_page_sga boolean FALSE
sga_max_size big integer 22144M
sga_target big integer 22144M
SQL> show parameter pga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 7372M
SQL> show parameter cluster
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cluster_database boolean FALSE
cluster_database_instances integer 1
cluster_interconnects
SQL> show parameter undo
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS1
SQL>
SQL> select bytes/1e9,autoextensiable from dba_data_files where tablespace_name='UNDOTBS1';
BYTES/1E9 AUTOEXT
--------------------
8.55113728 YES
测试表:
SQL> select bytes/1e9 from user_segments where segment_name='CUST';
BYTES/1E9
----------
14.4388915
SQL> select /*+ parallel(t 8) */ count(*) from cust t;
COUNT(*)
----------
107007800
分片并行更新:
DECLARE
l_sql_stmt VARCHAR2(1000);
l_try NUMBER;
l_status NUMBER;
BEGIN
-- Create the TASK
DBMS_PARALLEL_EXECUTE.CREATE_TASK ('mytask');
-- Chunk the table by ROWID
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID('mytask', 'BOCNET', 'CUST', false, 1000);
-- Execute the DML in parallel
l_sql_stmt := 'update /*+ ROWID (e) */ CUST e
SET e.mobile = ''123''
WHERE rowid BETWEEN :start_id AND :end_id';
DBMS_PARALLEL_EXECUTE.RUN_TASK('mytask', l_sql_stmt, DBMS_SQL.NATIVE,
parallel_level => 20);
-- If there is an error, RESUME it for at most 2 times.
L_try := 0;
L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
WHILE(l_try < 2 and L_status != DBMS_PARALLEL_EXECUTE.FINISHED)
LOOP
L_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.RESUME_TASK('mytask');
L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
END LOOP;
-- Done with processing; drop the task
DBMS_PARALLEL_EXECUTE.DROP_TASK('mytask');
END;
/
==>
PL/SQL procedure successfully completed
Executed in 391.235 seconds
在执行期间,在第二个会话中执行:
SQL>
SQL> select event,count(*) from v$session where wait_class<>'Idle'
2 group by event;
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file scattered read 9
log file parallel write 1
db file parallel write 10
control file sequential read 1
log buffer space 11
SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file sequential read 1
db file scattered read 4
latch: redo allocation 15
LGWR wait for redo copy 1
latch: object queue header operation 1
SQL> set time on
10:58:55 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file scattered read 3
latch: redo allocation 17
latch free 1
10:58:57 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
log file switch completion 19
log file parallel write 1
db file parallel write 3
log buffer space 1
11:00:19 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
latch: undo global data 1
db file scattered read 3
enq: HW - contention 3
Disk file operations I/O 1
enq: US - contention 4
latch: redo allocation 1
log file parallel write 1
db file parallel write 1
control file sequential read 1
buffer busy waits 6
10 rows selected
11:01:39 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
latch: cache buffers lru chain 4
db file scattered read 8
latch: redo allocation 6
Disk file operations I/O 2
LGWR wait for redo copy 1
11:02:13 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
ASM file metadata operation 1
db file scattered read 6
enq: US - contention 2
enq: HW - contention 2
log file parallel write 1
buffer busy waits 9
6 rows selected
11:02:57 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
enq: HW - contention 1
log file parallel write 1
undo segment extension 2
log file switch (checkpoint incomplete) 10
buffer busy waits 7
11:03:19 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file sequential read 1
db file scattered read 2
enq: US - contention 5
enq: HW - contention 1
log file parallel write 1
db file parallel write 2
control file sequential read 1
undo segment extension 1
buffer busy waits 9
9 rows selected
11:03:46 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file sequential read 1
db file scattered read 5
enq: US - contention 6
latch: redo allocation 2
enq: HW - contention 1
log file parallel write 1
Data file init write 1
buffer busy waits 4
8 rows selected
11:04:42 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
在执行期间,在第三个会话中执行:
SQL> select sum(used_ublk) from v$transaction;
SUM(USED_UBLK)
--------------
7938
SQL> r
SUM(USED_UBLK)
--------------
6875
SQL> set time on
10:59:03 SQL> r
SUM(USED_UBLK)
--------------
8027
11:00:15 SQL> r
SUM(USED_UBLK)
--------------
8413
11:00:46 SQL> r
SUM(USED_UBLK)
--------------
13125
11:00:50 SQL> select count(*),sum(used_ublk) from v$transaction;
COUNT(*) SUM(USED_UBLK)
---------- --------------
20 10701
11:01:35 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
20 9650
11:03:15 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
21 5391
11:03:42 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
20 6826
11:04:37 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
0
在执行期间,在第四个会话中执行:
SQL> select status,count(*) from user_parallel_execute_chunks where task_name='mytask' group by status;
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 19
UNASSIGNED 1635
PROCESSED 515
SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 1475
PROCESSED 674
SQL> set time on
10:59:10 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 1031
PROCESSED 1118
10:59:47 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 825
PROCESSED 1324
11:00:10 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 796
PROCESSED 1353
11:03:12 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 279
PROCESSED 1870
11:03:32 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 18
UNASSIGNED 219
PROCESSED 1932
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 61
PROCESSED 2088
11:04:52 SQL> r
STATUS COUNT(*)
-------------------- ----------
11:06:01 SQL>
如何在尽量短的时间内完成大批量数据的更新?如漂白数据时,需要将客户表(约一亿行)中的手机号设置为常量。
比如:投产时数据迁移
可以考虑的方法:
1,并行dml
一个事务,需要较大的回滚表空间
进度难以预测
并行dml bug较多
SQL> alter session enable parallel dml;
Session altered
Executed in 0.047 seconds
SQL> update /*+ parallel(e 20) */ CUST e SET e.mobile = '123';
107007800 rows updated
Executed in 183.285 seconds
SQL> commit;
Commit complete
Executed in 5.179 seconds
SQL> alter session disable parallel dml;
Session altered
Executed in 0.063 seconds
SID TYPE ID1 LMODE REQUEST BLOCK
---------- ---- ---------- ---------- ---------- ----------
109 TM 99947 1 0 0
109 TM 99947 4 0 0
154 TM 99947 1 0 0
154 TM 99947 4 0 0
166 TM 99947 6 0 1 --发起并行更新的会话
169 TM 99947 0 3 0 --对该表dml操作,被阻塞
181 TM 99947 1 0 0
181 TM 99947 4 0 0
184 TM 99947 1 0 0
184 TM 99947 4 0 0
187 TM 99947 1 0 0
187 TM 99947 4 0 0
190 TM 99947 1 0 0
190 TM 99947 4 0 0
193 TM 99947 1 0 0
193 TM 99947 4 0 0
196 TM 99947 1 0 0
196 TM 99947 4 0 0
199 TM 99947 1 0 0
199 TM 99947 4 0 0
202 TM 99947 1 0 0
202 TM 99947 4 0 0
205 TM 99947 1 0 0
205 TM 99947 4 0 0
208 TM 99947 1 0 0
208 TM 99947 4 0 0
211 TM 99947 1 0 0
211 TM 99947 4 0 0
214 TM 99947 1 0 0
214 TM 99947 4 0 0
217 TM 99947 1 0 0
217 TM 99947 4 0 0
220 TM 99947 1 0 0
220 TM 99947 4 0 0
223 TM 99947 1 0 0
223 TM 99947 4 0 0
226 TM 99947 1 0 0
226 TM 99947 4 0 0
229 TM 99947 1 0 0
229 TM 99947 4 0 0
232 TM 99947 1 0 0
232 TM 99947 4 0 0
42 rows selected
2,应用并行
灵活
步骤复杂
主要步骤:
*分片(chunk)
*多个分片并行更新
版本:
select * from v$version;
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
PL/SQL Release 11.2.0.3.0 - Production
"CORE 11.2.0.3.0 Production"
TNS for IBM/AIX RISC System/6000: Version 11.2.0.3.0 - Production
NLSRTL Version 11.2.0.3.0 - Production
软硬件环境:
SQL> select stat_name, value
2 from v$osstat
3 where stat_name in ('NUM_CPU_CORES', 'NUM_CPUS', 'PHYSICAL_MEMORY_BYTES');
STAT_NAME VALUE
---------------------------------------------------------------- ----------
NUM_CPUS 96
NUM_CPU_CORES 24
PHYSICAL_MEMORY_BYTES 7730941132
SQL> show parameter sga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
lock_sga boolean FALSE
pre_page_sga boolean FALSE
sga_max_size big integer 22144M
sga_target big integer 22144M
SQL> show parameter pga
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
pga_aggregate_target big integer 7372M
SQL> show parameter cluster
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
cluster_database boolean FALSE
cluster_database_instances integer 1
cluster_interconnects
SQL> show parameter undo
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
undo_management string AUTO
undo_retention integer 900
undo_tablespace string UNDOTBS1
SQL>
SQL> select bytes/1e9,autoextensiable from dba_data_files where tablespace_name='UNDOTBS1';
BYTES/1E9 AUTOEXT
--------------------
8.55113728 YES
测试表:
SQL> select bytes/1e9 from user_segments where segment_name='CUST';
BYTES/1E9
----------
14.4388915
SQL> select /*+ parallel(t 8) */ count(*) from cust t;
COUNT(*)
----------
107007800
分片并行更新:
DECLARE
l_sql_stmt VARCHAR2(1000);
l_try NUMBER;
l_status NUMBER;
BEGIN
-- Create the TASK
DBMS_PARALLEL_EXECUTE.CREATE_TASK ('mytask');
-- Chunk the table by ROWID
DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID('mytask', 'BOCNET', 'CUST', false, 1000);
-- Execute the DML in parallel
l_sql_stmt := 'update /*+ ROWID (e) */ CUST e
SET e.mobile = ''123''
WHERE rowid BETWEEN :start_id AND :end_id';
DBMS_PARALLEL_EXECUTE.RUN_TASK('mytask', l_sql_stmt, DBMS_SQL.NATIVE,
parallel_level => 20);
-- If there is an error, RESUME it for at most 2 times.
L_try := 0;
L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
WHILE(l_try < 2 and L_status != DBMS_PARALLEL_EXECUTE.FINISHED)
LOOP
L_try := l_try + 1;
DBMS_PARALLEL_EXECUTE.RESUME_TASK('mytask');
L_status := DBMS_PARALLEL_EXECUTE.TASK_STATUS('mytask');
END LOOP;
-- Done with processing; drop the task
DBMS_PARALLEL_EXECUTE.DROP_TASK('mytask');
END;
/
==>
PL/SQL procedure successfully completed
Executed in 391.235 seconds
在执行期间,在第二个会话中执行:
SQL>
SQL> select event,count(*) from v$session where wait_class<>'Idle'
2 group by event;
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file scattered read 9
log file parallel write 1
db file parallel write 10
control file sequential read 1
log buffer space 11
SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file sequential read 1
db file scattered read 4
latch: redo allocation 15
LGWR wait for redo copy 1
latch: object queue header operation 1
SQL> set time on
10:58:55 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file scattered read 3
latch: redo allocation 17
latch free 1
10:58:57 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
log file switch completion 19
log file parallel write 1
db file parallel write 3
log buffer space 1
11:00:19 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
latch: undo global data 1
db file scattered read 3
enq: HW - contention 3
Disk file operations I/O 1
enq: US - contention 4
latch: redo allocation 1
log file parallel write 1
db file parallel write 1
control file sequential read 1
buffer busy waits 6
10 rows selected
11:01:39 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
latch: cache buffers lru chain 4
db file scattered read 8
latch: redo allocation 6
Disk file operations I/O 2
LGWR wait for redo copy 1
11:02:13 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
ASM file metadata operation 1
db file scattered read 6
enq: US - contention 2
enq: HW - contention 2
log file parallel write 1
buffer busy waits 9
6 rows selected
11:02:57 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
enq: HW - contention 1
log file parallel write 1
undo segment extension 2
log file switch (checkpoint incomplete) 10
buffer busy waits 7
11:03:19 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file sequential read 1
db file scattered read 2
enq: US - contention 5
enq: HW - contention 1
log file parallel write 1
db file parallel write 2
control file sequential read 1
undo segment extension 1
buffer busy waits 9
9 rows selected
11:03:46 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
db file sequential read 1
db file scattered read 5
enq: US - contention 6
latch: redo allocation 2
enq: HW - contention 1
log file parallel write 1
Data file init write 1
buffer busy waits 4
8 rows selected
11:04:42 SQL> r
EVENT COUNT(*)
---------------------------------------------------------------- ----------
在执行期间,在第三个会话中执行:
SQL> select sum(used_ublk) from v$transaction;
SUM(USED_UBLK)
--------------
7938
SQL> r
SUM(USED_UBLK)
--------------
6875
SQL> set time on
10:59:03 SQL> r
SUM(USED_UBLK)
--------------
8027
11:00:15 SQL> r
SUM(USED_UBLK)
--------------
8413
11:00:46 SQL> r
SUM(USED_UBLK)
--------------
13125
11:00:50 SQL> select count(*),sum(used_ublk) from v$transaction;
COUNT(*) SUM(USED_UBLK)
---------- --------------
20 10701
11:01:35 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
20 9650
11:03:15 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
21 5391
11:03:42 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
20 6826
11:04:37 SQL> r
COUNT(*) SUM(USED_UBLK)
---------- --------------
0
在执行期间,在第四个会话中执行:
SQL> select status,count(*) from user_parallel_execute_chunks where task_name='mytask' group by status;
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 19
UNASSIGNED 1635
PROCESSED 515
SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 1475
PROCESSED 674
SQL> set time on
10:59:10 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 1031
PROCESSED 1118
10:59:47 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 825
PROCESSED 1324
11:00:10 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 796
PROCESSED 1353
11:03:12 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 279
PROCESSED 1870
11:03:32 SQL> r
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 18
UNASSIGNED 219
PROCESSED 1932
STATUS COUNT(*)
-------------------- ----------
ASSIGNED 20
UNASSIGNED 61
PROCESSED 2088
11:04:52 SQL> r
STATUS COUNT(*)
-------------------- ----------
11:06:01 SQL>
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/18922393/viewspace-762381/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/18922393/viewspace-762381/