[20181129]大量的control file sequential read.txt
--//昨天看一个生产系统的awr报表,发现奇怪的现象,出现大量控制文件读取情况.
1.问题提出:
SYS@baaad1> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
--//摘要awr报表如下:
Top 10 Foreground Events by Total Wait Time
Event Waits Total Wait Time (sec) Wait Avg(ms) % DB time Wait Class
DB CPU 708 89.0
log file sync 20,868 110.3 5 13.9 Commit
control file sequential read 38,851 16.9 0 2.1 System I/O
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gc cr block 2-way 16,066 7.3 0 .9 Cluster
gc current block 2-way 13,385 5.7 0 .7 Cluster
SQL*Net more data to client 559,046 4.6 0 .6 Network
log file switch completion 4 3 757 .4 Configuration
gc current grant busy 7,577 2.9 0 .4 Cluster
db file sequential read 413 2.2 5 .3 User I/O
gc cr multi block request 2,417 1.8 1 .2 Cluster
--//control file sequential read占17秒根本不算什么.
...
IOStat by Filetype summary
'Data' columns suffixed with M,G,T,P are in multiples of 1024 other columns suffixed with K,M,G,T,P are in multiples of 1000
Small Read and Large Read are average service times, in milliseconds
Ordered by (Data Read + Write) desc
Filetype Name Reads: Data Reqs per sec Data per sec Writes: Data Reqs per sec Data per sec Small Read Large Read
Control File 7.6G 18.19 2.151M 239M 3.77 .066M 0.08 0.75
Log File 281M 0.08 .078M 291M 9.54 .081M 0.60 5.74
Data File 19M 0.70 .005M 280M 4.75 .078M 1.52
Archive Log 0M 0.10 0M 280M 0.08 .078M 0.05
TOTAL: 7.9G 19.08 2.235M 1.1G 18.14 .303M 0.14 0.90
--//真心说一下,数据库本身没有什么大问题,典型的大马拉小车(rac环境,内存充足,用户很少).但是奇怪的是为什么控制文件读如此之高(取样1个小时)
--//我看了其它时间段的awr报表,也是一样.为什么?做一些简单探究.
2.探究:
SYS@baaad1> @ ev_name 'control file sequential read'
EVENT# EVENT_ID NAME PARAMETER1 PARAMETER2 PARAMETER3 WAIT_CLASS_ID WAIT_CLASS# WAIT_CLASS
---------- ---------- ---------------------------------------- -------------------- -------------------- -------------------- ------------- ----------- --------------------
81 3213517201 control file sequential read file# block# blocks 4108307767 9 System I/O
# iostat 1 100
Linux 2.6.32-431.el6.x86_64 (baaad1) 11/29/2018 _x86_64_ (32 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.75 0.00 0.99 0.03 0.00 98.23
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
....
sdz 3.46 4.70 3.52 274297736 205429406
...
sdac 15.46 1635.83 37.78 95396336257 2203177194
sdad 12.17 1356.83 58.67 79125895811 3421575617
sdae 15.25 1803.30 58.63 105162260680 3419328124
...
--//可以发现主要集中在sda[cde]磁盘上.
# dstat -t -r -d -D sdac,sdad,sdae,total 1 100
----system---- --io/sdac-----io/sdad-----io/sdae-----io/total- --dsk/sdac----dsk/sdad----dsk/sdae---dsk/total-
date/time | read writ: read writ: read writ: read writ| read writ: read writ: read writ: read writ
29-11 09:23:21|13.4 2.08 :9.64 2.53 :12.7 2.58 :37.9 21.4 | 818k 19k: 678k 29k: 902k 29k:2402k 230k
29-11 09:23:22|4.00 9.00 :2.00 3.00 :3.00 4.00 :11.0 18.0 | 64k 25k: 32k 32k: 48k 40k: 145k 101k
29-11 09:23:23| 0 9.00 : 0 3.00 :1.00 1.00 :3.00 18.0 | 0 32k: 0 32k: 16k 16k: 17k 96k
29-11 09:23:24| 114 3.00 :74.0 1.00 :82.0 1.00 : 272 14.0 |7808k 3072B:6769k 16k:8416k 16k: 22M 412k
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29-11 09:23:25|6.00 7.00 :4.00 3.00 :4.00 3.00 :16.0 19.0 | 96k 46k: 64k 48k: 64k 48k: 225k 163k
29-11 09:23:26|60.0 15.0 :50.0 1.00 :64.0 1.00 : 176 44.0 |4400k 21k:4097k 16k:5128k 16k: 13M 393k
29-11 09:23:27| 0 9.00 : 0 1.00 :2.00 1.00 :4.00 13.0 | 0 10k: 0 16k: 32k 16k: 33k 47k
29-11 09:23:28|12.0 9.00 :6.00 3.00 :7.00 3.00 :27.0 20.0 | 192k 71k: 96k 48k: 112k 48k: 401k 183k
29-11 09:23:29| 0 6.00 : 0 1.00 :2.00 1.00 :4.00 18.0 | 0 3584B: 0 16k: 32k 16k: 33k 140k
29-11 09:23:30| 0 2.00 :1.00 1.00 :2.00 1.00 :5.00 8.00 | 0 1536B: 16k 16k: 32k 16k: 49k 46k
29-11 09:23:31|6.00 8.00 :3.00 3.00 :3.00 3.00 :14.0 23.0 | 96k 47k: 48k 48k: 48k 48k: 193k 179k
29-11 09:23:32| 0 2.00 : 0 1.00 :2.00 1.00 :4.00 16.0 | 0 1536B: 0 16k: 32k 16k: 33k 302k
29-11 09:23:33|6.00 5.00 :3.00 1.00 :4.00 1.00 :15.0 9.00 | 96k 3072B: 48k 16k: 64k 16k: 209k 40k
29-11 09:23:34|10.0 15.0 :5.00 1.00 :6.00 1.00 :23.0 33.0 | 160k 53k: 80k 16k: 96k 16k: 337k 221k
29-11 09:23:35| 0 8.00 :1.00 2.00 :2.00 1.00 :5.00 13.0 | 0 5632B: 16k 24k: 32k 16k: 49k 50k
29-11 09:23:36| 0 6.00 : 0 1.00 :1.00 1.00 :3.00 15.0 | 0 11k: 0 16k: 16k 16k: 17k 68k
29-11 09:23:37|35.0 11.0 :11.0 4.00 :6.00 6.00 :54.0 32.0 | 560k 48k: 176k 56k: 96k 192k: 833k 573k
29-11 09:23:38|1.00 9.00 :1.00 4.00 :4.00 3.00 :8.00 20.0 | 16k 98k: 16k 48k: 80k 40k: 113k 199k
29-11 09:23:39| 0 6.00 : 0 1.00 :2.00 2.00 :4.00 18.0 | 0 28k: 0 16k: 16k 24k: 17k 129k
29-11 09:23:40|5.00 10.0 :3.00 1.00 :2.00 2.00 :12.0 18.0 | 80k 17k: 48k 16k: 32k 24k: 161k 74k
29-11 09:23:41| 114 5.00 :74.0 1.00 :81.0 1.00 : 271 27.0 |7808k 3072B:6769k 16k:8400k 16k: 22M 368k
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29-11 09:23:42| 0 10.0 : 0 4.00 :2.00 1.00 :4.00 28.0 | 0 74k: 0 40k: 32k 16k: 33k 398k
29-11 09:23:43|66.0 21.0 :53.0 7.00 :66.0 8.00 : 187 42.0 |4496k 101k:4145k 80k:5168k 112k: 13M 314k
29-11 09:23:44|2.00 15.0 :1.00 4.00 :2.00 6.00 :7.00 238 | 32k 59k: 16k 40k: 32k 56k: 81k 1663k
29-11 09:23:45| 0 7.00 :1.00 1.00 :2.00 1.00 :5.00 14.0 | 0 5632B: 16k 16k: 32k 16k: 49k 54k
29-11 09:23:46|4.00 7.00 :2.00 1.00 :2.00 2.00 :10.0 90.0 | 64k 10k: 32k 16k: 32k 32k: 129k 539k
29-11 09:23:47| 0 2.00 : 0 1.00 :2.00 3.00 :4.00 18.0 | 0 1536B: 0 16k: 32k 24k: 33k 286k
29-11 09:23:48|2.00 7.00 :1.00 4.00 :4.00 3.00 :9.00 15.0 | 32k 66k: 16k 56k: 64k 48k: 113k 171k
29-11 09:23:49|5.00 10.0 :3.00 2.00 :4.00 2.00 :14.0 33.0 | 80k 72k: 48k 32k: 64k 32k: 193k 340k
29-11 09:23:50| 0 3.00 :1.00 1.00 :2.00 2.00 :5.00 7.00 | 0 2048B: 16k 16k: 32k 24k: 49k 43k
29-11 09:23:51| 0 6.00 : 0 1.00 :2.00 1.00 :4.00 16.0 | 0 3584B: 0 16k: 32k 16k: 33k 68k
29-11 09:23:52|4.00 8.00 :2.00 1.00 :2.00 1.00 :10.0 21.0 | 64k 8192B: 32k 16k: 32k 16k: 129k 337k
29-11 09:23:53| 0 8.00 : 0 1.00 :2.00 1.00 :4.00 15.0 | 0 8192B: 0 16k: 32k 16k: 33k 57k
29-11 09:23:54| 0 5.00 : 0 1.00 :1.00 2.00 :3.00 16.0 | 0 3072B: 0 16k: 16k 24k: 17k 116k
29-11 09:23:55|6.00 12.0 :4.00 7.00 :4.00 4.00 :16.0 29.0 | 96k 72k: 64k 80k: 64k 56k: 225k 228k
29-11 09:23:56| 0 9.00 : 0 5.00 :2.00 1.00 :4.00 18.0 | 0 22k: 0 48k: 32k 16k: 33k 122k
29-11 09:23:57| 0 9.00 : 0 1.00 :1.00 2.00 :3.00 23.0 | 0 5632B: 0 16k: 16k 24k: 17k 294k
29-11 09:23:58| 104 8.00 :77.0 3.00 : 101 3.00 : 284 19.0 |7648k 78k:6817k 48k:8720k 48k: 23M 190k
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
29-11 09:23:59| 0 4.00 : 0 2.00 :2.00 1.00 :4.00 9.00 | 0 2560B: 0 32k: 32k 16k: 33k 55k
29-11 09:24:00|60.0 7.00 :51.0 1.00 :63.0 1.00 : 176 24.0 |4400k 28k:4113k 16k:5120k 16k: 13M 184k
29-11 09:24:01|7.00 11.0 :3.00 3.00 :3.00 3.00 :15.0 31.0 | 112k 48k: 48k 48k: 48k 48k: 209k 273k
29-11 09:24:02| 0 6.00 : 0 1.00 :1.00 1.00 :3.00 16.0 | 0 3584B: 0 16k: 16k 16k: 17k 272k
29-11 09:24:03| 0 10.0 : 0 3.00 :2.00 5.00 :4.00 26.0 | 0 66k: 0 32k: 32k 48k: 33k 250k
29-11 09:24:04|5.00 9.00 :2.00 8.00 :2.00 4.00 :11.0 26.0 | 80k 46k: 32k 80k: 32k 40k: 145k 183k
29-11 09:24:05| 0 17.0 :1.00 8.00 :1.00 1.00 :4.00 35.0 | 0 64k: 16k 72k: 16k 16k: 33k 252k
29-11 09:24:06| 0 9.00 : 0 1.00 :2.00 1.00 :4.00 21.0 | 0 8704B: 0 16k: 32k 16k: 33k 133k
29-11 09:24:07|7.00 11.0 :3.00 8.00 :2.00 4.00 :14.0 35.0 | 112k 56k: 48k 88k: 32k 56k: 193k 457k
29-11 09:24:08|5.00 13.0 :2.00 5.00 :6.00 5.00 :15.0 27.0 | 80k 131k: 32k 80k: 96k 80k: 209k 304k
29-11 09:24:09| 0 7.00 : 0 1.00 :2.00 2.00 :4.00 12.0 | 0 12k: 0 16k: 32k 32k: 33k 64k
29-11 09:24:10|5.00 10.0 :3.00 2.00 :1.00 2.00 :11.0 23.0 | 80k 9216B: 48k 24k: 16k 24k: 145k 134k
29-11 09:24:11| 0 6.00 : 0 3.00 :2.00 1.00 :4.00 37.0 | 0 11k: 0 32k: 32k 16k: 33k 340k
29-11 09:24:12| 0 15.0 : 0 1.00 :2.00 1.00 :4.00 21.0 | 0 18k: 0 16k: 32k 16k: 33k 215k
29-11 09:24:13|7.00 6.00 :3.00 4.00 :2.00 3.00 :14.0 24.0 | 112k 38k: 48k 56k: 32k 48k: 193k 239k
29-11 09:24:14|2.00 8.00 :1.00 1.00 :3.00 1.00 :8.00 21.0 | 32k 30k: 16k 16k: 48k 16k: 97k 386k
29-11 09:24:15| 111 5.00 :75.0 1.00 :84.0 1.00 : 272 13.0 |7760k 3072B:6785k 16k:8448k 16k: 22M 56k
29-11 09:24:16|5.00 5.00 :2.00 1.00 :2.00 1.00 :11.0 11.0 | 80k 6656B: 32k 16k: 32k 16k: 145k 51k
29-11 09:24:17|68.0 6.00 :50.0 1.00 :56.0 1.00 : 176 22.0 |4528k 3584B:4097k 16k:5008k 16k: 13M 372k
29-11 09:24:18|3.00 8.00 :1.00 4.00 :2.00 4.00 :8.00 17.0 | 48k 36k: 16k 64k: 32k 64k: 97k 164k^C
--//可以看出大约17-18秒存在大量的读操作.22M之后,停顿1秒,又有13M的读操作.
--//awr报表1个小时3600秒,3600/18*(23+13) = 7623M,接近上面报表看到的情况.
--//使用iotop -a 排序磁盘读,结果如下:
Total DISK READ: 2.22 K/s | Total DISK WRITE: 28.89 K/s
TID PRIO USER DISK READ> DISK WRITE SWAPIN IO COMMAND
28909 be/4 oracle 5.34 M 0.00 B 0.00 % 0.04 % ora_lmon_baaad1
28946 be/4 oracle 4.48 M 6.22 M 0.00 % 0.06 % ora_ckpt_baaad1
28944 be/4 oracle 4.47 M 5.66 M 0.00 % 0.32 % ora_lgwr_baaad1
29476 be/4 oracle 1872.00 K 0.00 B 0.00 % 0.01 % ora_arc2_baaad1
29474 be/4 oracle 768.00 K 0.00 B 0.00 % 0.00 % ora_arc1_baaad1
6018 rt/4 grid 476.00 K 0.00 B 0.00 % 0.05 % ocssd.bin
--//感觉不对,不大可能这么少.我监测时间至少1分钟.
--//看另外的生产系统:
SYS@good1> select session_type,count(*),min(sample_time),max(sample_time) from V$ACTIVE_SESSION_HISTORY where event = 'control file sequential read' group by session_type;
SESSION_TY COUNT(*) MIN(SAMPLE_TIME) MAX(SAMPLE_TIME)
---------- ---------- ------------------------ -----------------------
BACKGROUND 278 2018-11-28 07:33:13.367 2018-11-29 09:31:41.357
--//基本都是后台操作.
--//有问题系统:
SYS@baaad1> select session_type,count(*),min(sample_time),max(sample_time) from V$ACTIVE_SESSION_HISTORY where event = 'control file sequential read' group by session_type;
SESSION_TY COUNT(*) MIN(SAMPLE_TIME) MAX(SAMPLE_TIME)
---------- ---------- ----------------------- ------------------------
FOREGROUND 8547 2018-11-07 20:14:45.402 2018-11-29 09:32:46.703
BACKGROUND 3048 2018-11-07 20:18:18.712 2018-11-29 09:24:23.885
--//奇怪为什么存在前台的control file sequential read等待事件.
SELECT machine, COUNT (*)
FROM V$ACTIVE_SESSION_HISTORY
WHERE event = 'control file sequential read' AND session_type = 'FOREGROUND'
GROUP BY machine
5 ORDER BY 2 DESC;
MACHINE COUNT(*)
-------------------- ----------
dbra 8272
baaad1 267
--//主要来自主机dbra的机器.噢这时我才想起来,我们这套系统引入一个变态的dg系统.一定是这台机器不停的发出一些sql语句,需要大量的读取控制文件信息,
--//而导致的情况.真心的无语....国内一个公司打着所谓高科技专利的东西,做出来的东西真tmd的垃圾!!!
SELECT *
FROM ( SELECT sql_id, session_type, COUNT (*)
FROM V$ACTIVE_SESSION_HISTORY
WHERE event = 'control file sequential read' AND machine = 'dbra'
GROUP BY sql_id, session_type
ORDER BY 3 DESC)
WHERE ROWNUM <= 10;
SQL_ID SESSION_TY COUNT(*)
------------- ---------- ----------
6kkfgfjybkpfk FOREGROUND 3219
cwtrxyuuj3g9f FOREGROUND 12
4r17sm64wjavh FOREGROUND 11
9cqy7104jtqv8 FOREGROUND 10
gb34cg8jky3zv FOREGROUND 10
82tkft36vtcmy FOREGROUND 10
4661q3grpwg82 FOREGROUND 10
4fghushmxbv20 FOREGROUND 10
dtd4kmuz6bf80 FOREGROUND 10
bkgd2n3hyp7f5 FOREGROUND 10
10 rows selected.
--//主要集中在sql_id=6kkfgfjybkpfk.
SYS@baaad1> @ sql_id 6kkfgfjybkpfk
old 1: select sql_id,sql_fulltext sqltext from gv$sqlarea where sql_id='&1'
new 1: select sql_id,sql_fulltext sqltext from gv$sqlarea where sql_id='6kkfgfjybkpfk'
SQL_ID SQLTEXT
------------- --------------------------------------------------------------------------------------------------------------------
6kkfgfjybkpfk select 'flashpct;'||trim(ceil(percent_space_used)) from V$FLASH_RECOVERY_AREA_USAGE where file_type='FLASHBACK LOG'
union all
select 'flashtime;'||to_char(OLDEST_FLASHBACK_TIME,'yyyy-mm-dd hh24:mi:ss') from V$FLASHBACK_DATABASE_LOG
SYS@baaad1> @ viewsess 'physical read total bytes'
NAME STATISTIC# VALUE SID
---------------------------------------------------------------------- ---------- ---------- ----------
physical read total bytes optimized 53 0 412
physical read total bytes 55 0 412
select 'flashpct;'||trim(ceil(percent_space_used)) from V$FLASH_RECOVERY_AREA_USAGE where file_type='FLASHBACK LOG'
union all
select 'flashtime;'||to_char(OLDEST_FLASHBACK_TIME,'yyyy-mm-dd hh24:mi:ss') from V$FLASHBACK_DATABASE_LOG
;
'FLASHPCT;'||TRIM(CEIL(PERCENT_SPACE_USED))
-------------------------------------------------
flashpct;0
SYS@baaad1> @ viewsess 'physical read total bytes'
NAME STATISTIC# VALUE SID
---------------------------------------------------------------------- ---------- ---------- ----------
physical read total bytes optimized 53 0 412
physical read total bytes 55 13926400 412
--//看看一次访问导致的物理读,13926400/1024/1024 = 13.28M
--//当我查询如下,sample_time时间按照前面dstat确定:
SELECT sample_time
,sql_id
,machine
,event
FROM V$ACTIVE_SESSION_HISTORY
WHERE sample_time BETWEEN '2018-11-29 09:23:38' AND '2018-11-29 09:23:42';
SAMPLE_TIME SQL_ID MACHINE EVENT
----------------------- ------------- -------------------- -------
2018-11-29 09:23:40.825 6kzu9bz2qf6vc dbra
--//运气好,抓到1个.不一定每次都能抓到.
SYS@baaad1> @ sql_id 6kzu9bz2qf6vc
SQL_ID SQLTEXT
------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
6kzu9bz2qf6vc select 'archinfo;'||name||';'||dest_id||';'||thread#||';'||sequence#||';'||to_char(FIRST_CHANGE#)||';'||to_char(NEXT_CHANGE#)||';'||status||';'||to_char(completion_time,'yyyy-mm-dd hh24:mi:ss')||';'||
DELETED from v$archived_Log where next_change#>26408430794 and RESETLOGS_CHANGE# in (select resetlogs_change# from v$database) and deleted='NO' and STANDBY_DEST='NO' and status='A'
union all
select 'scninfo;'||to_char(current_scn) from v$database
union all
select 'dbidinfo;'||to_char(dbid)||';'||to_char(resetlogs_change#) from v$database
union all
select 'loginfo;'||group#||';'||thread#||';'||sequence#||';'||to_char(bytes)||';'||members||';'||archived||';'||status||';'||to_char(first_change#)||';'||to_char(first_time,'yyyymmdd hh24:mi:ss')||';'
||to_char(sysdate,'yyyy-mm-dd hh24:mi:ss') from v$Log
union all
select 'tpsinfo;'||ts#||';'||name||';'||included_in_database_backup from v$tablespace
union all
select 'userinfo;'||username||';'||user_id||';'||account_status||';'||to_char(lock_date,'yyyy-mm-dd hh24:mi:ss')||';'||to_char(expiry_date,'yyyy-mm-dd hh24:mi:ss')||';'||default_tablespace||';'||
temporary_tablespace from dba_users
union all
select 'controlinfo;'||NAME from v$controlfile
union all
select 'delarchinfo;'||to_char(min(next_change#)) from v$archived_Log where RESETLOGS_CHANGE# in (select resetlogs_change# from v$database) and deleted='NO' and STANDBY_DEST='NO' and status='A'
--//查询到sql_id=6kzu9bz2qf6vc.问题在于这些语句还没有使用绑定变量.当你查询如下.
SELECT distinct plan_hash_value
FROM DBA_HIST_SQL_PLAN
WHERE sql_id IN ('cwtrxyuuj3g9f'
,'4r17sm64wjavh'
,'9cqy7104jtqv8'
,'gb34cg8jky3zv'
,'82tkft36vtcmy'
,'4661q3grpwg82'
,'4fghushmxbv20'
,'dtd4kmuz6bf80'
,'bkgd2n3hyp7f5')
PLAN_HASH_VALUE
---------------
3729576189
--//执行计划都是一样的实际上执行都是类似sql_id=6kzu9bz2qf6vc一样的语句.仅仅next_change#>26408430794不同罢了.可以通过如下语句确定:
select * from DBA_HIST_SQLTEXT
WHERE sql_id IN ('cwtrxyuuj3g9f'
,'4r17sm64wjavh'
,'9cqy7104jtqv8'
,'gb34cg8jky3zv'
,'82tkft36vtcmy'
,'4661q3grpwg82'
,'4fghushmxbv20'
,'dtd4kmuz6bf80'
,'bkgd2n3hyp7f5');
--//结果不再贴出了.
--//我自己手工执行:
SYS@baaad1> @ viewsess 'physical read total bytes'
NAME STATISTIC# VALUE SID
---------------------------------------------------------------------- ---------- ---------- ----------
physical read total bytes optimized 53 0 412
physical read total bytes 55 0 412
--//执行sql_id=6kzu9bz2qf6vc.语句....
SYS@baaad1> @ viewsess 'physical read total bytes'
NAME STATISTIC# VALUE SID
---------------------------------------------------------------------- ---------- ---------- ----------
physical read total bytes optimized 53 0 412
physical read total bytes 55 23625728 412
--//物理读比前面的还高.23625728/1024/1024 = 22.53M.
--//不知道大家看出规律来没有,1个22.52M,1个13.28M.就是这2个sql语句不断执行,间隔大约17-18秒.正好与我前面使用dstat看到的结果
--//相符合,频繁调用读取控制文件导致的.
--//这也看出asm一些"弊端"(纯粹个人的看法)^_^.
--//使用asm,由于没有文件系统的缓存,基本是直接读取磁盘设备,控制文件信息不会进入数据缓存,这样导致大量的物理读控制文件.
--//频繁调用,导致awr报表看到的情况.
--//补充1个使用文件系统的例子:
SYS@zzz430> select * from v$version where rownum=1;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
SYS@zzz430> show parameter filesystem
NAME TYPE VALUE
-------------------- ------ -------
filesystemio_options string ASYNCH
--//反复调用如下语句:
select 'flashpct;'||trim(ceil(percent_space_used)) from V$FLASH_RECOVERY_AREA_USAGE where file_type='FLASHBACK LOG'
union all
select 'flashtime;'||to_char(OLDEST_FLASHBACK_TIME,'yyyy-mm-dd hh24:mi:ss') from V$FLASHBACK_DATABASE_LOG
;
# dstat -t -d -D sda1,sda3,total
-----time----- --dsk/sda1----dsk/sda3---dsk/total-
date/time | read writ: read writ: read writ
29-11 11:07:22| 490B 5793B: 23k 52k: 47k 116k
29-11 11:07:23| 0 0 : 40k 88k: 80k 176k
29-11 11:07:24| 0 0 : 0 36k: 0 72k
29-11 11:07:25| 0 0 : 0 156k: 0 312k
29-11 11:07:26| 0 0 : 0 8192B: 0 16k
29-11 11:07:27| 0 160k: 0 52k: 0 424k
29-11 11:07:28| 0 0 :8192B 112k: 16k 224k
29-11 11:07:29| 0 0 : 0 100k: 0 200k
29-11 11:07:30| 0 0 : 0 44k: 0 88k
29-11 11:07:31| 0 0 :8192B 92k: 16k 184k
29-11 11:07:32| 0 0 : 0 92k: 0 184k
29-11 11:07:33| 0 0 : 0 44k: 0 88k
29-11 11:07:34| 0 0 : 0 52k: 0 104k
29-11 11:07:35| 0 0 : 0 44k: 0 88k
29-11 11:07:36| 0 0 : 0 56k: 0 112k
29-11 11:07:37| 0 0 : 0 184k: 0 368k
29-11 11:07:38| 0 0 : 0 40k: 0 80k
29-11 11:07:39| 0 12k: 16k 64k: 32k 152k
29-11 11:07:40| 0 0 : 0 32k: 0 64k
--// 数据库在sda3磁盘上你可以发现根本没有读操作.
# ls -1 /u01/app/oracle/oradata/zzz430/control0*.ctl | xargs -I {} cachestats {}
/u01/app/oracle/oradata/zzz430/control01.ctl pages in cache: 2876/2884 (99.7%) [filesize=11536.0K, pagesize=4K]
/u01/app/oracle/oradata/zzz430/control02.ctl pages in cache: 260/2884 (9.0%) [filesize=11536.0K, pagesize=4K]
--//你可以发现文件系统已经缓存了控制文件control01.ctl.反复调用不再存在物理读.
--//实际上如果你修改oracle参数filesystemio_options=setall,directio,就会看到物理读.再做一个测试:
--//在测试系统进行,因为改参数filesystemio_options要重启数据库.
SYS@book> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
SYS@book> show parameter filesystem
NAME TYPE VALUE
-------------------- ------ ---------
filesystemio_options string directio
$ cat c.txt
select 'flashpct;'||trim(ceil(percent_space_used)) from V$FLASH_RECOVERY_AREA_USAGE where file_type='FLASHBACK LOG'
union all
select 'flashtime;'||to_char(OLDEST_FLASHBACK_TIME,'yyyy-mm-dd hh24:mi:ss') from V$FLASHBACK_DATABASE_LOG
;
host sleep 0.99
--//执行如下:
sqlplus / as sysdba <<EOF
$(seq 1 100 | xargs -I {} cat c.txt)
EOF
# dstat -t -d -D cciss/c0d0p6,total
-----time----- dsk/cciss/c--dsk/total-
date/time | read writ: read writ
29-11 11:55:26| 784k 33k:1568k 65k
29-11 11:55:27| 784k 33k:1568k 65k
29-11 11:55:28| 848k 33k:1696k 65k
29-11 11:55:29| 784k 85k:1568k 169k
29-11 11:55:30| 784k 33k:1568k 65k
29-11 11:55:31| 848k 33k:1696k 185k
29-11 11:55:32| 784k 33k:1568k 65k
29-11 11:55:33| 784k 33k:1568k 65k
29-11 11:55:34| 848k 45k:1696k 89k
29-11 11:55:35| 784k 65k:1568k 129k
29-11 11:55:36| 784k 33k:1568k 65k
29-11 11:55:37| 848k 33k:1696k 113k
29-11 11:55:38| 784k 33k:1568k 65k
29-11 11:55:39| 848k 45k:1696k 89k
--//1秒调用一次.基本每秒读取784K.
--//??这个版本的dstat可能有bug,看到total加倍了.
SYS@book> @ viewsess 'physical read total bytes'
NAME STATISTIC# VALUE SID
---------------------------------------------------------------------- ---------- ---------- ----------
physical read total bytes optimized 53 0 28
physical read total bytes 55 0 28
SYS@book> @ b.txt
'FLASHPCT;'||TRIM(CEIL(PERCENT_SPACE_USED))
-------------------------------------------------
flashpct;0
SYS@book> @ viewsess 'physical read total bytes'
NAME STATISTIC# VALUE SID
---------------------------------------------------------------------- ---------- ---------- ----------
physical read total bytes optimized 53 0 28
physical read total bytes 55 802816 28
--//802816/1024 = 784k,基本符合测试结果.
--//换成filesystemio_options=setall也一样.贴出dstat的输出.
SYS@book> show parameter filesystem
NAME TYPE VALUE
-------------------- ------ -------
filesystemio_options string setall
--//执行如下:
sqlplus / as sysdba <<EOF
$(seq 1 30 | xargs -I {} cat c.txt)
EOF
# dstat -t -d -D cciss/c0d0p6,total
-----time----- dsk/cciss/c--dsk/total-
date/time | read writ: read writ
29-11 15:45:05| 784k 33k:1568k 65k
29-11 15:45:06| 848k 33k:1696k 193k
29-11 15:45:07| 784k 45k:1568k 89k
29-11 15:45:08| 784k 33k:1568k 65k
29-11 15:45:09| 848k 33k:1696k 65k
29-11 15:45:10| 784k 33k:1568k 65k
29-11 15:45:11| 784k 33k:1568k 121k
29-11 15:45:12| 848k 45k:1696k 89k
29-11 15:45:13| 784k 33k:1568k 65k
29-11 15:45:14| 784k 33k:1568k 65k
29-11 15:45:15| 848k 33k:1696k 65k
29-11 15:45:16| 784k 33k:1568k 65k
29-11 15:45:17| 848k 45k:1696k 89k
29-11 15:45:18| 848k 33k:1696k 65k
29-11 15:45:19| 784k 33k:1568k 65k
29-11 15:45:20| 784k 33k:1568k 65k
29-11 15:45:21| 848k 33k:1696k 89k
29-11 15:45:22| 784k 45k:1568k 89k
29-11 15:45:23| 784k 33k:1568k 65k
29-11 15:45:24| 848k 33k:1696k 65k
29-11 15:45:25| 784k 33k:1568k 65k^C
总结:
--//实际上对于国内所谓的公司真心无语,打着高科技的幌子骗人.
--//如果要是N年以前的硬件设备,这样的应用可能已经"不堪重负"了.
--//再回头看awr报表发现:
Elapsed Time (s) Executions Elapsed Time per Exec (s) %Total %CPU %IO SQL Id SQL Module SQL Text
...
13.72 211 0.07 1.72 57.15 2.68 6kkfgfjybkpfk SQL*Plus select 'flashpct;'||trim(ceil(...
11.66 102 0.11 1.46 56.38 1.74 bf2j0w3avts8g SQL*Plus select 'archinfo;'||name||';'|...
10.23 97 0.11 1.28 61.88 0.93 6v7rcfxb9mftr SQL*Plus select 'archinfo;'||name||';'|...
--//执行sql_id=6kkfgfjybkpfk,211次.因为对方的dg是日志传输完后在应用日志,为了避免长时间日志不应用.设置log_checkpoint_timeout=1800,
--//也就是30分钟切换1次.这样 102+97 = 199,我估计还漏掉1个类似的sql语句.
SYS@baaad1> show parameter log_checkpoint_timeout
NAME TYPE VALUE
---------------------- ------- ------
log_checkpoint_timeout integer 1800
--//3600/211 = 17.06161137440758293838 基本17秒循环执行1次.
--//这样前面的计算做一些简单修正:
--//3600/17.06*(22.53+13.28) = 7556.62M,这样就非常接近awr报表的统计数量了.
--//在SQL ordered by Physical Reads (UnOptimized)
UnOptimized Read Reqs Physical Read Reqs Executions UnOptimized Reqs per Exec %Opt %Total SQL Id SQL Module SQL Text
15,724 15,724 211 74.52 0.00 662.06 6kkfgfjybkpfk SQL*Plus select 'flashpct;'||trim(ceil(...
10,307 10,307 102 101.05 0.00 433.98 bf2j0w3avts8g SQL*Plus select 'archinfo;'||name||';'|...
9,603 9,603 97 99.00 0.00 404.34 6v7rcfxb9mftr SQL*Plus select 'archinfo;'||name||';'|...
--//awr报表时间段是2018-11-26 10:00 到 2018-11-26 11:00,看看alert.log文件:
Mon Nov 26 10:01:15 2018
LGWR: Standby redo logfile selected for thread 1 sequence 38454 for destination LOG_ARCHIVE_DEST_2
Thread 1 advanced to log sequence 38454 (LGWR switch)
Current log# 1 seq# 38454 mem# 0: +DATA/baaad/onlinelog/redo01.redo
Mon Nov 26 10:01:16 2018
LNS: Standby redo logfile selected for thread 1 sequence 38454 for destination LOG_ARCHIVE_DEST_3
Mon Nov 26 10:01:16 2018
Archived Log entry 151141 added for thread 1 sequence 38453 ID 0x3647f882 dest 1:
Mon Nov 26 10:31:17 2018
LGWR: Standby redo logfile selected for thread 1 sequence 38455 for destination LOG_ARCHIVE_DEST_2
Thread 1 advanced to log sequence 38455 (LGWR switch)
Current log# 2 seq# 38455 mem# 0: +DATA/baaad/onlinelog/redo02.redo
Mon Nov 26 10:31:18 2018
Archived Log entry 151149 added for thread 1 sequence 38454 ID 0x3647f882 dest 1:
Mon Nov 26 10:31:18 2018
LNS: Standby redo logfile selected for thread 1 sequence 38455 for destination LOG_ARCHIVE_DEST_3
Mon Nov 26 11:01:16 2018
LGWR: Standby redo logfile selected for thread 1 sequence 38456 for destination LOG_ARCHIVE_DEST_2
--//发生了2次切换.时间并没有对齐整点.这样类似sql_id=bf2j0w3avts8g,6v7rcfxb9mftr漏掉几次.
--//10:01:16, 漏掉76秒,76/17.06 = 4.45.有点差距.不探究了.
SELECT *
FROM V$ACTIVE_SESSION_HISTORY
WHERE sql_id IN ('6kkfgfjybkpfk', 'bf2j0w3avts8g', '6v7rcfxb9mftr')
AND sample_time BETWEEN '2018-11-26 10:00:00'
AND '2018-11-26 11:00:00'
order by 2 ;
--//看到session_id,session_serial#不一样的,好像是分开执行的.不再探究这些细节了.这也是为什么前面使用iotop看不到的原因,因
--//为很快执行完成退出了.
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/267265/viewspace-2222146/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/267265/viewspace-2222146/