Oracle性能分析(一) DB Time和DB CPU

分析数据库的负载高低,通常可以通过CPU利用率,磁盘响应速度等进行分析,Oracle数据库提供AWR报告,报告的内容很多很多,但是却没有一个定性的指标,能回答:

1. 到底数据库当前的负载是高还是低?

2. 我感觉我数据库很慢,我的数据库硬件是是不是应该升级了?

 

先看一段典型的AWR报告,截取部分片段

 

Host NamePlatformCPUsCoresSocketsMemory (GB)
ABCDLinux x86 64-bit12121294.15
 Snap IdSnap TimeSessionsCursors/SessionInstances
Begin Snap:1364724-Aug-18 00:00:405754.12
End Snap:1369525-Aug-18 00:00:065584.62
Elapsed: 1,439.43 (mins)   
DB Time: 2,762.39 (mins)   

这里有三个关键信息:

1. 首先,我们的服务器是12个CPU,12核

2. 这个AWR报告的区间是从  24-Aug-18 00:00:40 到 25-Aug-18 00:00:06,整个区间是1,439.43 (mins), 24小时。

3. 然后是DB TIME: 2,762.39 (mins), 这是一个关键的性能指标,就是所有数据库请求的运行时间加总。

 

举个栗子,假设一家理发店24小时(1,439.43 (mins))营业,一共有12个理发师(12 CPU),顾客在理发店的累计逗留时间就是 DB TIME:2,762.39 (mins)。

如果拿顾客累计逗留时间除以营业时间,2762.30/1439.43=1.92, 也就是说,在营业时间内,店内平均有1.92个顾客。

顾客的累计逗留时间(DB TIME)不是理发师(CPU)的累计工作时间,如果两个顾客都要烫发,正巧只有一个烫发机,则其中一个顾客需要等待(阻塞),这段时间理发师没有干活,

所以:

DB CPU 是一定会小于 DB Time的, 理发师剪头发的时间一定小于顾客在店内逗留的时间。

 

继续看AWR报告,给出了理发师的工作时间,DB CPUS(S)=0.9 ,我们有12个理发师,也就是全天CPU的利用率是 0.9/12=7.5%

Load Profile

 

 

 

 Per SecondPer TransactionPer ExecPer Call
DB Time(s):1.90.10.000.00
DB CPU(s):0.90.00.000.00
Redo size (bytes):2,286,644.369,730.7  
Logical read (blocks):117,205.03,574.1  
Block changes:9,588.5292.4  
Physical read (blocks):1,768.053.9  
Physical write (blocks):287.08.8  
Read IO requests:201.76.2  
Write IO requests:75.02.3  
Read IO (MB):13.80.4  
Write IO (MB):2.20.1  
Global Cache blocks received:29.10.9  
Global Cache blocks served:37.51.1  
User calls:3,122.895.2  
Parses (SQL):473.314.4  
Hard parses (SQL):9.50.3  
SQL Work Area (MB):15.80.5  
Logons:0.10.0  
Executes (SQL):11,292.7344.4  
Rollbacks:0.70.0  
Transactions:32.8   

 

全天CPU利用率只有7.5%,整体看比较低,但是不意味着数据库的负荷很低,有两种情况:

1. 负荷可能集中在某个时间段,这就需要我们观察每个时间段的数据库负荷。

2. 数据库的瓶颈可能不在CPU,而在磁盘或者网络

1)数据库内存较少,大部分时间消耗在磁盘的读取上,这个可以从磁盘的响应速度判断

2)网络带宽不足,数据库要发送大量数据到客户端,在这里形成了等待。

  

所以现在我们想分时间段获得DB TIME和DB CPU这两个指标,如果需要按每个时间段导出一个AWR报告,这是一件令人抓狂的事,还好,AWR报告实际上就是通过一些数据库视图生成的,这些视图就是数据库定时的快照,所以我们可以利用SQL语句很方便的生成。

首先看下快照的主表:

SELECT * FROM DBA_HIST_SNAPSHOT ORDER BY SNAP_ID

   SNAP_IDDBIDINSTANCE_NUMBERSTARTUP_TIMEBEGIN_INTERVAL_TIMEEND_INTERVAL_TIMEFLUSH_ELAPSEDSNAP_LEVELERROR_COUNTSNAP_FLAGSNAP_TIMEZONE
1136481647988888210-8月 -18   05.20.05.000 下午24-8月 -18   12.00.40.204 上午24-8月 -18   12.30.25.506 上午+00000 00:00:00.5100+0 08:00:00
2136481647988888128-7月 -18   08.01.15.000 下午24-8月 -18   12.00.40.229 上午24-8月 -18   12.30.25.524 上午+00000 00:00:02.5100+0 08:00:00
3136491647988888128-7月 -18   08.01.15.000 下午24-8月 -18   12.30.25.524 上午24-8月 -18   01.00.16.839 上午+00000 00:00:02.2100+0 08:00:00
4136491647988888210-8月 -18   05.20.05.000 下午24-8月 -18   12.30.25.506 上午24-8月 -18   01.00.16.820 上午+00000 00:00:01.2100+0 08:00:00
5136501647948888128-7月 -18   08.01.15.000 下午24-8月 -18   01.00.16.839 上午24-8月 -18   01.30.08.125 上午+00000 00:00:02.9100+0 08:00:00
6136501647988888210-8月 -18   05.20.05.000 下午24-8月 -18   01.00.16.820 上午24-8月 -18   01.30.08.072 上午+00000 00:00:00.9100+0 08:00:00
7136511647988888128-7月 -18   08.01.15.000 下午24-8月 -18   01.30.08.125 上午24-8月 -18   02.00.02.352 上午+00000 00:00:01.6100+0 08:00:00
8136511647988888210-8月 -18   05.20.05.000 下午24-8月 -18   01.30.08.072 上午24-8月 -18   02.00.02.334 上午+00000 00:00:00.7100+0 08:00:00

 通过这个表我们可以判断如下:

1. 半小时一个快照(这个是配置值,可改),快照并不是完全的整点,存在一定的误差,即快照的区间接近半小时,可能会多几十秒或者少几十秒。

2. 这里有两个INSTANCE_NUMBER,因为这个是集群,所以实际上有两个物理机。

 

DBA_HIST_SYS_TIME_MODEL 表里面有我们需要的DB CPU和DB time,这个值是累计值,即需要算一个区间的数,需要那结束时间的值减去开始时间的值。

SELECT * FROM DBA_HIST_SYS_TIME_MODEL
WHERE SNAP_ID='13648'
AND STAT_NAME IN ('DB CPU','DB time')

   SNAP_IDDBIDINSTANCE_NUMBERSTAT_IDSTAT_NAMEVALUE
113648164798888822748282437DB CPU41061757271
213648164798888812748282437DB CPU2214287108077
313648164798888823649082374DB time105530922248
413648164798888813649082374DB time3992370209915

 

 

创建一个表存储原始数据,

CREATE TABLE SYSTEMLOAD AS 
WITH TEMP AS(
  SELECT a.INSTANCE_NUMBER,
         to_number(to_char(A.BEGIN_INTERVAL_TIME,'yyyymmdd')) AS BEGIN_DATE,
         TRUNC(((A.BEGIN_INTERVAL_TIME+0)-trunc(A.BEGIN_INTERVAL_TIME+0))*48) AS SNAPORDER,
         SUBSTR('0' || TRUNC(TRUNC(((A.BEGIN_INTERVAL_TIME+0)-trunc(A.BEGIN_INTERVAL_TIME+0))*48)/2) ||
         (CASE WHEN mod(TRUNC(((A.BEGIN_INTERVAL_TIME+0)-trunc(A.BEGIN_INTERVAL_TIME+0))*48),2)=1 THEN '30' ELSE '00' END),-4)
         AS BEGIN_TIME,
         A.BEGIN_INTERVAL_TIME+0 AS BEGIN_INTERVAL_TIME,
         A.END_INTERVAL_TIME+0 AS END_INTERVAL_TIME,
         ((A.END_INTERVAL_TIME+0)-(A.BEGIN_INTERVAL_TIME+0))*60*60*24 AS TIMEINTERVAL,
         B.SNAP_ID AS STARTSNAP_ID,
         A.SNAP_ID AS ENDSNAP_ID
  FROM DBA_HIST_SNAPSHOT A
  INNER JOIN DBA_HIST_SNAPSHOT B  /*寻找前一个SNAP_ID*/
  ON A.SNAP_ID-1=B.SNAP_ID
  AND A.INSTANCE_NUMBER=B.INSTANCE_NUMBER
  ORDER BY A.SNAP_ID
)
SELECT A.*,
       DBTIME_E.VALUE-DBTIME_S.VALUE AS DBTIME,
       DBCPU_E.VALUE-DBCPU_S.VALUE AS DBCPU,  
       REDOSIZE_E.VALUE-REDOSIZE_S.VALUE AS REDOSIZE,
       PHYSICALREAD_E.VALUE-PHYSICALREAD_S.VALUE AS PHYSICALREAD,
       PHYSICALWRITE_E.VALUE-PHYSICALWRITE_S.VALUE AS PHYSICALWRITE      
FROM TEMP A
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBTIME_S 
ON a.STARTSNAP_ID=DBTIME_S.SNAP_ID
AND A.INSTANCE_NUMBER=DBTIME_S.INSTANCE_NUMBER
AND DBTIME_S.STAT_NAME='DB time'
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBTIME_E
ON a.ENDSNAP_ID=DBTIME_E.SNAP_ID
AND A.INSTANCE_NUMBER=DBTIME_E.INSTANCE_NUMBER
AND DBTIME_E.STAT_NAME='DB time'
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBCPU_S 
ON a.STARTSNAP_ID=DBCPU_S.SNAP_ID
AND A.INSTANCE_NUMBER=DBCPU_S.INSTANCE_NUMBER
AND DBCPU_S.STAT_NAME='DB CPU'
LEFT OUTER JOIN DBA_HIST_SYS_TIME_MODEL  DBCPU_E
ON a.ENDSNAP_ID=DBCPU_E.SNAP_ID
AND A.INSTANCE_NUMBER=DBCPU_E.INSTANCE_NUMBER
AND DBCPU_E.STAT_NAME='DB CPU'
LEFT OUTER JOIN DBA_HIST_SYSSTAT REDOSIZE_S
ON A.STARTSNAP_ID=REDOSIZE_S.SNAP_ID
AND A.INSTANCE_NUMBER=REDOSIZE_S.INSTANCE_NUMBER
AND REDOSIZE_S.STAT_NAME='redo size'
LEFT OUTER JOIN DBA_HIST_SYSSTAT REDOSIZE_E
ON A.ENDSNAP_ID=REDOSIZE_E.SNAP_ID
AND A.INSTANCE_NUMBER=REDOSIZE_E.INSTANCE_NUMBER
AND REDOSIZE_E.STAT_NAME='redo size'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALREAD_S
ON A.STARTSNAP_ID=PHYSICALREAD_S.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALREAD_S.INSTANCE_NUMBER
AND PHYSICALREAD_S.STAT_NAME='physical read bytes'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALREAD_E
ON A.ENDSNAP_ID=PHYSICALREAD_E.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALREAD_E.INSTANCE_NUMBER
AND PHYSICALREAD_E.STAT_NAME='physical read bytes'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALWRITE_S
ON A.STARTSNAP_ID=PHYSICALWRITE_S.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALWRITE_S.INSTANCE_NUMBER
AND PHYSICALWRITE_S.STAT_NAME='physical write bytes'
LEFT OUTER JOIN DBA_HIST_SYSSTAT PHYSICALWRITE_E
ON A.ENDSNAP_ID=PHYSICALWRITE_E.SNAP_ID
AND A.INSTANCE_NUMBER=PHYSICALWRITE_E.INSTANCE_NUMBER
AND PHYSICALWRITE_E.STAT_NAME='physical write bytes'
AND DBTIME_E.VALUE-DBTIME_S.VALUE>0 /*重启可能导致重新累积,不加入这些异常数据*/
--WHERE A.INSTANCE_NUMBER=1
ORDER BY A.STARTSNAP_ID

 

输出单日的运行指标:

SELECT BEGIN_DATE,
       MAX(BEGIN_TIME) AS BEGIN_TIME,
       ROUND(SUM(DBTIME)/SUM(TIMEINTERVAL)/1000000,2) AS AVGDBTIME,
       ROUND(SUM(DBCPU)/SUM(TIMEINTERVAL)/1000000,2) AS AVGDBCPU,
       ROUND(SUM(REDOSIZE)/SUM(TIMEINTERVAL)/1024/1024,2) AS AVGREDOSIZE ,
       ROUND(SUM(PHYSICALREAD)/SUM(TIMEINTERVAL)/1024/1024,2) AS AVGPHYSICALREAD,
       ROUND(SUM(PHYSICALWRITE)/SUM(TIMEINTERVAL)/1024/1024,2) AS AVGPHYSICALWRITE
FROM SYSTEMLOAD
WHERE INSTANCE_NUMBER=1
AND BEGIN_DATE='20180831'
GROUP BY BEGIN_DATE,SNAPORDER
ORDER BY BEGIN_DATE,SNAPORDER

 

以下为输出,可以看出DB TIME和DB CPU其实都不高,如果算CPU的利用率,将AVGDBCPU除以12即可,大概不到20%

   BEGIN_DATEBEGIN_TIMEAVGDBTIMEAVGDBCPUAVGREDOSIZEAVGPHYSICALREADAVGPHYSICALWRITE
12018083101.711.160.981.371.19
220180831300.980.890.310.140.44
3201808311001.310.880.280.120.43
4201808311301.191.060.470.050.55
5201808312001.791.380.494.810.69
6201808312301.751.620.670.580.75
7201808313001.421.330.530.730.73
8201808313301.441.280.595.460.67
9201808314000.960.920.20.020.37
10201808314301.030.950.40.060.38
11201808315001.161.021.031.471.32
12201808315301.1110.40.650.4
13201808316001.741.212.085.511.46
14201808316301.371.130.511.50.57
15201808317001.241.120.410.660.64
16201808317301.291.060.622.060.66
17201808318005.892.023.6443.823.93
18201808318302.281.661.8211.512.21
19201808319003.292.083.6912.463.15
20201808319303.151.77.0827.657.41
212018083110002.371.366.320.125.92
222018083110301.931.15.4625.375.04
232018083111002.541.35.5532.085.92
242018083111301.761.034.8217.465.66
252018083112001.030.572.1918.682.47
262018083112301.220.892.6717.343.04
272018083113001.681.144.7321.534.47
282018083113301.921.054.8321.425.06
292018083114001.881.235.2122.945.14
302018083114301.180.654.0310.044.96
312018083115001.320.622.976.282.83
322018083115301.680.913.8514.675.12
332018083116005.561.055.941.365.46
342018083116302.820.863.347.824.21
352018083117002.851.313.2437.84.28
362018083117302.71.163.4319.224.11
372018083118002.360.732.5725.63.81
382018083118301.821.121.9817.046.45
392018083119001.630.832.3615.453.15
402018083119300.90.481.594.322.08
412018083120001.090.281.019.932.01
422018083120300.610.210.850.211.08
432018083121000.490.210.770.211.03
442018083121300.410.20.961.191.06
452018083122002.350.722.7922.784.89
462018083122300.780.390.828.11.54
472018083123000.20.120.660.530.87
482018083123300.190.10.530.070.77

 

总结:

统计了DB CPU和DB TIME两个关键指标,但是DB CPU是数据库层面的数据,并不是服务器层面的数据,有可能服务器上面还运行了其他程序,占用了大量的CPU,导致DB CPU这个值很低,但是实际上服务器的CPU非常繁忙。下一篇文章将获得服务器层面的CPU利用率。

 

转载于:https://www.cnblogs.com/artmouse/p/9570083.html

  • 0
    点赞
  • 9
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值