addm报告常见解析

wyh_0620

于 2023-02-01 11:38:31 发布

阅读量294

点赞数

分类专栏： ORACLE 文章标签：数据库 sql oracle

本文链接：https://blog.csdn.net/wangyh_0412/article/details/128830071

版权

ORACLE 专栏收录该内容

1 篇文章 0 订阅

订阅专栏

本文解析一个数据量为3T的oracle数据库的每日批处理工作的addm报告~~~

1）addmrpt_1_681_682.txt line 25

RECOMMENDATION 2: DB Configuration, 82% benefit (14715 seconds)
     ACTION: Increase the size of the log files to 2048 M to hold at least 20
        minutes of redo information.

redo log file过小，造成日志切换频繁，addm报告建议每个redo log file扩至2G。（现在是和 1.18一样500M，已经扩完了，即时生效）

2）addmrpt_1_682_683.txt line 21

ACTION: Investigate the SQL statement with SQL_ID "77k4cjhvc5rvz" for
        possible performance improvements.

报告里提出了很多这种sql需要优化的建议，具体可查看各个addm报告搜索【performance improvements】。

3）addmrpt_1_682_683.txt line 565

RATIONALE: SQL statement with SQL_ID "77k4cjhvc5rvz" was executed 1
        times and had an average elapsed time of 17403 seconds.
     RATIONALE: At least one execution of the statement ran in parallel.
     RATIONALE: Waiting for event "PX Deq Credit: send blkd" in wait class
        "Other" accounted for 82% of the database time spent in processing
        the SQL statement with SQL_ID "77k4cjhvc5rvz".
     RATIONALE: Waiting for event "latch: parallel query alloc buffer" in
        wait class "Other" accounted for 3% of the database time spent in
        processing the SQL statement with SQL_ID "77k4cjhvc5rvz".
     RATIONALE: Waiting for event "PX qref latch" in wait class "Other"
        accounted for 2% of the database time spent in processing the SQL
        statement with SQL_ID "77k4cjhvc5rvz".

如上所示，82%消耗在【Waiting for event “PX Deq Credit: send blkd” in wait class “Other”】-------------------这个发生了很多次，在别的时间段也出现过
这种情况就是在对表或者索引查询的时候，并行度过高造成了资源的竞争，导致互相等待时间过长。解决的方法就是降低每个并行执行的并行度，比如对象（表，索引）上预设的并行度或者查询Hint 指定的并行度。

4）addmrpt_1_682_683.txt line 916

RATIONALE: Waiting for event "SQL*Net more data from client" in wait
        class "Network" accounted for 5% of the database time spent in
        processing the SQL statement with SQL_ID "6hz25cqrzp8f1".

这个很显而易见哈，网络等待消耗5%的DBtime

5）addmrpt_1_682_683.txt line 1510

RECOMMENDATION 1: DB Configuration, 0.49% benefit (192 seconds)
     ACTION: Increase the size of the SGA by setting the parameter
        "sga_target" to 76800 M.

addm报告建议把sga扩到76800M，现在是61440M（和1.18一样），已经写完参数了，重启数据库生效。

6）addmrpt_1_683_684.txt line 961

RATIONALE: Waiting for event "db file scattered read" in wait class
        "User I/O" accounted for 41% of the database time spent in processing
        the SQL statement with SQL_ID "6091wkzhftz0h".

这个意思是用户进行离散读次数很多，离散读就是相当于一会读磁盘的东面一会读西面，所以很慢。。。
造成这种报错的原因是全表扫或者快速索引全扫。。。
解决方法：
1.对相应的sql语句进行修改
2.重建索引
3.在WHERE条件中加入更多的索引字段

7）addmrpt_1_686_687.txt line 1590

RATIONALE: The I/O usage statistics for the object are: 1 full object
        scans, 9404941 physical reads, 5 physical writes and 0 direct reads.

显而易见，物理读很高，不知道为什么这个是全表扫的，SQL我没看有没有索引，建议看一下~~~~

8）addmrpt_1_686_687.txt line 2154

Individual SQL statements responsible for significant user I/O wait were
found.
addmrpt_1_686_687.txt line 2426
     SYMPTOM: Wait class "User I/O" was consuming significant database time.
              (17% impact [1696 seconds])

这个上面写的很清楚了，这个SQL对user IO等待消耗负责。。。
而且这一个小时内的报告里面，有很多【Wait class “User I/O”】，可以分析一下这个小时里执行的SQL逻辑~

9）addmrpt_1_688_689.txt line 1418

Time spent on the CPU by the instance was responsible for a substantial part
of database time.

这个在此后2个小时的报告里也出现过
显示的是实例在cpu上消耗的时间过长，执行top看了，cpu使用率和负载很高~

10）addmrpt_1_690_691.txt line 2728

ACTION: Use bigger fetch arrays while fetching results from the SELECT
        statement with SQL_ID "ghx3dgfr1g0by".

这个有个参数，叫arraysize，默认是15，怎么说呢，这个参数是15，那么你扫描一次数据库就给你返回15个数据块，如果这个值很小，那么就会扫描很多次，造成物理读过高，如果过大，
就会造成，需要返回的数据块很多，相应的redo、undo那些乱七八糟的就会消耗很多，然后我改成了100，即时生效。。。

11）addmrpt_1_691_692.txt line 903、2723

RATIONALE: Waiting for event "local write wait" in wait class "User I/O"
        accounted for 2% of the database time spent in processing the SQL
        statement with SQL_ID "av6x7tv72ypdk".

这个的意思就是本地写等待，等待的还是User io，也就是自己等自己。。。
造成这种问题的原因有两种：
1、磁盘有问题了，而且是很严重的问题，比如磁盘控制器坏掉了，但是极少发生。。。
2、truncate、drop表的时候，oracle必须使DATA BUFFER中所有该对象的数据块失效或者刷新到磁盘，步骤为此时请求RO队列锁，找缓冲区中该对象的块，
并使其无效化或者刷新到磁盘，然后释放RO锁，如果多个进程并发地进行TRUNCATE的时候，TRUNCATE等待相关的块刷新到磁盘，就表现为等待时间local write wait.
这就得调整一下truncate的逻辑顺序啥的~