log file sync case sharing

category: oracle database

issue: performance

applies to: architects/support/operations

 

background: The customer had a database performance problem and received complaints from the business. Their own analysis found nothing, so they asked for help.

 

Checking the AWR report for the problem period shows that the main wait event is log file sync. It accounts for more than 64% of the wait time, which is abnormal, although the average wait is not very high. The I/O performance looks good.

node1:

 

node2:

 

"log file sync" is the direct cause.

Checking the ASH dump shows many foreground sessions blocked by the LGWR process, while LGWR itself was not blocked by any other session.

node1:

 

node2:

 

So either LGWR itself was working slowly or something else was slowing it down. We have already confirmed that the I/O performance is good enough, so let's go through the stages of a commit in the database.

1. The user issues a commit.

2. The foreground process posts LGWR to write the redo log.

3. LGWR writes the redo to disk.

4. LMS broadcasts the commit information to the other instances in the cluster (RAC only).

5. LGWR receives the acknowledgement from the other instances (RAC only).

6. LGWR tells the foreground process that the commit has succeeded.

With I/O ruled out, the most probable culprit is the communication between the RAC instances.

Check the efficiency of inter-node communication. "wait for scn ack" is the wait that occurs when the database broadcasts commit information between instances. Its average time is 53 ms, which is not acceptable.
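To see why a 53 ms acknowledgement matters, the commit path can be treated as a sum of stage times. The following is a deliberately simplified sketch in Python: it assumes the stages run strictly one after another, and every figure except the 53 ms "wait for scn ack" average is an invented placeholder, not a measurement from this case.

```python
def dominant_stage(stage_ms):
    """Return (stage name, share of total time) for the slowest stage."""
    total = sum(stage_ms.values())
    name = max(stage_ms, key=stage_ms.get)
    return name, stage_ms[name] / total

# Stage timings in milliseconds. Only the broadcast figure comes from the
# AWR data in this case; the other three are hypothetical placeholders.
stages = {
    "foreground posts LGWR": 0.1,        # hypothetical
    "LGWR writes redo to disk": 1.0,     # hypothetical
    "SCN broadcast and ack": 53.0,       # average "wait for scn ack"
    "LGWR posts foreground": 0.1,        # hypothetical
}

name, share = dominant_stage(stages)
print(f"slowest stage: {name} ({share:.0%} of the modeled commit time)")
```

Under these assumptions the broadcast stage accounts for almost all of the modeled commit time, which is why the interconnect is the next thing to examine.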

 

Check LMS trace to get more information, there are many latency warnings.

*** 2019-08-26 23:11:04.588

Warning: log write broadcast wait time 525ms (SCN 0x0.132e8901)  

*** 2019-08-26 23:11:04.588

Warning: log write broadcast wait time 523ms (SCN 0x0.132e8904)  

*** 2019-08-26 23:11:04.588

Warning: log write broadcast wait time 517ms (SCN 0x0.132e8907)  

*** 2019-08-26 23:11:04.588

Warning: log write broadcast wait time 516ms (SCN 0x0.132e890b)
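When a trace contains many of these warnings, it helps to summarize them rather than read them one by one. Below is a minimal Python sketch that extracts the wait time and SCN from lines in the format shown above. The regular expression is based only on these four sample lines, so other trace layouts may need a different pattern.

```python
import re
from statistics import mean

# Matches lines such as:
#   Warning: log write broadcast wait time 525ms (SCN 0x0.132e8901)
WARNING_RE = re.compile(
    r"log write broadcast wait time (\d+)ms \(SCN (0x[0-9a-fA-F]+\.[0-9a-fA-F]+)\)"
)

def parse_broadcast_warnings(text):
    """Return a list of (wait_ms, scn) tuples found in the trace text."""
    return [(int(ms), scn) for ms, scn in WARNING_RE.findall(text)]

# The four warnings quoted from the LMS trace in this case.
SAMPLE = """\
*** 2019-08-26 23:11:04.588
Warning: log write broadcast wait time 525ms (SCN 0x0.132e8901)
*** 2019-08-26 23:11:04.588
Warning: log write broadcast wait time 523ms (SCN 0x0.132e8904)
*** 2019-08-26 23:11:04.588
Warning: log write broadcast wait time 517ms (SCN 0x0.132e8907)
*** 2019-08-26 23:11:04.588
Warning: log write broadcast wait time 516ms (SCN 0x0.132e890b)
"""

warnings = parse_broadcast_warnings(SAMPLE)
waits = [ms for ms, _ in warnings]
print(f"{len(waits)} warnings, mean {mean(waits)} ms, max {max(waits)} ms")
```

For the four lines above this reports a mean of 520.25 ms, roughly ten times the 53 ms average seen for "wait for scn ack".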

So what is the matter with the private network?

At 625.61 KB/s the traffic is small, so bandwidth on the private network is not the problem.

Estd Interconnect traffic (KB) 625.61
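As a quick sanity check, the traffic figure can be compared with the link capacity. The sketch below assumes a 1 Gbit/s interconnect and 1 KB = 1024 bytes. Neither is stated in the AWR output, so adjust both to match the real environment.

```python
def link_utilization(traffic_kb_per_s, link_gbit_per_s=1.0):
    """Fraction of link capacity used, assuming 1 KB = 1024 bytes."""
    bits_per_s = traffic_kb_per_s * 1024 * 8
    return bits_per_s / (link_gbit_per_s * 1e9)

# 625.61 KB/s is the "Estd Interconnect traffic" value from AWR;
# the 1 Gbit/s link speed is an assumption for illustration.
utilization = link_utilization(625.61)
print(f"interconnect utilization: {utilization:.2%}")
```

Even on a 1 Gbit/s link this is only about half a percent of capacity, so the delay is a latency problem, not a throughput problem.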

The ping latency is abnormal on node 1. Ping runs at the OS level and does not involve the database or the application, so the problem lies below the database.

 

Checking CPU usage, I found plenty of headroom: the CPU is 93% idle.

 

So the root cause is a private network issue. Collecting evidence with the ping command at the OS level helps, in most situations, to stop the network engineers from kicking the ball back to you.
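One way to gather that evidence is to run ping against the private interconnect address at intervals and keep the slow samples. The following Python sketch assumes a Linux-style `ping -c` and its usual `time=0.215 ms` output. The 1 ms threshold and the target address passed on the command line are placeholders to be chosen for the actual environment.

```python
import re
import subprocess
import sys
from datetime import datetime

# Matches the per-packet round-trip time in Linux-style ping output.
RTT_RE = re.compile(r"time[=<]([\d.]+) ?ms")

def parse_rtt_ms(line):
    """Return the round-trip time in ms from one ping reply line, or None."""
    match = RTT_RE.search(line)
    return float(match.group(1)) if match else None

def sample_latency(host, count=5):
    """Run the OS ping command once and return (batch timestamp, RTT list)."""
    started = datetime.now().isoformat(timespec="seconds")
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, check=False,
    )
    rtts = [parse_rtt_ms(line) for line in result.stdout.splitlines()]
    return started, [rtt for rtt in rtts if rtt is not None]

def slow_samples(rtts, threshold_ms=1.0):
    """Return the RTTs above the threshold (1 ms is an assumed limit)."""
    return [rtt for rtt in rtts if rtt > threshold_ms]

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python ping_check.py <private-interconnect-address>
    stamp, rtts = sample_latency(sys.argv[1])
    print(stamp, "samples:", rtts, "slow:", slow_samples(rtts))
```

Running this from both nodes on a schedule, and keeping the output, gives a timestamped record that can be lined up with the LMS warnings.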

 
