log file sync

ckth47350

I: About log file sync

II: Approaches to resolving log file sync

III: Official documentation on log file sync

I: About log file sync

https://support.oracle.com/epmos/faces/DocContentDisplay?_afrLoop=241920801686769&id=1376916.1&_afrWindowMode=0&_adf.ctrl-state=11enkcwiqc_131

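What is a log file sync wait?

(1) When a user process commits, all redo that the session's transaction generated in the redo log buffer is flushed to the redo log file, which guarantees the durability of the transaction.

(2) On commit, the user process posts the LGWR process to write the redo log buffer to the redo log file. Once LGWR has finished the write, it posts the user process back. Until that notification arrives, the user process waits on log file sync.

(3) The log file sync wait time is therefore the interval from the moment the user process posts LGWR until LGWR has completed the write and notified the user process.

(4) In 11.2 and later, LGWR log writes gained a polling mechanism; earlier releases only had the post/wait mechanism.

The polling mechanism can itself cause log file sync waits.

Scenarios that can produce log file sync waits:

(1) Overly frequent commit/rollback operations

(2) Poor I/O performance on the disks that hold the redo log files

(3) Adaptive Log File Sync (disabling it is also recommended on 11.2.0.3, where it is enabled by default)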

Adaptive Log File sync was introduced in 11.2. The parameter controlling this feature, _use_adaptive_log_file_sync, is set to false by default in 11.2.0.1 and 11.2.0.2.
In 11.2.0.3 the default is now true. When enabled, Oracle can switch between the 2 methods:

Post/wait, traditional method for posting completion of writes to redo log.

Polling, a new method where the foreground process checks if the LGWR has completed the write.

(4) Oracle bugs; for details see

WAITEVENT: "log file sync" Reference Note (Doc ID 34592.1)

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=261518720252802&parent=DOCUMENT&sourceId=1376916.1&id=34592.1&_afrWindowMode=0&_adf.ctrl-state=17pjuvle0z_106

II: Approaches to resolving log file sync

https://support.oracle.com/epmos/faces/DocContentDisplay?_afrLoop=261526707572288&id=1376916.1&_afrWindowMode=0&_adf.ctrl-state=17pjuvle0z_155

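What information can be collected when log file sync waits occur?

(1) AWR reports.

(2) The LGWR trace file (on 12.2 and later, the LGnn trace files can be checked as well).
When a redo write takes a long time, a warning is written to the LGWR trace file.

For example, when an LGWR write takes longer than 500 ms, a warning like the following is written to the LGWR trace file.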

*** 2011-10-26 10:14:41.718
Warning: log write elapsed time 21130ms, size 1KB
(set event 10468 level 4 to disable this warning)
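To review many of these warnings at once, the trace file can be scanned with a small script. The following is a minimal sketch that assumes the warning and timestamp lines always have the format shown in the sample above, with the size reported in KB; the function name parse_slow_log_writes is made up for illustration.

```python
import re
from datetime import datetime

# Pattern for the warning line shown above; the KB unit is an assumption
# taken from the single sample in this article.
WARNING_RE = re.compile(r"Warning: log write elapsed time (\d+)ms, size (\d+)KB")
# Pattern for the "*** YYYY-MM-DD HH:MM:SS.mmm" line that precedes the warning.
TIMESTAMP_RE = re.compile(r"\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def parse_slow_log_writes(trace_text):
    """Return a list of (timestamp, elapsed_ms, size_kb), one per warning."""
    results = []
    last_timestamp = None
    for line in trace_text.splitlines():
        ts_match = TIMESTAMP_RE.match(line.strip())
        if ts_match:
            # Remember the most recent timestamp; it applies to the next warning.
            last_timestamp = datetime.strptime(
                ts_match.group(1), "%Y-%m-%d %H:%M:%S.%f"
            )
            continue
        warn_match = WARNING_RE.search(line)
        if warn_match:
            results.append(
                (last_timestamp, int(warn_match.group(1)), int(warn_match.group(2)))
            )
    return results


SAMPLE = """*** 2011-10-26 10:14:41.718
Warning: log write elapsed time 21130ms, size 1KB
(set event 10468 level 4 to disable this warning)
"""

if __name__ == "__main__":
    for timestamp, elapsed_ms, size_kb in parse_slow_log_writes(SAMPLE):
        print(timestamp, elapsed_ms, size_kb)
```

Sorting the result by elapsed_ms, or grouping it by hour, makes it easier to see whether slow writes cluster around backups, batch jobs, or other I/O-heavy windows.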

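In the AWR report, pay attention to the following:

Troubleshooting: 'Log file sync' Waits (Doc ID 1376916.1)

1: Compare the wait times of log file sync and log file parallel write. If most of the log file sync time is spent in log file parallel write, the wait comes mainly from I/O. As a rule of thumb, an average log file parallel write wait above 20 ms suggests a problem in the I/O subsystem.

(If log file sync is high but log file parallel write is low, the system may have a CPU bottleneck.)

2: Check the log switch frequency. A switch every 15 to 20 minutes is generally recommended.

3: As a rule of thumb, user calls / user commits should be at least 25. This value depends mainly on application design, for example too many small transactions or overly frequent commits.

4: The waits may be related to other wait events.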
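Checks 1 and 3 above are simple arithmetic on figures read from the AWR report, and can be sketched as follows. The 20 ms and 25 thresholds come from the text; the function name awr_redo_checks and the 0.5 cutoff for "log file parallel write is low relative to log file sync" are assumptions made for illustration, not Oracle guidance.

```python
def awr_redo_checks(avg_sync_ms, avg_write_ms, user_calls, user_commits,
                    io_limit_ms=20.0, min_calls_per_commit=25.0,
                    write_share_limit=0.5):
    """Apply the rule-of-thumb checks to averages taken from an AWR report.

    write_share_limit is a hypothetical cutoff: below it, the redo write
    explains so little of the sync time that CPU or other causes are suspected.
    """
    # Fraction of the commit wait that is explained by the redo write itself.
    write_share = avg_write_ms / avg_sync_ms if avg_sync_ms > 0 else 0.0
    # Average number of user calls issued per commit.
    if user_commits > 0:
        calls_per_commit = user_calls / user_commits
    else:
        calls_per_commit = float("inf")
    return {
        "write_share": write_share,
        "io_suspect": avg_write_ms > io_limit_ms,
        "cpu_or_other_suspect": write_share < write_share_limit,
        "calls_per_commit": calls_per_commit,
        "commits_too_frequent": calls_per_commit < min_calls_per_commit,
    }


if __name__ == "__main__":
    # Slow redo I/O combined with a commit-heavy application.
    print(awr_redo_checks(25.0, 22.0, user_calls=50000, user_commits=5000))
    # Fast redo I/O but slow commits: look at CPU or other causes.
    print(awr_redo_checks(30.0, 2.0, user_calls=90000, user_commits=1000))
```

The function is only meaningful once log file sync has already been identified as a significant wait; on a healthy system a low write share by itself does not indicate a problem.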

https://support.oracle.com/epmos/faces/DocContentDisplay?_afrLoop=241811194407622&id=34592.1&_afrWindowMode=0&_adf.ctrl-state=11enkcwiqc_53

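Reducing the wait time:

The following tuning suggestions help reduce log file sync waits:

(1) Tune LGWR write speed to get good disk throughput. For example, do not place redo log files on RAID 5 (consider RAID 0 or RAID 1+0 instead);

(2) If there are many small transactions, batch the commits where possible to reduce the number of commits;

(3) Asynchronous commit: in specific scenarios, consider the COMMIT NOWAIT option, written COMMIT WRITE NOWAIT in SQL (use with caution). The corresponding parameter is commit_wait={nowait|wait|force_wait}; the default behavior is wait;

(4) In specific scenarios, consider the NOLOGGING / UNRECOVERABLE options (use with caution);

(5) Make the redo logs large enough that log switches happen every 15 to 20 minutes;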

http://www.itpub.net/thread-1777234-1-1.html

(6) Run a stable database release to avoid bugs; see the documentation for the releases in which specific bugs are fixed;

http://www.itpub.net/thread-2090095-1-1.html

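(7) On Oracle 11.2.0.3 and later, where adaptive log file sync is enabled by default (in 11.2.0.1 and 11.2.0.2 the parameter already defaults to false), it is recommended to disable adaptive log file sync so that LGWR always runs under the post/wait mechanism and database performance does not fluctuate sharply.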

alter system set "_use_adaptive_log_file_sync"=false sid='*';

http://www.itpub.net/thread-1777234-1-1.html
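For suggestion (5), the redo log size needed to reach a 15 to 20 minute switch interval can be estimated from the current log size and the observed switch interval. This is a rough sketch that assumes a steady redo generation rate and that switches happen because the log fills up, not because of a forced or time-based switch; the function name is made up for illustration.

```python
def redo_log_size_for_interval(current_size_mb, observed_switch_minutes,
                               target_switch_minutes=20.0):
    """Estimate the redo log size (MB) that yields the target switch interval.

    Assumes redo is generated at a constant rate and that every switch is
    caused by the current log filling up.
    """
    if observed_switch_minutes <= 0:
        raise ValueError("observed_switch_minutes must be positive")
    # Redo generated per minute, inferred from how fast the current log fills.
    redo_rate_mb_per_min = current_size_mb / observed_switch_minutes
    return redo_rate_mb_per_min * target_switch_minutes


if __name__ == "__main__":
    # A 512 MB log that switches every 4 minutes fills at 128 MB per minute.
    print(redo_log_size_for_interval(512, 4, target_switch_minutes=15))
    print(redo_log_size_for_interval(512, 4, target_switch_minutes=20))
```

Because redo generation is rarely constant, the estimate is best taken from the busiest period of the day and rounded up.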

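The following summary of tuning the log file sync wait event comes from Master Lü (vage), shared here for fellow ITPUB members to study:

1. Log file sync is the time from the start of a commit to its end. Log file parallel write is the time from when LGWR starts writing the redo file to when that write finishes. It follows that log file sync includes log file parallel write. So whenever log file sync waits show up, look at log file parallel write first. If the average log file sync wait (which can also be called the commit response time) is 20 ms and log file parallel write is 19 ms, the problem is obvious: slow redo file I/O is dragging down the commit.

2. Log file sync covers more than log file parallel write. The server process starts the commit and notifies LGWR to write the redo; LGWR writes the redo and notifies the process that the commit is complete. This back-and-forth notification consumes CPU. Beyond the notifications, a commit also performs operations such as advancing the SCN. If log file sync and log file parallel write differ widely, I/O is not the problem, but CPU may be scarce, so the notifications between the process and LGWR, or other CPU-bound operations, cannot get enough CPU and are delayed.

In this case, check CPU utilization and load. If both are high, CPU is what lengthens the log file sync response time. The database usually shows side effects as well, for example Latch/Mutex contention that is worse than usual, because contention intensifies when CPU is scarce.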

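3. Log file sync and log file parallel write differ widely, but CPU utilization is not high. This case is rare and belongs among the hard problems. I/O is fast, CPU is plentiful, and the log file parallel write response time is short, yet the log file sync response time is long. This is the hardest situation to pin down. Compare the redo-related statistics (in v$sysstat) and the changes in redo-related latches as a whole.
Take the average response time of redo synch time, for example: not every redo synch time involves a commit, but every commit involves redo synch time. If redo synch time responds quickly while log file sync is slow, the problem lies in the mutual notification phase between LGWR and the process. Then there is redo entries, whose real meaning is the number of times processes write redo into the log buffer. Also watch the redo log space wait time and redo log space requests statistics and the log buffer space wait event. The size of the log buffer usually does not affect log file sync, but changes in the log buffer reveal changes in redo volume.
On how the log buffer affects log file sync:

Under the newer IMU mechanism, redo data first lives in the shared pool and is copied into the log buffer at commit time. If a wait occurs at this point, the wait event is log buffer space. Only the step from the log buffer to disk is reported as log file sync.
The older mechanism works the same way: waits before the log buffer are log buffer space, and only waits after the log buffer are log file sync.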
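The comparisons in points 1 to 3 rely on average wait times over an interval, not on totals since instance startup. A minimal sketch of that calculation follows, assuming two snapshots of the cumulative TOTAL_WAITS and TIME_WAITED_MICRO columns of v$system_event taken some minutes apart; the snapshot values below are invented so that they reproduce the 20 ms / 19 ms case from point 1.

```python
def average_wait_ms(begin, end):
    """Average wait in milliseconds between two cumulative snapshots.

    begin and end are (total_waits, time_waited_micro) tuples for one event.
    """
    waits = end[0] - begin[0]
    micros = end[1] - begin[1]
    if waits <= 0:
        # No new waits in the interval (or the counters were reset).
        return 0.0
    return micros / waits / 1000.0


if __name__ == "__main__":
    # Invented snapshot values, for illustration only.
    sync_ms = average_wait_ms((1000, 5_000_000), (3000, 45_000_000))
    write_ms = average_wait_ms((900, 4_000_000), (2900, 42_000_000))
    print("log file sync avg ms:", sync_ms)
    print("log file parallel write avg ms:", write_ms)
    print("gap ms:", sync_ms - write_ms)
```

A small gap points at redo I/O as in point 1, while a large gap moves the investigation to CPU and process notification as in points 2 and 3.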

III: Official documentation on log file sync

https://docs.oracle.com/cd/E11882_01/server.112/e40402/waitevents003.htm#REFRN00585

log file sync

When a user session commits, the session's redo information needs to be flushed to the redo logfile. The user session will post the LGWR to write the log buffer to the redo log file. When the LGWR has finished writing, it will post the user session.

Wait Time: The wait time includes the writing of the log buffer and the post.

https://docs.oracle.com/cd/E11882_01/server.112/e41573/instance_tune.htm#PFGRF94534

10.3.16 log file sync

When a user session commits (or rolls back), the session's redo information must be flushed to the redo logfile by LGWR. The server process performing the COMMIT or ROLLBACK waits under this event for the write to the redo log to complete.

Actions

If this event's waits constitute a significant wait on the system or a significant amount of time waited by a user experiencing response time issues or on a system, then examine the average time waited.

If the average time waited is low, but the number of waits are high, then the application might be committing after every INSERT, rather than batching COMMITs. Applications can reduce the wait by committing after 50 rows, rather than every row.

If the average time waited is high, then examine the session waits for the log writer and see what it is spending most of its time doing and waiting for. If the waits are because of slow I/O, then try the following:

Reduce other I/O activity on the disks containing the redo logs, or use dedicated disks.

Alternate redo logs on different disks to minimize the effect of the archiver on the log writer.

Move the redo logs to faster disks or a faster I/O subsystem (for example, switch from RAID 5 to RAID 1).

Consider using raw devices (or simulated raw devices provided by disk vendors) to speed up the writes.

Depending on the type of application, it might be possible to batch COMMITs by committing every N rows, rather than every row, so that fewer log file syncs are needed.
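The effect of committing every N rows instead of every row can be illustrated by counting commits. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for a real Oracle connection, so it shows only the commit pattern and not Oracle's redo behavior; the table demo_rows and both function names are made up for illustration.

```python
import sqlite3


def insert_with_batched_commits(conn, rows, batch_size=50):
    """Insert rows, committing once per batch_size rows; return the commit count."""
    commits = 0
    cursor = conn.cursor()
    for index, row in enumerate(rows, start=1):
        cursor.execute("INSERT INTO demo_rows (id, payload) VALUES (?, ?)", row)
        if index % batch_size == 0:
            conn.commit()
            commits += 1
    # Commit the final partial batch, if there is one.
    if len(rows) % batch_size != 0:
        conn.commit()
        commits += 1
    return commits


def demo(row_count=1000, batch_size=50):
    """Load row_count rows into an in-memory table; return (commits, rows stored)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE demo_rows (id INTEGER PRIMARY KEY, payload TEXT)")
    rows = [(i, "row %d" % i) for i in range(row_count)]
    commits = insert_with_batched_commits(conn, rows, batch_size)
    stored = conn.execute("SELECT COUNT(*) FROM demo_rows").fetchone()[0]
    conn.close()
    return commits, stored


if __name__ == "__main__":
    print(demo(1000, 1))   # one commit per row
    print(demo(1000, 50))  # one commit per 50 rows
```

On Oracle, each of those commits is a potential log file sync wait, so moving from one commit per row to one per 50 rows cuts the number of waits by the same factor, at the cost of more work being rolled back if a batch fails.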

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=193274707595560&parent=DOCUMENT&sourceId=1376916.1&id=34592.1&_afrWindowMode=0&_adf.ctrl-state=dcxruipix_268

WAITEVENT: "log file sync" Reference Note (Doc ID 34592.1)

https://support.oracle.com/epmos/faces/DocContentDisplay?_afrLoop=190824663870786&id=1376916.1&_afrWindowMode=0&_adf.ctrl-state=dcxruipix_145

Troubleshooting: 'Log file sync' Waits (Doc ID 1376916.1)

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=248505450835960&id=1064487.1&_afrWindowMode=0&_adf.ctrl-state=it4a48l6o_185

Script to Collect Log File Sync Diagnostic Information (lfsdiag.sql) (Doc ID 1064487.1)

Source: "ITPUB Blog", link: http://blog.itpub.net/29785807/viewspace-2155291/. If you repost this article, please credit the source; otherwise legal liability will be pursued.

Reposted from: http://blog.itpub.net/29785807/viewspace-2155291/