On “Thread cannot allo…”

A common problem for people who run very active systems and store their datafiles on filesystems is the error “Thread cannot allocate new log, sequence ; Checkpoint not complete.” The most commonly recommended remedies for this situation are to use either larger or more online redologs.

Unfortunately, if ‘checkpoint not complete’ is a chronic problem, neither of these solutions will eliminate it. They may postpone the error, or even reduce its frequency, but the underlying problem will not be solved.

‘Checkpoint not complete’ is a result of an instance filling and switching through all available online redologs before one checkpoint can complete. Because Oracle must always be able to recover from instance failure from the last checkpoint forward, the rules of recovery prevent an instance from reusing any online redolog that contains changes newer than the last checkpoint.

A checkpoint is the writing of all dirty blocks in the buffer cache to the datafiles as of a particular SCN. In general, if a checkpoint cannot complete in a timely fashion, it is a result of slow I/O to the datafiles. The possible solutions to this problem seek to eliminate the I/O bottleneck, or compensate for a slow I/O subsystem.

The problem with the recommendation to increase the size or number of redologs is this: if the rate of DML activity is so high that checkpoints cannot keep up, there is no reason to think that adding online redolog capacity will make the I/O subsystem any more able to keep up. It will take longer to fill all the online logs, but the basic problem of not being able to write out dirty blocks as fast as the database is changing will still be there.
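The argument above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of Oracle's internals: redo is generated at a constant rate, the database writer covers redo at a constant (slower) rate, and the function name and all numbers are hypothetical.

```python
def seconds_until_first_stall(num_logs, log_mb, redo_mb_per_s, ckpt_mb_per_s,
                              max_switches=100_000):
    """Seconds until the log writer first finds its next log not yet checkpointed.

    Toy model (an assumption, not Oracle's algorithm): redo is generated at a
    constant rate, and the database writer "retires" redo, i.e. writes out the
    dirty blocks it covers, at a constant rate starting from time zero.
    Returns None if no stall occurs within max_switches log fills.
    """
    n = num_logs
    for m in range(n + 1, max_switches + 1):
        # The writer wants to begin its m-th log fill at
        # (m - 1) * log_mb / redo_mb_per_s. That fill reuses the log written
        # during fill m - n, whose checkpoint completes at
        # (m - n) * log_mb / ckpt_mb_per_s.
        # Compare the two times by cross-multiplying; log_mb cancels out.
        if (m - n) * redo_mb_per_s > (m - 1) * ckpt_mb_per_s:
            return (m - 1) * log_mb / redo_mb_per_s
    return None


if __name__ == "__main__":
    print(seconds_until_first_stall(3, 100, 10, 8))   # 3 logs of 100 MB
    print(seconds_until_first_stall(6, 100, 10, 8))   # twice as many logs
    print(seconds_until_first_stall(3, 200, 10, 8))   # logs twice as large
    print(seconds_until_first_stall(3, 100, 8, 10))   # writes keep up with redo
```

With these made-up rates, doubling the number or the size of the logs moves the first stall from 110 seconds to 260 or 220 seconds, but it still arrives. Only the last case, where blocks are written out faster than redo is generated, never stalls.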

The first step in diagnosing ‘checkpoint not complete’ is to determine whether the problem is chronic. If the error appears in the alert log many times a day, or consistently throughout peak hours, then the problem is chronic. If the error appears only at one particular time every day or every week, or only occasionally, it is not chronic.
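As a rough illustration of that triage, the sketch below classifies a list of error timestamps that have already been extracted from the alert log. The function name, the threshold of five errors per day, and the "more than half of the observed days" rule are all assumptions made for the example. The text above gives no numeric cutoff, and a real check would also need to look at clustering around peak hours.

```python
from collections import Counter
from datetime import datetime


def classify_checkpoint_errors(error_times, days_observed, many_per_day=5):
    """Label the error pattern as 'chronic' or 'non-chronic'.

    Heuristic (hypothetical thresholds): the problem counts as chronic when
    more than half of the observed days saw at least `many_per_day` errors.
    """
    per_day = Counter(t.date() for t in error_times)
    busy_days = sum(1 for count in per_day.values() if count >= many_per_day)
    return "chronic" if 2 * busy_days > days_observed else "non-chronic"


if __name__ == "__main__":
    # One error at 02:00 every night for a week (e.g. a batch job).
    nightly = [datetime(2024, 1, d, 2, 0) for d in range(1, 8)]
    # Six errors every morning between 10:00 and 11:00 for a week.
    peak = [datetime(2024, 1, d, 10, m)
            for d in range(1, 8) for m in range(0, 60, 10)]
    print(classify_checkpoint_errors(nightly, days_observed=7))
    print(classify_checkpoint_errors(peak, days_observed=7))
```

Parsing the timestamps out of the alert log is deliberately left out, since the log format differs between Oracle versions.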

Non-chronic ‘checkpoint not complete’ probably doesn’t require any re-engineering of the system’s architecture. It is most likely the result of a single application suddenly applying a large amount of DML (inserts, updates, deletes) to the database in a short time. The best way to solve this problem is to find out whether the application can reduce its generation of redo by performing its changes ‘nologging.’ Any bulk inserts can be done using append mode unrecoverable, which generates no significant redo. Deletes that clear a whole table or a whole class of records can be converted to truncates of the table or of a partition. At the very least, the application can be modified to throttle its rate of change back to a rate that the I/O subsystem can keep up with. Even the crude solution of increasing the number or size of redologs may solve sporadic, non-chronic occurrences of ‘checkpoint not complete.’

Chronic ‘checkpoint not complete’ is a more complicated problem. It means that, overall, the rate of DML of the instance is higher than the I/O subsystem can support. In systems with chronically slow I/O, application performance will be degraded, because the buffer cache is not purged of dirty blocks fast enough or frequently enough. Such systems show relatively long time_waited for the “buffer busy waits” and “write complete waits” events in v$system_event. The solution is either to compensate for the problem by making the checkpoint more aggressive, or to solve it by making the I/O more efficient.
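One way to put a number on "relatively long" is to look at how much of the total wait time those two events account for. The sketch below assumes the rows have already been fetched with a query such as the one in `QUERY`, and that idle events have been filtered out by the caller. The function name and the sample figures are hypothetical, and the text gives no threshold at which the share becomes a concern.

```python
# Event names as they appear in v$system_event.
WRITE_PRESSURE_EVENTS = ("buffer busy waits", "write complete waits")

# Query the rows are assumed to come from; running it needs a database
# connection, which this sketch does not set up.
QUERY = "SELECT event, time_waited FROM v$system_event"


def write_pressure_share(rows):
    """Fraction of total time_waited spent on the two write-related events.

    `rows` is an iterable of (event, time_waited) pairs.
    """
    rows = list(rows)
    total = sum(waited for _, waited in rows)
    if total == 0:
        return 0.0
    pressure = sum(waited for event, waited in rows
                   if event in WRITE_PRESSURE_EVENTS)
    return pressure / total


if __name__ == "__main__":
    sample = [  # made-up figures for illustration only
        ("buffer busy waits", 300),
        ("write complete waits", 200),
        ("db file sequential read", 500),
    ]
    print(write_pressure_share(sample))
```

Comparing this share between quiet and busy periods, rather than reading it as an absolute figure, is probably the more useful way to apply it.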

To understand the solution to this problem, it is first necessary to understand something about how checkpoints work. When a periodic checkpoint is being performed, a certain portion of the database writer’s capacity, or “write batch,” is made available for the checkpoint to use. If the checkpoint can’t complete in time, it is valid to infer that Oracle is not using enough of the database writer’s write batch for the checkpoint, and that it should probably use a larger portion. Note that none of this has anything to do with the CKPT background process. Checkpoints are performed by the database writer. CKPT just relieves the log writer from updating file header SCNs when checkpoints complete.

In Oracle8, a new feature, sometimes called “incremental checkpoint” or “fast start instance recovery,” was introduced. This feature is enabled with the initialization parameter FAST_START_MTTR_TARGET in 9i (FAST_START_IO_TARGET in 8i), and completely changes the behavior of Oracle checkpointing. Instead of performing large checkpoints at periodic intervals, the database writer tries to keep the number of dirty blocks in the buffer cache low enough to guarantee rapid recovery in the event of a crash. It frequently updates the file headers to reflect the fact that there are no dirty buffers older than a particular SCN. If the number of dirty blocks starts to grow too large, a greater portion of the database writer’s write batch will be given over to writing those blocks out. Using FAST_START_MTTR_TARGET is one way to avoid ‘checkpoint not complete’ while living with a chronically slow I/O subsystem.

In Oracle7, although there is no incremental checkpoint feature, there is an “undocumented” initialization parameter that can be set to devote a larger portion of the write batch to checkpoints when they are in progress. The parameter is _DB_BLOCK_CHECKPOINT_BATCH, and to set it you need to find out the size in blocks of the write batch and the current checkpoint batch. This can be obtained from the internal memory structure x$kvii.

Another way to compensate for slow I/O is to increase the number of database writers. By dedicating more processes to writing the blocks out, it may be possible to allow checkpoints to keep up with the rate of DML activity on the database. Bear in mind that certain filesystems, such as AdvFS on Compaq Tru64 Unix, obtain no benefit from multiple database writers. Such filesystems exclusively lock a file for write when any block is written to that file. This causes multiple database writers to queue up behind each other, waiting to write blocks to a particular file. Oracle has provided notes on Metalink regarding such filesystems.

If you are more inclined to address the root cause of the problem than to compensate for it, then there are a few measures you can take. Oracle supports asynchronous I/O on most platforms, if datafiles are stored in raw or logical volumes. Conversion to raw or LVs requires significant engineering, but is much easier than totally replacing the storage hardware. Using asynchronous I/O also relieves the aforementioned file-locking bottleneck on certain types of filesystems.

I/O hardware upgrade or replacement is the most complex approach to solving the problem of slow I/O. Using RAID disk arrays allows data to be “striped” across many disks, allowing a high rate of write-out. Caching disk controllers add a battery-protected cache for fast write-out of data.
