Chapter 5. Instance Locks

Instance locks are used for inter-instance locking and communication between the instances of an Oracle parallel server database. Although instance locks are scarcely used in single-instance Oracle, I encourage all readers to browse this chapter anyway. Single-instance Oracle is really just a special case of parallel server, and there are some aspects of its operation that you will find difficult to grasp unless you understand the general case. If nothing else, read the first section of this chapter, which describes the lock manager.
5.1 The Lock Manager
The part of Oracle that manages instance locks is called the lock manager. The lock manager is a layer of functionality that affects the operation of all processes. However, its most obvious manifestations are the presence of a set of lock management processes, and an in-memory database of instance lock information in each instance.
The lock manager is said to be distributed. There is no central point of control. Each instance only maintains information about the instance locks in which it has an interest. The lock manager is also said to be integrated. This is because, prior to Oracle8, a separate product provided by the operating system vendors was required for lock management. In Oracle8, release 8.0, this functionality was incorporated into the Oracle kernel.
5.1.1 The Instance Lock Database
The lock and resource structures for instance locks reside in a dedicated area within the SGA called the instance lock database. The lock and resource arrays are dimensioned by the LM_LOCKS and LM_RESS parameters. A third parameter, LM_PROCS, dimensions the array of processes and instances that can own the locks. This array needs one slot for each local process and one slot for each remote instance.
The instance lock database also includes an array of process groups. In some cases, instance locks may be owned by a group of processes, rather than a single process. Group lock ownership allows Multi-Threaded Server sessions to migrate between shared server processes, and allows OCI transactions to be detached from one process and resumed by a different process. All lock acquisition requests can specify either process or group ownership. The group membership of processes is inferred and tracked automatically in the instance lock database. Exchanges of group-owned instance locks within the process group do not require any further lock acquisition or conversion. The size of the process groups array is set by the _LM_GROUPS parameter, which defaults to 20.
The instance lock database contains many other structures besides the resources, locks, processes, and process groups. There are hash tables for access to many of these arrays; structures for recording statistics, managing waits and timeouts, checking for deadlocks, and performing recovery; and also a large portion of memory to hold the message buffers used for inter-instance communication. The number of buffers is set by the _LM_SEND_BUFFERS parameter, which defaults to 10,000.
Most parts of the instance lock database are fixed in size from instance startup. However, Oracle has the option of allocating memory from the shared pool for additional dynamically allocated resources and locks if necessary. If this occurs, a message is written to the alert log, and the corresponding parameter should be increased prior to the next database startup, unless the overrun was due to the recovery of another instance. The GV$RESOURCE_LIMIT view contains statistics about the number of dynamic resources and locks allocated, as does GV$DLM_MISC. Note that the dynamically allocated memory is never released back into the shared pool.
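Whether these allocations are being approached or exceeded can be checked with a query along the following lines. This is a minimal sketch; the lm% resource names are an assumption based on the release 8.1 fixed views and should be verified against your release.

    select inst_id, resource_name, current_utilization, max_utilization,
           initial_allocation, limit_value
    from   gv$resource_limit
    where  resource_name like 'lm%'     -- assumed naming; verify in your release
    order by inst_id, resource_name;

If MAX_UTILIZATION is close to INITIAL_ALLOCATION, the corresponding LM_ parameter should be raised before the next startup.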

5.1.2 Lock Mastering
The instance lock database is a distributed database. No single node tracks all the locks on all the resources. For each resource there is a master node. The master node for a resource maintains complete information about all the locks on that resource. Other nodes need only maintain information about locally held locks on that resource. For dynamically allocated resources, the master node is normally the first node to take a lock on the resource. There is also a directory node for each resource, which maintains a pointer to the master node. The directory node is determined using a hash function based on the resource identification. For persistent resources, the master node is always the directory node.
Whenever the set of active instances changes due to instance startup or shutdown, or due to the failure of an instance or node, then the distribution of resources to nodes must be changed. In general, both the directory node and the master node for each resource might change, and the required information must be reconstructed at each node. If an instance has failed, then the roll forward phase of instance recovery (called cache recovery) must also be completed before all instance lock information can be validated.
The instance lock database is frozen during both resource redistribution and cache recovery, if applicable. During this time, local activity may continue, but only under the instance locks that were already held. However, if an instance has been lost, local activity is limited to read-only access to data already in memory, and for which an instance lock was already held. This is an extremely severe constraint. You should attempt to limit the time required for resource redistribution and instance recovery roll forward by using modest numbers of resources and locks, and by configuring checkpoint activity to ensure bounded recovery times.

5.1.3 Lock Handle Acquisition
Many instance locks are acquired in two steps. The first step is to obtain a lock handle, which is an identifier of the lock to be used in subsequent conversion or release operations. Some instance locks are held for the life of the instance and are never converted or released. These locks are acquired in a single step, and no lock handle is returned.
Although the lock manager has been integrated into the Oracle kernel, processes needing a lock handle do not access the instance lock database to allocate a lock handle directly. Instead, they still construct and send a message to the LMDn background process, and wait for LMDn to return a lock handle. The message identifies the resource, and sets certain options that govern both the acquisition of the lock and its subsequent management.
If the resource does not exist in the local instance lock database, then a slot is allocated from the instance lock resource table. The resource may be marked as persistent, if it is to be retained once all locks have been released. The directory node is computed from the resource identification, and the master node is marked as unknown (unless the resource is persistent). Once the resource exists in the local instance lock database, a slot in the instance lock table is allocated to the lock. Its process or group ownership is established, and deadlock detection parameters are set. The LMDn background process then constructs and sends a reply message to the client process. This message is called an acquisition asynchronous trap, or acquisition AST. The acquisition AST message includes a lock handle.
Processes waiting for LMDn to return a lock handle wait on a DFS lock handle wait. DFS stands for Distributed File System, which is an old name for Oracle's instance lock management functionality. Lock handle waits should be brief, because they are resolved entirely locally. If these waits take longer than 1 centisecond on average, then the LMDn process is overworked.
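The average wait time can be checked in GV$SYSTEM_EVENT. A minimal sketch (times are in centiseconds; the DECODE merely guards against division by zero):

    select inst_id, event, total_waits, time_waited,
           round(time_waited / decode(total_waits, 0, to_number(null), total_waits), 2) avg_cs
    from   gv$system_event
    where  event = 'DFS lock handle'
    order by inst_id;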

5.1.4 Instance Lock Acquisition
Once a lock handle has been obtained, the process needing an instance lock constructs and sends a second message to LMDn to convert the lock. This message identifies the lock handle, specifies the new lock mode required, and sets further options. If the master node for the resource is already known, this message may be sent directly to the LMDn process at the master node, depending on the setting of the _LM_DIRECT_SENDS parameter, which defaults to ALL in release 8.1 but to just LKMGR in release 8.0.
If the master node for the resource is still unknown, the local LMDn sends a message to the directory node to find out which node is the master node for this resource. If a master node has not already been assigned, the directory node assigns a master node. Depending on the resource type, the lock mastering is either assigned to all active nodes cyclically, or to the originating instance if the resource is unlikely to be used from other instances. Once the master node is known, the acquisition or conversion request can be forwarded to the master node.
If the lock information held at the master node indicates that the lock can be granted immediately, then the lock is linked into the granted queue at both the master node and locally, and a conversion AST message is returned to the client process via the LMDn process of the client instance. Otherwise, the lock is linked to the convert queue for the resource, and the client process continues to wait.
When a lock request is blocked, the LMDn process at the master node may ask the blocking lock holders to downgrade the modes of their locks on the resource, in order to allow the new conversion request to be granted. This is done by sending a blocking asynchronous trap, or BAST, to the blocking processes and instances. Whether a lock holder is able to receive BAST messages, and the level to which it may be willing to downgrade its lock, are set during lock acquisition or conversion. When a blocking process has complied with a BAST, it sends a completion AST in reply.
The GV$DLM_LOCKS view shows the details for all blocked and blocking locks in the instance lock database, including all the options set when acquiring and converting the locks. GV$DLM_ALL_LOCKS shows the same details, but for all instance locks, including those held in null mode.
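A query such as the following can be used to pick out the blocked and blocking locks. The column names shown (RESOURCE_NAME1, GRANT_LEVEL, BLOCKED, and so on) are assumptions drawn from the release 8.1 fixed view definitions and may differ in your release.

    select inst_id, resource_name1, resource_name2,
           grant_level, request_level, owner_node, blocked, blocker
    from   gv$dlm_locks                 -- column names assumed; check your release
    where  blocked = 1 or blocker = 1;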

5.1.5 LCKn Processes
Many instance locks are not obtained directly by the process requiring the lock. Instead, the LCKn processes obtain them on behalf of the entire instance. The LCKn processes operate asynchronously. That is, when they send requests to the LMDn process, they do not wait for an acquisition or conversion AST to be returned. Instead, they are available to handle further lock requests from other processes. This is why a distinction is made in the GV$SYSSTAT statistics between asynchronous and synchronous global lock gets and conversions.
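The distinction is visible directly in GV$SYSSTAT, for example:

    select inst_id, name, value
    from   gv$sysstat
    where  name in ('global lock sync gets',     'global lock async gets',
                    'global lock sync converts', 'global lock async converts')
    order by inst_id, name;

Asynchronous gets and converts are those performed by the LCKn processes on behalf of the instance; synchronous ones are performed by processes that must wait for the AST themselves.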
By default, only one LCKn process is started. This is normally sufficient, because it operates asynchronously. However, if LCK0 is very active, and if the operating system does not support priority fixing, then LCK0 may have to queue for the CPU, thereby degrading overall system performance. If so, multiple LCKn processes can be started using the _GC_LCK_PROCS parameter.

5.1.6 Lock Value Blocks
When a process acquires or converts an instance lock, it can read or write the 16-byte lock value block which is maintained in the resource structure at the master node. For example, the lock value block of the SM (SMON) instance lock resource represents the last time an SMON process ran in any instance. The lock value block facility is also used to communicate System Change Numbers (SCNs) between instances, and to establish parallel execution communication paths. However, the lock value block is not used for most resource types.
Incidentally, the resource structure for local enqueues also includes a lock value block, but it is scarcely ever used.


5.2 Global Locks
Many of the resources protected by local locks in single-instance Oracle require global exposure in a parallel server database. Whenever one of these local locks is needed, a corresponding instance lock must be held as well, to protect the resource across all instances. The instance locks used to protect local locks globally are called global locks. However, the term is sometimes used informally to refer to all instance locks in general.
5.2.1 Row Cache Instance Locks
Row cache instance locks correspond directly to local row cache enqueue locks. They do not supersede the local locks, but give them global exposure.
When a process needs a row cache instance lock, it posts the LCK0 background process to obtain the lock on behalf of the instance, and waits on a row cache lock wait. This same wait is also recorded when waiting for the corresponding local lock. LCK0 obtains the instance lock asynchronously. When LCK0 receives the acquisition or conversion AST from LMDn, it posts the waiting process.
When the local process releases its row cache enqueue lock, the dictionary row remains cached, and so the instance lock is not released but downgraded to null mode in the background by the LCK0 process. However, the row cache instance lock is released if the dictionary cache entry is flushed from the shared pool.
Although dictionary cache entries and row cache enqueue locks are dynamically allocated in the shared pool, the lock state information for the corresponding instance locks is not. That information is maintained in a fixed array which is dimensioned by the _ROW_CACHE_INSTANCE_LOCKS parameter. The size of this array limits the number of null mode instance locks cached by each instance, and thus constrains the resource usage in the instance lock database for row cache instance locks. Consider increasing the size of this array to cache a working set of instance locks if the GV$ROWCACHE view shows ongoing DLM_RELEASES without many DLM_CONFLICTS.
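The relevant columns can be examined with a query like this:

    select inst_id, parameter, gets, getmisses,
           dlm_requests, dlm_conflicts, dlm_releases
    from   gv$rowcache
    where  dlm_releases > 0
       or  dlm_conflicts > 0
    order by dlm_releases desc;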

5.2.2 Global Enqueues
Most of the resources protected by enqueue locks in single-instance Oracle have global exposure in a parallel server database. These are the global enqueues.
Global enqueue locks are taken by the foreground and background processes taking the local locks. They are not taken by the LCKn processes on behalf of the instance. The instance lock resources for global enqueues are dynamically assigned, and like the local enqueue resources, they are not persistent. Most global enqueue resource types are mastered locally, because locks on these resources are seldom needed by other instances.
The single most effective way to optimize global enqueue locking is to disable table locking. Indeed, this is strongly recommended for Oracle parallel server. The preferred way of doing this is to use the ALTER TABLE DISABLE TABLE LOCK command on all application tables, rather than setting the DML_LOCKS parameter to 0, as discussed in the previous chapter.
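For example, table locking might be disabled like this. The APP schema and ORDERS table are placeholders, and the second statement merely generates the commands for an entire schema.

    alter table app.orders disable table lock;

    -- Generate the statements for every table in a schema (APP is a placeholder):
    select 'alter table ' || owner || '.' || table_name || ' disable table lock;'
    from   dba_tables
    where  owner = 'APP';

Remember that table locks must be re-enabled (ALTER TABLE ... ENABLE TABLE LOCK) before DDL that requires a table lock can be performed against these tables.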

5.2.3 Cross-Instance Calls
One global enqueue type is worthy of particular mention because of its role in inter-instance communication. Some operations, such as changing the backup state of tablespaces, log file operations, global checkpoints, and others, need global coordination because all the instances must cooperate in performing the operation.
This communication between instances is effected using CI (cross-instance call) enqueues. For each type of operation, the background processes of each instance hold instance locks on a set of resources. By manipulating the modes of these locks, it is possible to trigger global actions and wait for them to be completed.
For example, prior to performing a direct read operation on a database segment, the reader process or parallel query coordinator uses a cross-instance call to the DBWn processes requesting a checkpoint of all dirty cached blocks belonging to that segment. The lock value block is used to communicate the database object number for the segment. Similarly, before truncating a segment, reuse block range cross-instance calls are used to ensure that dirty cached blocks within the affected range have been flushed to disk, and that clean cached blocks within the affected range have been invalidated.
Despite their name, many cross-instance calls are used, and the corresponding CI enqueues taken, in single-instance Oracle as well as in a parallel server.

5.2.4 Library Cache Instance Locks
Some library cache locks and pins also have global exposure in a parallel server database. Remember that library cache locks are held during parse calls, and that pins are held during execute calls, to prevent conflicting DDL. In parallel server databases it is necessary to prevent such conflicting DDL in all instances. To achieve this, it is sufficient to globally expose the library cache locks and pins on database objects only. The local locks and pins on dependent objects such as cursors do not need global exposure, because they are indirectly protected if all the database objects on which they depend are protected.
Remember further that local library cache locks are retained in null mode to invalidate cached library cache objects should the definition of an object on which they depend be changed. The same functionality is provided between instances, on database objects only, by the LCK0 processes holding an IV (invalidation) instance lock in shared mode on all database objects cached in the library cache. Any process that needs to invalidate an object definition globally merely takes an exclusive mode lock on the same resource, thereby causing the LCK0 processes to drop their shared lock and invalidate the object.
The use of global library cache locks, pins, and invalidation locks can be disabled using hidden parameters. This is not recommended unless DML locks have also been completely disabled.


5.3 PCM Instance Locks
Parallel cache management (PCM) instance locks do not protect cache buffers—at least not directly. They protect data structures known as lock elements. Each lock element protects a set of data blocks, not cache buffers. However, any cache buffers containing those data blocks are linked to their lock element.
Lock elements are also called global cache locks, but that term is unhelpful because they are neither locks nor resources. They are an intersection entity between PCM instance locks and cache buffers.
5.3.1 Fixed Lock Elements
Lock elements are either releasable or fixed. Releasable lock elements may be used for either hashed or fine-grained locking, but fixed lock elements are used only for hashed locking. In hashed locking, data blocks are mapped to lock elements using a hash algorithm, and a single lock element may protect any number of cached blocks at once. In fine-grained locking, lock elements are dynamically allocated to protect a single cached block at a time.
The mapping of data blocks to hashed lock elements, and whether they are fixed or releasable lock elements, is established by the GC_FILES_TO_LOCKS and GC_ROLLBACK_LOCKS parameters. The number of fixed lock elements is derived from these strings. The number of releasable lock elements used for hashed locking in these strings must be less than the number of releasable lock elements specified with the GC_RELEASABLE_LOCKS parameter, which defaults to the number of buffers in the cache. The remaining releasable lock elements are available for fine-grained locking.
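As an illustration only, releasable hashed locking might be configured in the parameter file along these lines. The file numbers, rollback segment numbers, and lock counts are invented; only the general shape of the strings is intended.

    # init.ora -- illustrative values only
    gc_files_to_locks   = "1-2=400R:3-5=600R"
    gc_rollback_locks   = "0-19=50R"
    gc_releasable_locks = 50000          # defaults to the number of buffers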
Fixed and releasable hashed locking exhibit identical performance, except in one very important respect. During instance startup, the LCKn processes must acquire null mode instance locks on all fixed lock elements. This can take many minutes. Releasable hashed locking diffuses this cost over an initial ramp-up phase of instance activity. Thereafter, the performance of these two forms of hashed locking are identical. Note that the lock handles on releasable hashed lock elements are never actually released, despite the lock elements being releasable. In view of this, you should only use releasable lock elements for hashed locking, and should not use fixed lock elements at all.

5.3.2 Hashed Locking
When a block is brought into the cache, the lock element under which it will be protected must be determined, and the buffer must be linked to that lock element. How this is done depends on whether the block uses fine-grained or hashed locking, and in the case of hashed locking, it also depends on the class of the block. Rollback segment blocks and data blocks are treated separately.
For data blocks subject to hashed locking, there is an index array that maps file numbers to lock element buckets, and a bucket header array that identifies the series of lock elements in each bucket. These arrays are visible as X$KCLFI and X$KCLFH. For rollback segment blocks, the corresponding arrays are X$KCLUI and X$KCLUH. These arrays are constructed from the GC_FILES_TO_LOCKS and GC_ROLLBACK_LOCKS parameters during instance startup.
When a rollback segment block or data block is brought into cache, these arrays are used to look up the correct lock element bucket. No latching is necessary, because the arrays are static. The lock element chosen to protect a particular block from within its bucket is determined by subtracting two from the block number, dividing by the blocking factor for the bucket and rounding down, and then taking the result modulo the number of lock elements in the bucket. The hash function subtracts two from the block number, rather than one, to allow for the file header block and to thereby ensure that if the blocking factor is chosen as an integer divisor of the extent size expressed in blocks, then lock element coverage will align to extent boundaries. Further, it is good practice to make the number of lock elements in each bucket a prime number to ensure an even distribution of blocks to lock elements, regardless of the data distribution within database segments.
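For example, assuming a blocking factor of 4 and 100 lock elements in the bucket, block 50 of a data file maps to lock element floor((50 - 2) / 4) mod 100 = 12 within its bucket, and blocks 50 through 53 all share that lock element.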
A heavier concentration of lock elements should be allocated to data files that may be subject to contention for hashed PCM instance locks. The risk of such contention is greatest on data files whose blocks are subject to change and whose blocks are accessed from multiple instances.

5.3.3 Fine-Grained Locking
Data files and rollback segments that are not assigned any hashed lock elements in GC_FILES_TO_LOCKS and GC_ROLLBACK_LOCKS, or that are explicitly given no lock elements, use fine-grained locking unless an alternative default bucket (bucket 0) has also been defined. In fine-grained locking each data block is protected by a dedicated lock element from the set of unassigned releasable lock elements. Because just one data block is protected by each lock element at any one time, fine-grained locking is also called DBA (data block address) locking. Fine-grained locking is also used for all block classes other than data blocks and rollback segment blocks. These minor block classes include segment header blocks, free list blocks, and extent map blocks.
If fine-grained locking is being used for certain data files, and if important minor class blocks such as segment header blocks are often aged out of the buffer cache, then the lock elements for those blocks may be reused before their blocks are read in again. This results in unnecessary instance lock acquisition and resource allocation overhead. In this case, to improve the retention of instance locks, you should consider reserving a number of lock elements in a separate bucket for the minor class blocks by setting the _GC_CLASS_LOCKS parameter.

5.3.4 The Lock Element Free List
When a block subject to fine-grained locking is brought into the cache, a hash table is consulted to determine whether a lock element for the block has been preserved. This is done under the protection of the KCL name table latch. If necessary, a lock element is allocated from the LRU (least recently used) end of the lock element LRU chain under the protection of the KCL freelist latch. Note that despite the name of this latch, the data structure that it protects is an LRU chain of both free and in-use (named) lock elements. The term lock element free list is just another name for the set of free lock elements at the end of the lock element LRU chain.
In most instances, it is desirable to have lock elements available on the free list at all times. The X$KCLFX fixed table contains some free list statistics. In particular, the LWM column contains the low-water mark of the free list length. This can be seen with the APT script lock_element_lwm.sql. Lock elements are returned to the MRU (most recently used) end of the LRU chain when their protected buffer is reused. Lock elements may be reclaimed from the free list if the protected block is brought back into cache before the lock element has been reused.
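The low-water mark can also be inspected directly when connected as SYS. The LWM column name is taken from the description above; the remaining columns of X$KCLFX are not assumed here.

    select lwm                           -- column name assumed from the text
    from   x$kclfx;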
In very large memory (VLM) environments, it may be desirable to have fewer lock elements available than the number of cache buffers under fine-grained locking. This is not because memory for the lock elements and instance locks is lacking, but because a very large number of instance locks would greatly extend the period of reduced availability during instance lock redistribution when necessary. In such environments, named lock elements are reused in LRU order as required. If a process has to wait while a lock element is prepared for reuse, it waits on a global cache freelist wait. The only parameter to this wait is the lock element number.

5.3.5 PCM Lock Acquisition
When a block is brought into cache, a buffer from the buffer cache LRU chain is selected for reuse, and the session allocates a buffer handle to work with the buffer. First it must unlink the buffer header from the lock element under which the previous block cached in that buffer was protected, if any. Then it must link the buffer header to the lock element for the new block. These operations are performed under the protection of the KCL lock element parent latch for that lock element.
If multiple LCKn processes have been configured, then the lock element array is partitioned between these processes, and a separate set of lock element parent latches is used for each partition. The number of latches in each set is determined by the _GC_LATCHES parameter, which defaults to two times the CPU count; this is ample.
Of course, more needs to be done than merely linking the buffer for a new block to its lock element. In particular, a PCM instance lock of the correct mode needs to be acquired on the lock element. The LCKn background processes perform this task. For fine-grained locks, they must also release the instance lock and resource previously held for that lock element.
When a foreground process needs to acquire or convert a PCM instance lock on a particular lock element, it allocates a structure called a lock context object. The lock context object is linked to the lock element, and fully describes the operation to be performed. The foreground process then posts the LCKn process and waits for the LCKn process to complete the locking operation and clean up the lock context object.
While the foreground process is waiting for LCKn to obtain a lock, it sleeps on one of the global cache lock waits. The timeout for these waits is 1 second. The parameters are as shown in Table 5.1.
Table 5.1. Wait Parameters (global cache lock waits)
Parameter Description
p1  The file number of the database block. 
p2  The block number of the database block in its file. 
p3  The lock element number, or the block class for minor class block lock acquisitions. 
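Sessions currently blocked on these waits, together with the parameters of Table 5.1, can be listed from GV$SESSION_WAIT. A sketch:

    select inst_id, sid, event,
           p1 file#, p2 block#, p3 le_or_class,
           wait_time, seconds_in_wait
    from   gv$session_wait
    where  event like 'global cache lock%'
    order by seconds_in_wait desc;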

While a process is waiting on a global cache lock wait because a blocking lock is held by a remote instance, the details can be seen in GV$DLM_LOCKS. The resource name is constructed from the lock type and the two lock identifiers. The lock type for PCM instance locks is BL (block lock). The first identifier is the lock element number for hashed locking, and the data block address for fine-grained locking. The second identifier is the block class, as shown in Table 5.2.

Table 5.2. Block Classes

5.3.6 Block Pings
If a remote instance needs a PCM instance lock in an incompatible mode with the lock held locally, then the LCKn process holding that lock will receive a BAST from the local LMDn process. If none of the blocks protected by that lock element are currently in the cache, or if they are in cache but in a compatible state, then the LCKn process can downgrade its lock mode immediately. However, if any blocks protected by the lock element are in cache in an incompatible state, then the lock cannot be downgraded until after the block states have been changed. Changing the state of a cached block in response to a BAST is called a ping.
Cached blocks can be in three possible states.[1] First, they can be current or stale. Stale copies of blocks are kept in the cache to satisfy long-running queries. Queries need to perform consistent reads. That is, blocks that have been modified since the query started need to be rolled back, so that the information returned by the query will reflect a consistent snapshot of the database at the time that the query or transaction began. Retaining stale copies of blocks in cache for consistent reads reduces the need to roll back changes for queries. Because of this, stale copies of blocks that are still in the cache are said to be in consistent read (CR) state. Note that CR is also the abbreviation for the concurrent read lock mode, which can be cause for confusion at times.
[1] Note that I am speaking here of block states. The states of the buffers containing those blocks are related, but different.
Cached blocks that are not stale are current. Current blocks can be in two states, namely clean or dirty. A current block is dirty if it has been changed and still has to be written to disk. A current block is clean if it does not contain changes that still have to be written to disk.
Pings only affect current blocks. If a remote instance requires a shared lock on a lock element, then any dirty blocks protected by that lock element locally need to be written to disk and thus cleaned. When these blocks have been cleaned, the local lock on that lock element can be downgraded to shared mode. However, if the remote instance needs to change a block, then it will request an exclusive lock on its lock element. Any dirty blocks protected by that lock element locally need to be written to disk, and clean blocks must be marked as stale; that is, they must be converted to consistent read state. The local lock on the lock element can then be downgraded to null mode.
Pings that affect dirty blocks and cause them to be written to disk are called hard pings. Pings that only affect the state of blocks, by causing them to be marked as stale, are called soft pings. Hard pings are a form of forced write. Forced writes also occur in response to checkpoint object, reuse block range, and write buffer cross-instance calls. Soft pings are a form of cache invalidation, which is forcing a block to become stale. This term reflects the fact that the cache buffer no longer contains a valid current copy of the block. Cache invalidations also occur in response to reuse block range cross-instance calls. Forced reads, as shown in GV$BH, are cases when an instance had to read a block back into cache after it was invalidated.
Under hashed locking, it is possible for multiple cached blocks to be affected by a single ping. Similarly, it is possible for blocks other than the block required by the remote instance to be affected by a ping. Pings of blocks other than the block required by the remote instance are called false pings . True pings are those in which the only block affected is the block required by the remote instance. Fine-grained locking is not subject to false pings, because only one block is protected under each lock element.
Pings are a major performance issue for parallel server databases. The fixed table GV$FILE_PING contains detailed statistics about pings that have occurred for each data file, as well as other forced writes and invalidations. This information is invaluable in pinpointing trouble spots while tuning a parallel server database to reduce pings.
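As a quick first look, the forced read and write counters in GV$BH mentioned above can be summed per file. The FORCED_READS and FORCED_WRITES columns are assumed from the release 8.1 view, and the counts cover only blocks currently in the cache.

    select file#,
           sum(forced_reads)  forced_reads,     -- columns assumed from release 8.1
           sum(forced_writes) forced_writes
    from   gv$bh
    group by file#
    order by sum(forced_writes) desc;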

5.3.7 Consistent Read Requests
Oracle uses several optimizations to reduce the number of pings and their impact. Queries only need consistent read copies of the data blocks, not necessarily the current block image. If a stale copy of the block that is more recent than the consistent read SCN for the query is available in the local cache, then that copy will be used. If the lock element is not locked in exclusive mode by another instance, then a shared mode lock is taken and the block is read from disk and rolled back as required. However, if an exclusive lock is held by another instance, Oracle must obtain a suitable read consistent copy of the block from that instance. How this is done depends on the Oracle release.
In release 8.0, Oracle first attempts to ping the block. However, if the block is very hot in the remote instance, the ping request times out after 6 seconds (or the value of the _CR_DEADTIME parameter). In this case, Oracle uses a write buffer cross-instance call to trigger the remote DBWn to write the current buffer to disk. The block can then be read from disk and rolled back as required. However, the rollback operation commonly requires several more calls for rollback segment blocks from the remote instance.
In release 8.1, Oracle uses an alternative cross-instance call to trigger the remote block server process to construct the required consistent read copy of the block and transfer it directly to the client instance. If, however, the remote instance no longer has a current copy of the block in its cache, then the client instance is given permission to read the current image of the block from disk and perform the required rollback itself. This is reflected in the global cache cr blocks received and global cache cr blocks read from disk statistics.
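The effectiveness of this mechanism can be gauged from GV$SYSSTAT, for example:

    select inst_id, name, value
    from   gv$sysstat
    where  name in ('global cache cr blocks received',
                    'global cache cr block receive time',
                    'global cache cr blocks read from disk')
    order by inst_id, name;

Dividing the receive time by the number of blocks received gives the average time, in centiseconds, to obtain a consistent read copy from a remote instance.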
Oracle plans to extend the block service feature to include transfers of current mode blocks in a later release.

5.3.8 Deferred Ping Responses
Another optimization that Oracle uses to reduce the impact of pings is to defer its response to hard pings by 10 centiseconds, or by the setting of the GC_DEFER_TIME parameter. This is often long enough to allow the active transaction to complete its current series of changes to the block, and mark them as complete within the block header, so that the remote instance will not have to check that transaction's status immediately after reading the block. Checking the status of a remote transaction is an expensive operation, because it requires a ping of the transaction's rollback segment header block, which is invariably a very hot block.
Tuning GC_DEFER_TIME is a matter of balancing the number of pings against the response time for pings, and it can be tuned conveniently because it is a dynamic parameter with ALTER SYSTEM scope. However, local operations on a lock element may be delayed unduly if pings take too long to resolve. In this case global cache lock busy waits will be observed. The timeout for this wait is one second, and the wait parameters are the same as for the other global cache lock waits.
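For example, the deferral time can be adjusted on the fly and its effect observed in the deferral statistic and the busy waits. The value 20 is purely illustrative.

    alter system set gc_defer_time = 20;   -- illustrative value

    select inst_id, name, value
    from   gv$sysstat
    where  name = 'global cache defers';

    select inst_id, event, total_waits, time_waited
    from   gv$system_event
    where  event = 'global cache lock busy';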
Another optimization that Oracle uses to reduce the impact of pings is to automatically queue a conversion request to restore its lock mode after a ping. This can be disabled in release 8.0 using the _UPCONVERT_FROM_AST parameter. Similarly, Oracle sometimes takes an exclusive lock earlier than necessary to reduce the number of lock conversions. This can be disabled using the _SAVE_ESCALATES parameter. These parameter settings should not normally be changed.

5.3.9 Workload Partitioning
Of course, the best way to reduce pings is to partition the workload so the instances use mutually exclusive sets of data. With a little imagination, and a lot of hard work, it is possible to partition most workloads satisfactorily. One approach, for example, is to use a three-tier architecture with a TP monitor and Oracle's XA libraries to direct global transaction branches to distinct instances based on the data set required.
Another approach worthy of extended consideration is to embrace distributed database technology, in preference to parallel server technology. The overheads of instance locking add significantly to application response times, even under ideal workload partitioning. Those overheads can be eliminated and replaced with more modest network latencies that affect distributed transactions only, as long as you can partition the data as well as the workload into distinct distributed databases.
A parallel server architecture should only be adopted if scalability requirements demand it, and if such complete partitioning of both data and workload into a set of distributed databases is not feasible. The performance of a parallel server database will always be mediocre by comparison with an equivalent distributed or single-instance database. Parallel server is only superior in its scalability under vast workloads.
You must realize, however, that parallel server scalability is not automatic. Careful workload partitioning is essential. Workload partitioning is the key not only to reducing pings, but also to reducing the instance lock acquisition overheads of a parallel server database—in particular, inter-instance message passing.

5.3.10 Blocking Factor
There is one way of improving parallel server scalability that is not immediately obvious, but can result in significant savings both in pings and in lock acquisition messages.
In hashed locking, although each lock element covers multiple blocks, the default blocking factor is only one block. For multiblock reads, this means that a distinct PCM instance lock must be held for each consecutive block. However, if a blocking factor equal to the multiblock read count is adopted, then no more than one PCM instance lock will be acquired for each multiblock read. For globally visible data, this reduces PCM instance lock acquisition and thus inter-instance messages. This reduction in lock acquisition also reduces pings for data that is modified. This is because all the blocks covered by the lock that are cached in an incompatible state in a remote instance will be released in a single ping operation.
Clearly, a large blocking factor is desirable for data files that contain tables that are subject to multiblock reads, particularly if they are globally visible and subject to modification. But if a very large blocking factor is used, then a large number of buffers will be linked to individual lock elements at times, introducing a risk of contention for the lock element parent latches covering those lock elements. Also, if there are hot spots within the table, a large blocking factor increases the risk of false pings. Nevertheless, a blocking factor of several times the multiblock read count is normally appropriate for such data files. A generous blocking factor is also appropriate for rollback segments.
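For instance, assuming a multiblock read count of 8, a bucket of data files might be given a blocking factor of several times that value with the !blocks qualifier. The file numbers and lock counts below are invented.

    # init.ora -- illustrative values only
    db_file_multiblock_read_count = 8
    gc_files_to_locks = "7-10=2000!32R"    # blocking factor of 32 blocks per lock element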
Indexes are much more problematic than tables and rollback segments— particularly globally visible indexes that are subject to modification. First, it is imperative that reverse key indexes be used to index monotonically increasing primary keys, lest considerable contention arise for the PCM instance lock covering the right-hand leaf block of the index. Oracle knows few forms of contention so debilitating as this slow motion game of ping-pong.
For the general case of globally visible and updated indexes, fine-grained locking is often suggested to combat the risk of false pings. Indeed, in my opinion, this is the only case in which fine-grained locking should be considered, and even then you should normally reject it in favor of heavy hashed locking.
I will concede that heavy hashed locking requires many more lock elements and instance lock resources and locks. But memory is cheap. Somewhat more telling is the complaint that heavy hashed locking, because of its retention of large numbers of instance locks, extends the period of reduced availability during instance lock redistribution when necessary. On the other hand, the database is frozen by default for the transaction recovery phase of instance recovery if any data files use fine-grained locking. This default can be changed with _FREEZE_DB_FOR_FAST_INSTANCE_RECOVERY if relatively few fine-grained locks are in use. But hashed locking is nevertheless to be preferred if lengthy transaction recovery may be required. However, the decisive argument in favor of heavy hashed locking is that the reduction in locking overhead from the retention of instance locks easily outweighs the performance impact of false pings in almost all cases.
Indexes that are subject to fast full scans also stand to benefit from an increased blocking factor. However, indexes are also more sensitive to false pings than tables, and so a more modest blocking factor is recommended.
In summary, my recommendation is that you use releasable hashed locking for all data files, with a heavier concentration of lock elements on globally visible and updated data.


5.4 Other Instance Locks
There are a number of other instance locks used for controlling certain operations in parallel server databases that have no counterpart in single-instance Oracle. For example, the SM (SMON) instance lock is used to ensure that the SMON processes of multiple instances cannot be simultaneously active. This is not necessary in single-instance Oracle. Similarly, the DR (distributed recovery) instance lock is used to ensure that only one RECO process can be active at any one time.
The DF (data file) instance locks are another group of locks that are not needed in single-instance Oracle. There is one DF resource for each data file, and each DBWn process holds a shared mode instance lock on each resource. If a data file is taken offline in one instance, then the remote DBWn processes are notified to no longer attempt to write to that data file by converting the mode of the instance lock on that resource.

5.5 Reference
This section contains a quick reference to the parameters, events, statistics, waits, and APT scripts mentioned in Chapter 5.
5.5.1 Parameters

_FREEZE_DB_FOR_FAST_INSTANCE_RECOVERY  Whether to freeze database activity during the transaction recovery phase of instance recovery. Defaults to TRUE if any data files use fine-grained locking. 
_GC_CLASS_LOCKS  The number of releasable lock elements to reserve for fine-grained locking of the minor class blocks. 
_GC_LATCHES  The number of lock element latches per partition of the lock elements fixed array. Defaults to two times the number of CPUs, which is ample. 
_GC_LCK_PROCS  The number of LCKn processes. Defaults to 1, which is normally sufficient. 
_IGNORE_FAILED_ESCALATES  Attempts to convert a PCM lock straight after a ping appear to fail because Oracle does not know which instance last modified the protected blocks. However, this merely means that the lock value block is invalid and cannot be used for SCN generation. The lock is usable in every other respect, and so the default setting of TRUE should be accepted. This parameter is not available in release 8.1. 
_KGL_MULTI_INSTANCE_INVALIDATION  This can be set to FALSE to disable global library cache invalidation locks. 
_KGL_MULTI_INSTANCE_LOCK  This can be set to FALSE to disable global library cache locks. 
_KGL_MULTI_INSTANCE_PIN  This can be set to FALSE to disable global library cache pins. 
_LM_DIRECT_SENDS  The processes that can send inter-instance messages directly. The 8.0 default of LKMGR means that all messages are sent via LMDn. The 8.1 default value is ALL. 
_LM_DLMD_PROCS  The number of LMDn processes. 
_LM_DOMAINS  The number of domain structures in the instance lock database. Domains are used for lock redistribution and recovery. Defaults to 2. 
_LM_GROUPS  The number of process group structures in the instance lock database. Defaults to 20. 
_LM_SEND_BUFFERS  The number of message buffers in the instance lock database. Defaults to 10000. 
_LM_XIDS  The number of transaction structures in the instance lock database. Defaults to 1.1 times the LM_PROCS value. 
_ROW_CACHE_BUFFER_SIZE  The size of the circular buffer in the PGA of the LCK0 process used for row cache instance locking messages. 
_ROW_CACHE_INSTANCE_LOCKS  The size of the row cache instance locks fixed array. 
_SAVE_ESCALATES  The default setting of TRUE enables early acquisition of more restrictive PCM instance locks than necessary. 
_UPCONVERT_FROM_AST  The default setting of TRUE enables the automatic reclamation of PCM instance lock modes lost due to pings. This parameter is not available in release 8.1. 
GC_DEFER_TIME  How long to defer response to ping requests. Defaults to 10 centiseconds. 
GC_FILES_TO_LOCKS  A string establishing the mapping of files to lock element buckets for hashed locking. 
GC_RELEASABLE_LOCKS  The number of releasable lock elements. 
GC_ROLLBACK_LOCKS  A string establishing the mapping of rollback segments to lock element buckets for hashed locking. 
LM_LOCKS  The number of lock structures in the instance lock database. Defaults to 12000. 
LM_PROCS  The number of process structures in the instance lock database. The default is operating system specific. 
LM_RESS  The number of resource structures in the instance lock database. Defaults to 6000. 
PARALLEL_SERVER  Virtually no memory is allocated to the instance lock structures unless this parameter is set to TRUE. 

5.5.2 Events
Event  Description 
10706  This is the trace event for instance lock operations. Level 1 lists the calls; level 5 includes the replies; and level 10 adds time stamps. Expect large trace files. 
29700  This event enables the collection of statistics in GV$DLM_CONVERT_LOCAL and GV$DLM_CONVERT_REMOTE. 

5.5.3 Statistics
Statistic  Source  Description 
cross instance CR read  GV$SYSSTAT  A block required for a query was held under an exclusive lock by another instance. After a ping request timed out, this instance made a cross-instance call for the block. This statistic no longer exists in release 8.1 due to the introduction of cache fusion. 
DBWR flush object cross instance calls  GV$SYSSTAT  Number of checkpoint object and invalidate object cross-instance calls. 
DBWR forced writes  GV$SYSSTAT  Total number of blocks written for forced writes. This statistic was named DBWR cross instance writes prior to release 8.1. 
global cache convert time  GV$SYSSTAT  PCM instance lock conversion time. 
global cache converts  GV$SYSSTAT  PCM instance lock conversions. 
global cache cr block receive time  GV$SYSSTAT  The total time for consistent read block requests to be satisfied. 
global cache cr blocks read from disk  GV$SYSSTAT  Blocks read from disk for consistent reads because the block had already aged out of the cache of the remote instance holding an exclusive PCM instance lock covering that block. 
global cache cr blocks received  GV$SYSSTAT  Consistent read blocks received from remote instances via direct transfer. 
global cache defers  GV$SYSSTAT  The number of times a ping request was deferred. 
global cache freelist waits  GV$SYSSTAT  Waits to free a lock element for reuse. 
global cache get time  GV$SYSSTAT  PCM instance lock acquisition time. 
global cache gets  GV$SYSSTAT  PCM instance lock acquisitions. 
global cache queued converts  GV$SYSSTAT  PCM instance lock conversions that had to be queued, because another instance was holding the lock in an incompatible mode. 
global lock async converts  GV$SYSSTAT  Asynchronous instance lock conversions. 
global lock async gets  GV$SYSSTAT  Asynchronous instance lock acquisitions. 
global lock convert time  GV$SYSSTAT  Total instance lock conversion time. 
global lock get time  GV$SYSSTAT  Total instance lock acquisition time. 
global lock releases  GV$SYSSTAT  Instance lock releases. 
global lock sync converts  GV$SYSSTAT  Synchronous instance lock conversions. 
global lock sync gets  GV$SYSSTAT  Synchronous instance lock acquisitions. 
instance recovery database freeze count  GV$SYSSTAT  Global freezes for the transaction recovery phase of instance recovery. 
remote instance undo block writes  GV$SYSSTAT  Forced writes of rollback segment data blocks. 
remote instance undo header writes  GV$SYSSTAT  Forced writes of rollback segment header blocks. 
remote instance undo requests  GV$SYSSTAT  Rollback segment block write requests to remote instances needed while rolling back data blocks for consistent reads. 
dlm messages sent directly  GV$DLM_MISC  The number of lock management messages sent directly to the target instance by the process needing the lock. 
dlm messages flow controlled  GV$DLM_MISC  The number of lock management messages sent indirectly via the local LMDn processes. 
dlm messages received  GV$DLM_MISC  The number of lock management messages received by the local LMDn processes. 

5.5.4 Waits
Event  Description 
DFS lock handle  Waits to obtain a lock handle for an instance lock other than a PCM instance lock. 
global cache freelist wait  Waits to free a lock element for reuse. 
global cache lock busy  This wait occurs when a PCM instance lock operation cannot proceed because the previous operation on that lock element has not yet completed. 
global cache lock open s, global cache lock open x, global cache lock null to s, global cache lock null to x, global cache lock s to x  Acquiring a PCM instance lock or converting its mode upwards. 
global cache lock open ss  Acquiring a PCM instance lock on a minor class block in release 8.0. 

5.5.5 APT Scripts

lock_element_lwm.sql  Shows the low-water mark of the lock element free list length, from X$KCLFX. 
