[Back to normal view]

Q:

Periodically we encounter long delays waiting for "freeze locks." We have seen delays lasting from a few minutes to over an hour! What are "freeze locks," and why are they locking us up for so long?

 

A:

Freeze Locks are part of the recovery mechanism used by Rdb and DBMS to detect and recover for a failed process (a STOP/ID, CTRL-Y STOP, or a node failure in a multi-node environment). When a database process fails the Rdb or DBMS MONITOR process must start a recovery process (DBR) to rollback any uncommitted updates performed by the failed process. However, when a process fails in a VMS environment, it also releases all of its locks, including database locks. Without the "freeze lock" as a control mechanism, this could result in a "free-for-all" for access to these updated (but uncommitted) records.

The "control mechanism" used by Rdb and DBMS to prevent this "free-for-all" is to have the Rdb or DBMS monitor request the "freeze lock." All other database processes are notified of the monitor's need for the "freeze lock" and they stop all database activity until the "freeze lock" is released by the monitor process. The monitor process creates a Database Recovery process (DBR) and holds the freeze lock until the recovery processes completely recovers (rolls back) the uncommitted transaction of the failed process. The freeze lock is held until the recovery process complete.

In your environment, some processes have access to a large number of database buffers. The number of buffers that a process uses is determined by the NUMBER OF BUFFERS ARE <number> in the SQL ALTER DATABASE command (for Rdb) or the DBO/CREATE (/MODIFY) qualifer "/BUFFERS" (for DBMS); or the value of the logical name, RDM$BIND_BUFFERS and DBM$BIND_BUFFERS and (if defined). When this process failed, the database recovery process used only 20 buffers (the default) available. The default number of buffers used by the DBR process was insufficient to manage the large number of updates made by the failed process and thus caused excessive I/O during recovery.

To solve this problem, we modified your database to significantly increase the number of buffers available to the recovery process. This was done using the SQL statement "Alter DATABASE...NUMBER OF BUFFERS IS n. In general, the DBR process should have more buffers than the user that is recovering for.

In addition, the use of FAST COMMIT will also cause longer recovery times. With FAST COMMIT enabled, Rdb and DBMS can defer writing committed data to the storage area files by "committing" them only to the AIJ file. However, in the event of a process or system failure, the recovery process must perform a "REDO" of the committed transactions – resulting in longer freeze times.

 

Copyright © 2008 Software Concepts International
All Rights Reserved