When a database process fails the
Rdb or DBMS MONITOR
process must start a recovery process (DBR) to rollback any
uncommitted updates performed by the failed process. However, when a
process fails in a VMS environment, it also releases all of its
locks, including database locks. Without the "freeze lock"
as a control
mechanism, this could result in a "free-for-all" for
access to these updated (but uncommitted) records.
The "control mechanism" used by Rdb and DBMS to
prevent this "free-for-all" is to have the Rdb or DBMS monitor
request the "freeze lock." All other database processes
are notified of the monitor's need for the "freeze lock"
and they stop all database activity until the "freeze
lock" is released by the monitor process. The monitor process
creates a Database Recovery process (DBR) and holds the freeze lock
until the recovery processes completely recovers (rolls back) the
uncommitted transaction of the failed process. The freeze lock
is held until the recovery process complete.
In your environment, some processes have access to
a large number of database buffers. The number of buffers that a
process uses is determined by the NUMBER OF BUFFERS ARE
<number> in the SQL ALTER DATABASE command (for Rdb) or the DBO/CREATE (/MODIFY) qualifer
"/BUFFERS" (for DBMS); or the value of the logical name,
RDM$BIND_BUFFERS and DBM$BIND_BUFFERS and (if defined). When this process failed, the
database recovery process used only 20 buffers (the default)
available. The default number of buffers used by the DBR process was
insufficient to manage the large number of updates made by the
failed process – and thus caused excessive I/O during recovery.
To solve this problem, we modified your database
to significantly increase the number of buffers available to the
recovery process. This was done using the SQL statement "Alter
DATABASE...NUMBER OF BUFFERS IS n. In general, the DBR process should have more buffers
than the user that is recovering for.
In
addition, the use of FAST COMMIT will also cause longer recovery
times. With FAST COMMIT enabled, Rdb and DBMS can defer writing
committed data to the storage area files by "committing"
them only to the AIJ file. However, in the event of a process or
system failure, the recovery process must perform a "REDO"
of the committed transactions – resulting in longer freeze times.