|
What's New in Oracle CODASYL DBMS Version 6.0 (DBMS only)
DEC DBMS 6.0 offers significantly greater performance, easier management and more features. In this article, we will review many of the new features in 6.0 and discuss how you can benefit from these enhancements. DBMS 6.0 is more than just an incremental enhancement over v5.1.
I/O EnhancementsIn a database environment, I/O activity causes most bottlenecks. While disk subsystems are seeing performance improvements of approximately 10% per year, CPU performance has been doubling about every 1-1/2 years. Without significant enhancements to database I/O subsystems, performance of database applications is limited to the performance improvements of the disk system.
Digital's Database Engineering Group has made significant improvements to the I/O subsystem of DBMS 6.0 to minimize the stall time of database processes. In many cases, DBMS continues processing while asynchronous I/Os complete-minimizing I/O stall time.
Asynchronous Batch Writes Disk I/Os are slow and can cause database processes to stall while they wait for I/O operations to complete. One approach to minimizing I/O stall time is to reduce I/Os – this is accomplished in DBMS using local or global buffers, fast commits and the new AIJ Logger Server (ALS). Another approach to minimizing I/O stall time is to continue processing while previous I/Os complete asynchronously.
Before version 4.3 of DBMS, each modified page was written to storage area files synchronously (one at a time). In version 4.3 of DBMS, all modified pages were sorted by page number and written back to the database synchronously. This allowed the batched writes to be optimized by I/O subsystems and for I/Os to occur in parallel on multiple devices. However, even with this implementation, database processes waited for all writes to complete. Thus, a database process would stall until the I/Os completed, regardless of whether it needed any of the pages for further activity.
The new Asynchronous Batch Write (ABW) feature of DBMS 6.0 eliminates this bottleneck by allowing the database process to continue unless events require "catching up" with the write in progress for a page. Thus in many environments, the database server process will continue processing additional data without waiting for prior writes to complete.
In version 6.0, ABW is enabled by default. It can be disabled by defining a logical name, but most sites will want to leave it enabled as they are likely to benefit from its use. It is also possible to tune the behavior of ABW with additional logical names. Most DBMS utilities that write to the database (such as recovery and load) will also see performance improvements as a result of this enhancement.
Asynchronous Pre-Fetch While Asynchronous Batch Writes minimize the need to wait for database writes, Asynchronous Pre-Fetch (APF) attempts to predict what page(s) will be needed in the future and reads them into your buffer pool before you need them. If successful in this predictive behavior, then APF eliminates read stall time. APF is most likely to benefit applications that perform substantial sequential processing of database areas. Many batch jobs that process all of a given record type come to mind.
Optimized Page Transfers Once a page has been read into memory, we are faced with how to avoid duplicate reads, and how to prevent forced writes to disk. A forced write occurs when a modified page in the buffer pool is "forced" to be written back to disk so that the current version of that page may be read by another process. Global buffers answer the problem of duplicate reads. If one process reads a page into the global buffer pool, other processes on the same node can access the page without incurring an I/O.
Before version 6.0, a "forced write" occurred when a modified page in the buffer pool was requested by another process. With Optimized Page Transfers (OPT), multiple processes on a single node can share and update pages without flushing the pages to disk. Thus, pages in the global buffer pool can contain committed updates from another process – without these updates being reflected on disk. In version 6.0, Optimized Page Transfers are automatically enabled if all the following conditions are met:
Optimized Page Transfers can significantly enhance the performance of update intensive multi-user applications where multiple processes share the same data (or data that is on the same page).
Main Memory DBs "Main Memory Database" support is not itself a "feature" of version 6.0. However, the combination of other features such as Global Buffers, Optimized Page Transfers, Fast Commits, AIJ Log Server and hardware considerations (such as increases in memory support, reduction in memory prices and rapid growth in CPU power) make main-memory databases a possibility in some environments. With OPT, and sufficient memory, it is conceivable that your entire database could be cached in main memory with the only I/O being to the AIJ and periodic writes to the storage area files during checkpointing.
Enhanced After-Image Journaling (AIJ)Many of the I/O enhancements to DBMS defer writes to the database storage area files – with the updates journaled to the After Image Journal (AIJ) file. This allows applications to achieve higher throughput (usually measured in Transactions Per Second (TPS)). The higher TPS rates now possible place a greater burden on AIJ processing, and in some sites this may become the bottleneck. Significant enhancements have been made in 6.0 to improve the performance and management of After Image Journals.
Circular AIJ Journaling In prior versions, a database could have at most one AIJ file. When the AIJ file became full, it would extend. If the AIJ device is full, the extend fails and database activity is halted. While DBMS 6.0 still supports single "extensible" AIJs (the default), most sites will benefit from the features of circular AIJs. With Circular AIJ Journaling, a database has multiple, fixed-length AIJ files. While DBMS writes to only one AIJ file at a time, it will automatically switch to the next accessible AIJ file when one AIJ file becomes full or inaccessible. By spreading the circular journal files across multiple disks, you benefit from automatic fail over if the "current" AIJ disk becomes unavailable.
To use circular AIJs, you must first reserve slots in the root file for AIJ journal information – an off-line activity. This can be done either during database creation or at a later point using DBO/MODIFY. Once you reserve the AIJ slots, you can add or remove AIJ journal files to the database on-line without interrupting database activity. AIJ backups also are significantly faster with circular AIJs, since you backup AIJ files that are not currently in use.
AIJ Logging Server While Digital's database engineers have significantly enhanced DBMS to reduce I/O to the storage areas, writes to the AIJ file are still required at commit time. In high transaction systems, with a large number of processes, the overhead of synchronizing AIJ access may become a bottleneck.
The optional AIJ Log Server (ALS) process is a detached process that performs all writes to the After Image Journal (AIJ) file for a database on a node. By enabling ALS, you eliminate the lock bottlenecks that can occur when multiple database processes are each writing to the AIJ file.
When enabled, there is one ALS process per database per node. Database processes post their updates to a shared global section, and the ALS process writes the updates to disk. The ALS process can be started manually through the new DBO command SERVER or the database can be configured to automatically start the ALS when the database is opened.
Electronic AIJ Cache With
ALS enabled, one magnetic disk can handle thousands of TPS (transactions
per second). Those with extremely response-time-critical high-end
applications can further reduce AIJ stall time by using solid state disks
(SSDs) and the new AIJ Cache on Electronic disk (ACE) feature to further
reduce commit stalls. With ACE enabled, a new file type (.ACE) is created on the SSD. This ACE file is used as a temporary cache for AIJ writes, and requires only 64 blocks per node. To use ACE, you must enable ALS for the database. With ACE enabled, commit stall time is reduced to only 1 millisecond! In fact, the actual writing to the AIJ file happens in parallel with database activity.
AIJ Backup Server The optional AIJ Backup Server (ABS) is a process created by the DBMS monitor. The monitor creates the ABS process when DBMS switches to the next AIJ file. When enabled, the ABS automatically performs a backup of the prior AIJ file to an on-disk backup file. The ABS process automatically stops when the AIJ backup completes.
By using the ABS, your AIJ files are automatically backed up to disk without interfering with database operations. This eliminates the need for your operations staff to perform special backup operations for the AIJ files. Unfortunately, the backup AIJ files created by the ABS cannot be optimized using DBO/OPTIMIZE (similar to no-quiet-point AIJ backups).
LockingTwo-Phase Page Locking Before version 6.0, page locking was used for mutual exclusion, and record locking was used for ensuring logical "correctness." In some environments, this results in excessive locking since both a page and a record lock is used to manage the same resource (a record). Page locking can be enabled on a per-area basis. When page locking is enabled, DBMS uses page locks for both mutual exclusion and logical correctness – eliminating the need for the overhead of record-level locking. Transactions are serialized by page access, rather than record access. This may also reduce I/O since the pages will not be written to disk as the result of a request for a common page. When enabled, page locks are held until the end of the transaction. Thus, this may result in higher contention at the page level and more deadlocks. Page level locking works best when transactions are kept short, and there are sufficient buffers to hold the pages updated during the transaction. The DBO/SHOW statistics, Record Lock Statistics (Blocking ASTs and stall times) can provide some insight into whether your environment will benefit from page locking.
Partitioned Lock Trees Prior to version 6.0, a single lock tree was created for each database. In a clustered environment, this means that when the database locks were remastered across the cluster, all locks for the database were moved. This impacted the ability to scale database performance with additional cluster nodes. In version 6.0, the lock tree is partitioned by database root structures and areas. This provides better cluster scalability, better work-load balancing across a cluster and greater local mastering of locks. Single-node environments will see no benefit from this enhancement.
Recoverable Latches The VMS Distributed Lock Manager (DLM) provides a robust set of features, and provides for management of resources across the cluster. The DLM is a highly optimized routine, but the capabilities of the DLM come at a cost – they are relatively slow. Where cluster-wide resource management is required, the DLM is still the best tool. However, not all DBMS locks require cluster-wide servicing. Those locks that are node specific, are better served by using a more efficient locking protocol – recoverable latches. One such node-specific resource that must be managed is the Global Buffer Page Table (GBPT) locks. The GBPT is accessed when a page is added to or removed from a process' allocate set and when a page is read from disk into the global buffer. Global buffers offer reduced disk I/O and better memory utilization over local buffering. However, this came at the cost of significantly more synchronization overhead for managing the GBPT slot locks. Using recoverable latches for managing the GBPT drastic-ally reduces the code path required for global buffer synchronization. With version 6.0, you achieve the benefits of global buffers without the added overhead of GBPT synchronization.
Backup/RecoveryDBMS backup and recovery processes will benefit from many of the I/O improvements made to the version 6.0 I/O subsystem. Along with these benefits, new features have been added to the backup and recovery systems that improves database availability.
Fast Incremental Backup Incremental backups need to be fast. Before version 6.0, incremental backups scanned the entire database looking for pages modified since the last full backup. As database sizes increase, the percentage of pages modified decreases, while the time required to perform incremental backups increases. One solution to this problem is not to scan every page, but to scan only those pages that are likely to have changed. To determine which pages are "likely" to have been changed, DBMS updates the SPAM pages with a time and date stamp and the transaction sequence number (TSN) of the first update that occurs within the SPAM interval after a full backup. To implement the fast incremental backups, DBMS now scans the SPAM pages. Pages within each SPAM interval are only scanned for updated pages if the SPAM page indicates that pages within its interval have been modified. No changes were necessary in the backup command syntax to implement this change. If you are not using incremental backups because they "take too long," now is a good time to try them again!
Recovery by Page DBMS is a very solid product, and corruption is rare. However, hardware, software or operational errors are possible and can result in data corruption. Traditional recovery procedures become unacceptable as databases become larger (restoring a 20 gigabyte database may have a significant impact on your operation). If only a few pages in your database are corrupted, wouldn't it be great if you could recover only those pages, without having to recover the entire database Well, in version 6.0, you can! DBMS tracks corrupted pages in a new root file structure, the Corrupt Page Table (CPT). DBMS automatically logs corrupted pages into the CPT when it detects a checksum error during reference. Pages in the CPT may be in one of three states:
Corrupt pages are pages with checksum errors. Inconsistent pages are pages that have been restored but need to be recovered from the AIJ to make them consistent. Consistent pages are pages that have been restored and whose transaction level makes it consistent with its area. To implement this functionality, a new qualifier, "/JUST_PAGES", has been added to both the RESTORE and RECOVER commands. When this qualifier is present, DBO will restore or recover only those pages listed as "corrupt" or "inconsistent" in the CPT.
DBO/SHOW STATISTICSThe DBO/SHOW statistics utility has always been a model for what should be included in a database performance monitor. DBMS 6.0 improves on this, setting yet a higher standard.
New Page Header The 6.0 statistics utility displays a common header on all statistics screens. The new page headers include useful information such as the length of the collection sample (how long has the database been opened), the sample rate (set_rate), the current page number within the screen (e.g., page 1 of 5) and the input mode (on-line, record or replay).
Lock Deadlocks and Timeouts Two new screens have been added to the statistics utility to display "Lock Timeouts" and "Lock Deadlock" histories. These screens are similar to the "Stall Messages" screens of prior versions of DBMS, except that the last occurrence of a lock timeout or deadlock is displayed for each active process on the current node. Both screens display the process id of the process that encountered the lock conflict, the resource that was involved in the lock conflict and the number of lock conflicts (deadlocks or timeouts) for that process. This information is useful for identifying resource bottle-necks in your application or database. While displaying the number of lock conflicts per process is useful, a more useful statistic would be the number of times a resource was involved in a lock conflict.
"Stall Message" Improvements Prior to version 6.0, the "Stall Messages" screen displayed all stalls, regardless of duration. In high-volume OLTP environments (those with a large number of I/Os on high speed devices), it may be difficult to differentiate "important" stall messages from "noise" stall messages caused by millisecond I/O stalls. The "Stall Messages" screen has been enhanced to include an "Alarm" option. This allows the DBA to filter out unimportant stalls by removing stall messages less than a specified duration.
Reload UtilityThe Reload Utility, a new function of the Database Restructuring Utility (DRU), allows database users to shuffle records in an area to relocate records from non-target pages to their target pages and update the relevant data structures including the meta-data, if necessary.
Conversion issuesPrior to upgrading to DBMS 6.0, you should perform a full DBO/BACKUP on all databases. Convert all databases to version 6.0 using the DBO/CONVERT command. There is no need to convert your database backup files from prior versions, since databases are automatically converted to version 6.0 when restored from a database backup file. The CONVERT command restructures the contents of the database root file to make them consistent with version 6.0. As a result, the space requirements of the root file are likely to increase (significantly, if you are converting from v4.3 or earlier). If you intend on using After Image Journaling (AIJs), you should perform a full DBO/BACKUP after converting your databases to 6.0. Without this post-conversion backup, you will not be able to recover transactions from your AIJ.
*Note: Oracle purchased the Rdb family of products, including DBMS, from Digital in 1995.
|