VAX DBMS Ported to Alpha Architecture

By Bryan Holland

Alpha is DEC's new 64-bit RISC based architecture that will run Open-VMS. Alpha is the future -- and VAX DBMS is there. Engineers within Digital's Database Systems Group have successfully ported DBMS to the ALPHA platform. Porting DBMS to the Alpha architecture shows Digital's commitment to its customers and to the future of DBMS.

Digital took the opportunity to show off DBMS running on the Alpha platform during the recent DECUS and DECWorld events.

The developers ported DBMS v4.3 to the Alpha platform. All features, program and user interfaces and on-disk structures remain unchanged from the VAX version of the product. DBMS is DBMS, regardless of what platform it is running on.

Since the on-disk structures are consistent across platforms, it will be possible to access the same VAX DBMS database from either the VAX platform or the Alpha platform.

The Porting Effort

While "VMS is VMS", porting DBMS to the Alpha platform was by no means a "recompile and run" effort. Many portions of DBMS run in elevated processor modes and make use of several low-level VMS interfaces, thus requiring modification to run in the Alpha environment.

All features, program and user interfaces and on-disk structures remain unchanged from the VAX version of the product.

One of the primary goals of the porting effort was to maintain a common code-base (source code) across platforms. To do this, it was necessary to eliminate dependencies on certain VAX architectural features -- requiring a significant amount of code to be rewritten. According to Digital engineers, over 99 percent of the DBMS code is now common to both platforms.

Porting Your DBMS Applications to Alpha

According to engineers in the Database System Group, porting customer DBMS applications from the VAX to Alpha will generally require only a recompile. Customer code generally does not make use of low-level VMS interfaces or operate in elevated processor modes. Thus, most customer ports will be very straight forward. However, to get the full benefits of the Alpha architecture, customers will have to naturally align their data structures.

Data Alignment Issues

To get the best performance from the Alpha architecture, customers will want to naturally align the data structures in their programs. Naturally aligned data structures will yield performance improvements on both the VAX and Alpha architecture's. While naturally aligning data structures in the VAX environment will improve performance, doing so in the Alpha environment significantly improves performance.

Naturally Aligned Data

The port of DBMS to the Alpha platform is proof of Digital's strong ongoing commitment to DBMS and their customers.

Naturally aligned data, is data whose lowest addressed byte resides at an address that is an even multiple of the size of the item in bytes.

Many of the Alpha compilers will handle data alignment issues, or issue warnings for unaligned data. Moving customer applications to the Alpha platform will generally be a straight forward path.

Translating VAX Images to Alpha with VEST

Digital is offering customers two migration paths to the Alpha platform from the VAX. The first, and usually the most straight forward, is the "recompile and run" port. However, for customers that do not have access to their application's source code, the VEST (Vax Environment Software Translator) may be able to translate their VAX images into Alpha images. Digital has been using the VEST translator for over a year internally. VEST will convert your VAX executables into Alpha executables. VEST is essentially a compiler that uses VAX/VMS images as its source code. There are two issues surrounding VEST translated images that customers must be aware of:

  1. Translated images will execute on the Alpha platform at roughly VAX speed -- They will not be able to take advantage of the Alpha architecture.
  2. The ability to call privileged sharable images (such as the DBCS) is questionable for VEST translated images. The jury is still out on whether VEST will provide a valid migration path for porting your VAX DBMS applications to Alpha.

In general, customers will want to "recompile and run" their applications to gain the greatest benefit from the new Alpha architecture. However, naturally aligning data will yield significant performance improvements in the Alpha environment.

The port of DBMS to the Alpha platform is proof of Digital's strong ongoing commitment to DBMS and their customers. Alpha is Digital's architecture to move them into the next century, and DBMS will be there to support your high-performance application requirements!

 



VAX DBMS Hints, Kinks
and Work-Arounds

By SCI Technical Staff

 

DBMS is a robust product with many features and parameters. Which features should be implemented, and how they should be implemented, varies from site to site. Determining the optimal configuration for your environment is not a trivial task!

This section will be reserved in future issues of DBAdvisor for your technical DBMS questions. SCI staff will provide answers to your questions, and discuss alternative solutions.

SPAM Errors reported by DBO/VERIFY/ONLINE

Q: When running DBO/VERIFY/ONLINE, we receive the following error messages:

"DBO-W-PGSPAMENT, area <str>, page <int>; the <int>% fullness value for this data page does not fall within the <int>-<int>% range found on the space mgmt. page".

However, if we run DBO/VERIFY standalone on the same database the error messages do not appear. What are we doing wrong?

A: The problem you are experiencing, is a timing problem. The DBO/VERIFY/ONLINE operation uses batch retrieval (snapshot files) to ensure a consistent view of the database during the online verify. However, the snapshot files contain only storage-area line entries (records), not the SPAM pages or SPAM entries for each page.

Since the SPAM entries are updated as records are being added, deleted, or modified during the online verification operation, the updated SPAM values are not being written to the snapshot files.

Although there are no perfect solutions to this problem, there are a couple of ways that you can eliminate or minimize the impact:

  1. Run DBO/VERIFY on a copy of the database restored from the most recent backup. This solution has the advantage that it not only verifies the internal integrity of your database, but it also validates the integrity of your DBO/BACKUP and media. The obvious down-side to this solution is the time and space required to implement it.
  2. To minimize the impact of this problem in your environment, you should schedule your DBO/VERIFY/ONLINE to run during low update periods. This makes sense for all /ONLINE operations since it will reduce the overhead of the /ONLINE process.

File Access Errors

Q: When accessing the database for CONCURRENT RETRIEVAL, we sometimes receive the following fatal error:

DBM-F-FILACCERR, error opening storage area file <str>; -SYSTEM-W-ACCONFLICT, file access conflict.

There were no other processes accessing this database when the error occurs. This problem seems to occur more frequently during the evening hours.

A: On the basis of our analysis, the problem you are experiencing is a conflict with the VMS system BACKUP. This would explain why you are seeing the problem more frequently during the evening (when your backups are scheduled).

First, you should make absolutely certain that you are not using VMS backup to backup your DBMS database (see Backup article in this issue). Use the DBO/BACKUP utility to backup your DBMS database. Next, you should ask your system manager to exclude all database backups from the VMS system backup (/EXCLUDE= *.DBS,*.SNP,*.ROO). In addition, SCI recommends that customers use the DCL COMMAND SET FILE/NOBACKUP to ensure that DBMS files are not accidentally backed (and restored) by the system backup and restore procedures. If you attempt to use VMS to restore a file that was marked "/NOBACKUP", the error will be obvious -- an empty file will be restored (VMS will backup the header of files marked with /NOBACKUP). Always use the DBO/BACKUP and DBO/RESTORE routines to backup and restore your DBMS databases!

Extraneous AREABUSY exceptions

Q: Periodically, we receive the error message, "DBM-F-AREABUSY, usage of storage area <str> conflicts with a co-user", when we attempt to ready our database for CONCURRENT RETRIEVAL -- An extensive review of our code found no cases where any of the areas were readied in an incompatible mode.

A: VAX DBMS can return the DBM-F-AREABUSY message on a READY statement even though compatible ready modes (CONCURRENT RETRIEVAL and CONCURRENT UPDATE) are being used. This situation can occur when an attempt is made to ready an area while database recovery is in progress. During database recovery, all database activity is frozen until the DBR recovery process has completed. When this situation arises, VAX DBMS will not wait for the freeze to complete if the lock request is a NOWAIT lock request. On READY statements, NOWAIT is the default for area locks requests.

There are a couple of ways to handle this situation:

1. Ready the database with WAIT (READY CONCURRENT RETRIEVAL WAIT). This will make the process issuing the ready wait for an area lock to be granted until after a database freeze has completed.

2. Modify your programs to trap for this exception, wait a few seconds and then try to READY the area again.

 

Send Your VAX DBMS related Questions to:

DBAdvisor
c/o Software Concepts International

Nashua, NH 03063, USA
Phone: (603) 879-9022
Internet Address:
sciinc@sciinc.com

Answers will be published in
VAX DBMS Hints, Kinks and Workarounds in the next issue of the DBAdvisor

 


 

QuickDisconnect

 

QuickDisconnect Enables You to Reorganize Portions of Your Database Easily and Efficiently by Removing Cross-Area-Pointers!

Software Concepts International announced plans to develop QuickDisconnect. According to Bryan Holland, SCI owner, QuickDisconnect will offer significant performance improvements over traditional 3GL/DML (Data Manipulation Language) approaches, when disconnecting all records from a set.

This is a perfect solution for any VAX DBMS site that needs to reorganize only a portion of their database. QuickDisconnect efficiently removes all cross-area-pointers broken by an initialize after you have unloaded the data in the "problem area".

Cross-Area-Pointers

Cross-Area-Pointers (CAPs) connect records in one database storage area with records in a different storage area. Most database designs include the use of CAPs, since they offer greater flexibility and significant performance benefits over single area database designs. However, if you initialize a storage area with CAPs, the set pointers in the remaining area no longer point to a valid DBMS record -- your database is corrupt.

The Reload Dilemma

If you ever tried to reorganize a single area in your database, you have no doubt been frustrated with having to deal with cross-area-pointers. If the schema definition permits, you can use the DML verb, DISCONNECT, to remove these set relationships before initializing the "problem area" . However, if the set is defined as FIXED or MANDATORY, you are forced to reload and erase additional records that are not stored within the "problem area" -- possibly forcing you to reload the entire database.

 

Traditional Methods Are Inefficient

Using a traditional 3GL/DML approach to this problem is grossly inefficient. A DML DISCONNECT or ERASE verb must update at least three records for every disconnect performed...and every member occurrence must be disconnected before you can initialize the problem area! If you consider that each disconnect may result in multiple I/Os to the database, you can see why the DML DISCONNECT or ERASE option is very expensive.

 

The QuickDisconnect Solution

QuickDisconnect is a specialized program, designed to efficiently remove all record occurrences from a set. QuickDisconnect does this by directly removing the pointer clusters and index nodes of the set-type from all appropriate records in the database in a single pass through the area. Database pages are read only once and the affected records are updated only once.

QuickDisconnect further improves performance by reading large portions of the database into its buffers with each I/O, thus reducing the number of I/Os to an absolute minimum.

Disconnects Fixed and Mandatory Sets!

QuickDisconnect disconnects sets by directly removing the pointer clusters of the appropriate set-types -- regardless of SCHEMA retention modes. Thus, even Fixed or Mandatory sets can be disconnected with QuickDisconnect! After unloading the data in the problem area and initializing the area the CAPS are efficiently removed by QuickDisconnect. The set-connections are reestablished using the DML CONNECT verb -- even for set-types defined with an AUTOMATIC insertion mode and a FIXED or MANDATORY retention mode.

Improves Availability

QuickDisconnect allows you to partition your database reorganizations into more manageable pieces. Being able to reorganize individual areas of your database means less time reorganizing your database...and that improves the availability of your database!

QuickDisconnect is the ideal solution for any situation that requires all members of a set to be disconnected from their owners.

 

 

Licensing

QuickDisconnect will be licensed on a per-node basis. Priced at only $9,995 for the distribution license, and $4,995 for the right-to-copy license regardless of node size.

Limited time, Special Offer!

SCI is offering a 25 percent discount for customers who sign predevelopment POs. This offer provides customers with significant savings with no risk.

*QuickDisconnect is a trademark of
Software Concepts International

*VAX DBMS is a trademark of
Digital Equipment Corporation

Call Software Concepts International at (603) 879-9022 to order QuickDisconnect or ReloadExpert!

FOR MORE INFORMATION

New England
Software Concepts

Nashua, NH 03063
(603) 879-9022

 


 

Implementing SPAMS for Performance

 

By SCI Technical Staff

Space Area Management (SPAM) is a technique used by VAX DBMS to improve the storage performance of nearly full storage areas. The objective of SPAM technology is to allow the DBCS (Database Control System) to quickly locate a page with sufficient free space to store new records in your database. SPAM pages maintain an inventory of available space on database storage area pages. They enable databases with SPAM pages to spend less time searching for free in your database during STORE operations. Without SPAM technology, the DBCS would have to search each page in the storage area until it finds a page with sufficient free space to store the new record.

This article examines the inner workings of SPAM technology. By understanding the purpose of SPAMs and how they work, you can make better decisions when implementing SPAMs in your database design. You will also learn why decreasing SPAM thresholds to a tiny number in versions prior to V4.3 didn't result in a decreased level of record fragmentation.

The optimal SPAM values for your database storage areas depend on the size of the records stored in the area, the desired record placement strategy, the level of database contention in your database, as well as the "typical" order in which records are stored on a database page. For example, storing a large record followed by several smaller records requires different strategy than if the small records are stored first, followed by the large records.

SPAM pages maintain a bit-mapped inventory of the free space available on the storage pages in your database. Two (2) bits are used to track the free space of each storage page in your database.

SPAM
Bits
DecimalThreshold Default
0 00 00
0 11 170
1 02 285
1 13 395

 

Without SPAM technology, the DBCS would have to search each page in the storage area until it finds a page with sufficient free space to store the new record.

Thus, the free space on each database page is represented by a value between zero (0) and three (3), inclusive. These values correspond to the threshold values you set when you create the database (or when you restore the database using DBO/RESTORE/THRESHOLD). The SPAM entry for a page is updated whenever the amount of data on that page crosses a threshold boundary.

It is important to note, that while the DBCS uses the SPAM pages to locate pages with sufficient free space, the threshold values of the SPAM pages are based on used space.

 

The SPAM algorithm
(pre v4.3):

The DBCS first checks the target page for free space, regardless of SPAM thresholds. The target page is the page in the database that the DBCS wants to store the new record based on the PLACEMENT clause in the storage schema. If there is not enough space available on the target page for the new record, then the DBCS checks the other pages in the current database buffer for a page with sufficient free space to store the new record. The current database buffer is the buffer that contains the target page (as well as additional database pages based on the buffer length). The DBCS only checks the SPAM pages to look for free space when none of the pages in the current buffer contain sufficient free space to store the new record. Thus, with versions of DBMS prior to version 4.3, it is possible to exceed the third threshold (T3) when storing new records. That is, there was no way to define the maximum fullness of the database pages for storing new records. This could cause excessive fullness of the database pages, resulting in severe record fragmentation when the DBCS adds set pointer clusters to records during a CONNECT operation...regardless of SPAM values!

 

The SPAM algorithm
(v4.3):

The SPAM algorithm has been enhanced slightly for version 4.3 of DBMS. While earlier releases of DBMS would store a new record on any page in the buffer containing the target page regardless of SPAM values, version 4.3 will not store a record on a page that is already at the T3 threshold.

Sites that calculated SPAM thresholds based on the old (pre v4.3) algorithm should adjust their SPAM thresholds to reflect the new SPAM algorithm introduced with version 4.3.

While this change offers some improvement over the prior SPAM algorithm, what database designers really need is the ability to specify how much free space to leave on a page. Thus, by knowing how many records will be stored on a page and how much each record is likely to grow (due to set insertions and dynamic data items), it would be possible to reserve this space on the page for future record expansion.

 

SPAM parameters

There are two database parameters that affect Space Management; the threshold values, and the interval. Selecting optimal values for SPAM parameters can dramatically improve database performance -- Using the default values are guaranteed to be a poor choice.

The threshold values determine the percent fullness at which the SPAM entries are updated, while the interval determines how many database storage pages are managed by each SPAM page.

 

SPAM Interval

Selecting optimal values for SPAM parameters can dramatically improve
database performance
Using the default
values are guaranteed to be a poor choice.

The /INTERVAL qualifier determines how many data pages are managed by each SPAM page in a database storage area.

 Interval  Interval
S d d d ... d d d S d d d ... d d d
P a a a ... a a a P a a a ... a a a
A t t t ... t t t A t t t ... t t t
M a a a... a a a M a a a ... a a a
SPAM Interval Diagram                

Increasing the SPAM interval may reduce I/O, but may cause additional contention for SPAM pages in a high contention update environment.

The default interval for SPAM pages is 256. This means that, by default, 256 database storage pages are managed by each SPAM page. This is also the minimum value that you can specify for the SPAM interval. The maximum SPAM interval value is determined by the following equation:

MAXIMUM_SPAM_INTERVAL = ((blocks-per-page * 512) - 22) * 4

The value of 512 is the number of bytes per block. The value of 22 is the number of bytes in the SPAM page header, and the value of 4 is the number of SPAM entries (two bits) per byte (eight bits).

In general, the larger your storage area, the larger the SPAM interval value should be. This methodology optimizes SPAM effectiveness by reducing the free space search time, while balancing the likelihood of SPAM page conflict. Low contention environments should use the maximum interval value, while high-contention update environments should lower their interval values.

The Optimal SPAM Values can significantly improve database performance! Determining what the "optimal" values are is a difficult task!

Software Concepts International staff is the recognized authority in VAX DBMS technical issues.

Call (603) 879-9022 for
VAX DBMS Technical Consulting!

The trade-off when setting SPAM intervals is data page I/O (searching for free space in which to store new records) versus SPAM page conflicts (which results in additional synchronization and I/O to the SPAM page). DBMS provides statistics to help you measure the performance impact of both sides of the SPAM interval equation.

Interval Statistics

The best measure of SPAM page contention can be seen on the PIO statistics page of DBO STATISTICS. The statistic, "unmark buffer -- SPAM page", measures how often SPAM pages are being written back to the database (SPAM page write I/Os). However, this statistic is not a direct measure of SPAM page contention, since SPAM pages are written back to the database due COMMIT/ROLLBACK, buffer overflows as well as contention.

PIO Statistics

To make sense of the "unmark buffer -- SPAM page" statistic, compare this statistic with the other statistics in the "unmark buffer" group.

If the "unmark buffer -- SPAM page" statistic is significantly higher than the "unmark buffer -- transaction", this is good indication of SPAM page conflicts.

Another measure of SPAM page contention can be found by comparing the "unmark buffer -- SPAM page" with the "unmark buffer -- lock conflict" statistic. However, use caution when making this comparison -- the "unmark-buffer -- lock conflict" statistic includes all page buffers -- not just the SPAM pages.

SPAM page contention also shows up in the page locking statistics. Increased SPAM contention would cause an increase in blocking ASTs for page locks. However, since the page locks also manage the entire buffer pool, it is difficult (if not impossible) to determine the severity or cost of SPAM page conflicts from this statistic.

SPAM Thresholds

You can specify up to three threshold values for the /THRESHOLD qualifier. It is possible to set two or more of the threshold values to the same number, thus effectively eliminating one of the threshold levels. If you have only one or two record-types of different lengths in your storage area, use only one or two threshold values to manage space in that storage area. Reducing the number of thresholds may reduce the frequency in which the SPAM pages are updated, thus reducing SPAM page conflicts. By default, DBMS sets three thresholds, with the values 70, 85, and 95 percent.

SPAM threshold values are based on the percentage of used space on a database page. However, to effectively plan your threshold values, you must think of threshold values in terms of how much free space is guaranteed to exist on a database page for each threshold value. The following table helps to illustrate this concept:


Threshold
SPAM
Value
Guaranteed Percent Free
00 30%
170 15%
285 5%
395 0%

Thus, if you attempt to store a record that occupies 65% of the database page, the DBCS will check every page that is at the T0 threshold. The DBCS will not waste time checking those pages whose threshold values exceed the T0 level, since they do not guarantee sufficient free space for the new record. Note, that in this example, it is still possible that the DBCS will physically examine several database pages since the lowest threshold value (70) guarantees only 30% free space, while 65 percent is required to store this new record. Thus, if most or all of the pages in the storage area are above 35% fullness, the DBCS will yield little or no benefit from the SPAM thresholds. An approximation of the efficiency of this threshold value can be measured by using the following equation:

SPAM_efficiency =

SPAM_threshold - (100 - pct_needed)
SPAM_threshold - SPAM _ threshold (-1)

= 70 - (100 - 65) = 50%
70 - 0

In this example, only 50 percent of the pages at the T0 threshold will have sufficient space to store the new record!

This equation is only approximate, since it assumes a normal distribution of page percentage fullness. In practice, the actual distribution page fullness tends not to follow a normal distribution.

The objective when selecting thresholds, is to choose values that will minimize the amount of physical I/O necessary to store new records in the database.

The objective when selecting thresholds, is to choose values that will minimize the amount of physical I/O necessary to store new records in the database. Since the record size, placement strategies, and update activity is likely to vary from area to area you must calculate and use different threshold values for each area in your database.

SPAM Algorithms

To determine the optimal threshold values, we will need to understand more about the free space search algorithm used by VAX DBMS.

MAX_FREE_LEN...

The actual amount of space that SPAMS are based on is called the "maximum free length". This value represents the amount of space available on a database page for storing user records (page size less page overhead). The calculation for the max_free_len is different for areas with snapshots and those areas without snapshots.

Pages with snapshots:

MAX_FREE_LEN =
(blocks-per-page * 512)
- (24 + 4 + 4 + 18 + 10)
or
(blocks-per-page * 512) - 60

Pages without snapshots:

MAX_FREE_LEN =
(blocks-per-page * 512)
- (24 + 4 + 10)
or
= (blocks-per-page * 512) - 38

The value of 512 is the number of bytes per block. The value of 24 is the number of bytes in the database page header. The first value of 4 is the number of bytes per line-index entry, the second value of 4 (pages with snapshots only) is the number of bytes per TSN (Transaction Sequence Numbers are used by snapshot transactions) index entry. The value of 18 (pages with snapshots only) is the number of bytes in the page footer. The value of 10 is the minimum number of bytes reserved for the SYSTEM record on each page.

Thus, any threshold value greater than 90% (such as the default T3 value of 95%) is useless!

SPAM thresholds are percentages of MAX_FREE_LEN, and are always rounded down to the next even byte. Thus, the MAX_FREE_LEN of a storage area with snapshots and a page size of 2 blocks per page is (1024 - 60) or 964 bytes. Using the default thresholds of 70, 85, and 95, the actual threshold byte-values are 674, 818, and 914 respectively. For an existing database, you can confirm these values by issuing the following command:

$ DBO/DUMP root-file-spec/OPT=DEBUG

For each storage area, look for the following symbols:

SPAM_T1 = 1st SPAM threshold in bytes

SPAM_T2 = 2nd SPAM threshold in bytes

SPAM_T3 = 3rd SPAM threshold in bytes

Determining required space...

Next, we need to understand how much space the DBCS will look for when storing a new record. The DBCS uses the following equation to determine the required space needed to store the new record:

REQUIRED_LEN =
MAX(record-len + 5, 10)
+ PAGE_PAD_LEN

The value of record-len is the size of the static data portion of the record, as defined by the storage schema. It is easy to determine this value for each of the record-types in your database by using DBO/DUMP/STORAGE/OPTION = DEBUG and searching for the value of the "Static data portion length" of each record-type.

PAGE_PAD_LEN...

The DBCS actually reserves a portion of each database page (based on the PAGE_PAD_LEN) for future record growth on the page. The value of PAGE_PAD_LEN is currently calculated as 10 percent of the physical page size (blocks-per-page * 512) * 0.10 -- Thus, any threshold value greater than 90% (such as the default T3 value of 95%) is useless!

The value of PAGE_PAD_LEN for each area can be verified by dumping the database root file as described above. Search for the value of PAGE_PAD_LEN for each storage area in your database.

As stated previously, the value used by the DBCS for the record length, is the length of the static data portion of the record. This value can be verified using DBO/DUMP/STORAGE/OPTION = DEBUG and looking for the value of the "Static data portion length" for each record-type in your database. This value does not include the length of known (automatic) set pointer clusters, or the length of dynamic storage items. In other words, the DBCS does not use the actual record-length of the storage segment -- the value used by the DBCS will always be less than the actual record length.

When looking at the SPAM pages, the DBCS looks for the largest SPAM threshold (in terms of bytes) that is less than or equal to MAX_FREE_LEN - REQUIRED_LEN.

Pages at the T3 threshold are never searched, since they cannot guarantee any free space. Pages at the T0 level are always searched, since they guarantee the "most" free space.

The following SPAM Selection Example Table illustrates what threshold value DBCS will search for using the default thresholds for a database area with snapshots and a page size of two blocks per page, and assuming a page pad length of ten percent (102 bytes):

Threshold Statistics

The objective when setting threshold values, is to minimize the number of pages the DBCS searches when storing new data. DBMS provides several statistics that can help you measure the effectiveness of your SPAM thresholds.

Record Statistics

The Record Statistic, Pages Checked, is the single most useful statistic for determining the overall effectiveness of your SPAM thresholds. This statistic shows the number of data pages that were checked before the DBCS found a page with sufficient space to store the record. By comparing the Pages Checked statistic, with the Records Stored statistic, you can get a very accurate picture of the effectiveness of your SPAM thresholds. In a well-tuned database, Pages Checked should be no more than 2 times Records Stored.

PIO Statistics

The PIO Statistic, "SPAM Page Request", provides one measure of the effectiveness of your SPAM strategy. A high number of SPAM page requests relative to the database activity may indicate that your SPAM interval is too low, and that the DBCS is having a difficult time locating free space for new records. This number will also increase when the storage area becomes over utilized.

SPAM Selection Example
Threshold
Level
Max Free Bytes Max Free Len Guar
Free
Bytes
Page
Pad
Len
Max Record Size

When Threshold is used
T3 914 964 0 102 = -107Never Searched!
T2 818 964 50 102 = -57Never Searched!
T1 674 964 146 102 = 39Searched only if the record-len <= 39
T0 0 964 290 102 = 183Always searched
GUARANTEED_FREE_BYTES = Max_Free_Len - Max_Free_Bytes(T-1)

PAGE_PAD_LEN = 10% of Page size in bytes

MAX_RECORD_SIZE = Guaranteed_Free_Bytes - Page_Pad_Len - 5

 

 

Specialized VAX DBMS training Seminars
from the leading VAX DBMS support specialists!

Call Software Concepts International today
for seminar registration and information!
(603) 879-9022

 


 

 

Is Your Database Backup
Strategy Really Effective?

(Can You Recover Your Database When Needed?)

By Digital Equipment Corporation's
Database Engineering and Documentation Group

 

VAX DBMS provides backup and restore utilities that offer many different backup options. A well-constructed database backup and recovery strategy will include a combination of these functions selected to match the needs of your particular environment. There is no single backup strategy that works best for all configurations. The wrong strategy could even cost more than it is worth.

There is no single backup strategy that works best for all configurations. The wrong strategy could even cost more than it is worth.

A proper database backup strategy should strive to achieve a balance between minimizing operational overhead and maximizing database integrity. Over the life of a database, you will perform many more backups than restores. Since you cannot predict when a disk device will fail, or when a database will become corrupt, you must continually backup your data as it changes. This article contains suggestions and recommendations on making frequent database backup operations easier and infrequent restore operations more reliable.

Rules of a Strategy

A database backup strategy should be customized for your environment. It should take into consideration the system workload, usage schedule, importance of data, and hardware environment of the database. However, there are some guidelines that apply to all databases. Incorporating these guidelines into your strategy will ensure more reliable and more cost effective backups.

Digital strongly recommends the following guidelines:

Always use after-image (AIJ) files

Digital strongly recommends the use of after-image journaling. As data modifications are made to your database, an .AIJ file captures the operations each transaction makes. This provides exceptional recovery capabilities because you will be able to restore your database to the same state it was in before the corruption or hardware failure occurred. Considering the balance between cost effectiveness and data reliability, .AIJ files typically cost 5-10% (or less) in overhead but provide essentially a continuous backup of your transactions.

You should not enable and disable after-image journaling at different times. To be effective, journaling must be used consistently. Leaving "holes" in your journal file defeats the purpose of this option. An .AIJ file should always be on a disk separate from all associated database files. If this disk is lost, you should restart the journaling and then immediately perform a full database backup. If the disk with the root file or storage area files is lost, you restore the lost files from backup and then use the .AIJ file to recover the most recent transactions. If either the root file and .AIJ files, or the storage area and .AIJ files were on the same disk, you will lose data if the disk failed.

NOTE:
After enabling after-image journaling, you should immediately perform a full database backup.
Failure to do so may result in a non-recoverable database.

The database reference manuals contain additional information on backing up and recovering .AIJ files.

Always maintain a backup schedule and log

A well-designed database backup and recovery strategy is useless if the strategy is not publicized and practiced. Operational personnel need to be aware of the importance of the backup methodology, and they need to be able to quickly access backup media in the event database recovery is necessary.

Accurate record-keeping and media storage are essential to a proper backup and recovery strategy. The backup schedule should include dates and times, type of backup and tape labels.

Occasionally store backups in an offsite location

Backups are performed to facilitate database restoration, if your primary media fails, once the problem is corrected. There are many possible sources of failure of your primary media, and some of these sources must be considered in your backup strategy. Also, always remember that the data in your database is much more valuable than the physical media it is stored on.

If a site catastrophe destroys your disks, would your backup media still be safe? While it may be convenient, leaving your backup tapes on top of the tape drive is not a good idea. You should store you backup media in a separate location -- whether offsite or just another area of the building. Although offsite storage is the safest method, access to an offsite archive will take longer than one on-site, and there may be security issues about removing confidential data from your site. A compromise might be to send backups offsite only once a month. Traditionally, you send the previous month's or week's backup offsite, keeping the most recent backup in a safe location on-site.

Magnetic tapes are an inexpensive, reliable, and reusable backup media. The environmental parameters for tapes are often broad and sometimes can lead to abuse. The audio tapes stored in your car can develop some hiss, and the video tapes in your home can drop out some colors, but if your backup tapes are damaged in this way, they are useless. You must follow the manufacturer's instructions on temperature, humidity, magnetic fields, and tape life span. A specially constructed tape library or archive is the safest storage strategy.

NOTE:

Tapes age and become less reliable over time. A good backup strategy precludes your ever needing to gamble on restoring a set of five-year-old tapes.

Do not use the VMS Backup Utility on a database

In the earliest versions of VAX DBMS, the VMS Backup Utility was a legitimate alternative to the database Backup and Restore Utilities. Today, there are many reasons not to use the VMS backup utility. For example:

In the earliest versions of VAX DBMS, the VMS Backup Utility was a legitimate alternative to the database Backup and Restore Utilities.

Today, there are many reasons
not to use the VMS backup utility.

The multi-threaded database backup utility is designed specifically for VAX DBMS and Rdb/VMS databases. Digital recommends that you always use the multi-threaded backup capabilities of the product when backing up a database to tape because the utility:

The ability of the database backup utility to write to multiple tape drives simultaneously depends on the degree of concurrency (i.e., the number of simultaneous writes) the number of drives, TCUs (controllers), HSC STI ports and how they have been cabled together.

Simultaneous writing to multiple tape derives also depends on specific versions of VAX DBMS and Rdb/VMS. The following table identifies what capabilities exist in the various versions of the database products.

DBMSRdb Tape Drive Capability
V4.3V4.1 Will give you the option of overriding the auto-configuration done by the backup utility.
V4.2V4.0 Will treat all drives that appear on the same HSC as not capable of simultaneous writes.
V4.1V3.1 Does not explicitly support TA9x drives, although the drives may work to some degree

Never use VMS file operations, such as COPY, RENAME and DELETE on database files.

Multi-threaded backup uses the same error recovery and group redundancy as the VMS Backup utility. Considering all factors in the database environment, it is significantly more reliable than VMS Backup.

Never use VMS file operations such as COPY, RENAME and DELETE on database files.

Always spread your database across several disks

A backup strategy should provide guidelines on backups and restores of your database. In addition, your overall database strategy should include safeguards for protecting your data from becoming corrupt and needing a restoration. While you cannot predict hardware failures, you can minimize their effects.

By spreading your database across several disks, you can eliminate having a single point of failure. Even if your database is not large enough to require multiple disks, Digital strongly recommends putting the root file on a disk separate from the storage are disks, and you should always put the .AIJ files on a separate disk from the rest of the database. In addition, balancing busy storage areas between different disks is often a good tuning suggestion, but it also makes sense from a safety point of view.

The VAX DBMS database reference manuals provide additional information on backing up individual storage areas of a database.

Whether you backup you database to tapes or to disk, your backup schedule could overlap the routine VMS backups of your system (especially if your database files share their disks with other users.) Because VMS backups often run at a high priority, you should consider waiting until they are finished before starting a database backup. There will be less chance of mixing up tapes, less contention for the disks, and, if your system has multiple tape drives, you can allocate all of them to a multi-threaded database backup.

Always consider your existing hardware configuration and system workload

The decision to backup your database either to tape or to disk will be determined primarily by your hardware environment and by personal preference.

There are a few suggestions to consider:

Determine the volatility and value of your data

The volatility of your database refers to how often data changes. If you can determine what data is changing, and how often, you can create a strategy based on backing up the most frequently changed storage areas more often. For example, the data in an inventory area could be updated several times an hour, an employee field could be updated at the end of every shift, and a list of suppliers might only change every few months. Determining and mapping this workload will help determine the needs of your site.

The time required to complete incremental backups, and especially storage area incremental backups, is a good indication of changes to a database. Consider that a percentage of data in a database changes daily. Calculate this percentage and perform a full backup when 10% of a database has changed. While it is difficult to calculate this figure, it is much simpler to observe that when incremental backups take longer to complete each day, a greater percentage of a database has been changed since the last full backup. In fact, because of the extra calculations involved, an incremental backup could take longer than a full backup. A large incremental restore can take much longer than a full restore. Through observation, you should be able to determine the break-even point for performing a full backup rather than an incremental backup.

For backup to tape, the
multi-threaded backup is optimized to proceed at
whatever speed the hardware configuration will allow.
Allocating more tape drives
decreases backup times proportionally.

In addition to the frequency of backups, a backup strategy should also consider the differing daily usage of a database. For example, a typical backup schedule might include incremental backups every night and a full backup on Friday night. However, if an examination of your usage shows that most changes are made on Wednesdays, they you should consider moving the full backup to Wednesday night. Similarly, if your database is accessed by users in multiple time zones, you might consider starting backups an hour or two later when the activity level has quieted down. Scheduling backups around user activities causes less impact on the users and results in more efficient backups.

Verify the integrity of your database before each full backup

A backup strategy is useless if the database being backed up is corrupt to begin with. VAX DBMS contains online utilities to verify the correctness of a database; these tools should be used prior to starting the database backup operation. No database backup operation should be allowed to proceed if the database has not been verified as consistent in all respects.

Verify the integrity of your backup media after each backup

One of the most common mistakes made is not verifying the integrity of the database backup media; at least, not until the backup is needed to be restored and then it is too late.

After a database is backed up, the backup media should be verified. Verification ensures:

There are several methods available to verify the integrity of the database backup media. However, only one method is guaranteed to accurately reflect the integrity of the backup media; The actual restoration or recovery of a database. Anything else is merely verifying the media, not the contents of the media.

NOTE

Never, ever, test the database restoration procedures by first deleting the source database. Restore or recover the database to a distinct file system, preferably on a separate machine.

Speeding Up Backups

For backup to tape, the multi-threaded backup is optimized to proceed at whatever speed the hardware configuration will allow. Allocating more tape drives decreases backup times proportionally. You can also save some operator time by formatting all of your tapes ahead of time.

For backing up a database directly to disk, use the single-threaded backup utility, and set up RMS to have a large extend size. This reduces both fragmentation and the need to repeatedly expand the backup file. For example, RMS can allocate 50,000 blocks much faster than it can allocate 5000 blocks 100 times. This extend time becomes significant when compared to the amount of time for a simple write operation. You want to minimize the number of times the file extends while not setting it too large for your available disk space. If you can accurately calculate the final size of the resulting backup file, you can use the SET RMS_DEFAULT command in DCL to optimize the default extend size.

For example, assume that a backup file is 1,005 blocks. You cannot directly specify the initial size of the file, but it is typically quite small (typically 3 blocks in most VMS systems). If you set the default extend to 1,000 blocks, the backup utility starts writing, gets to the end-of-file (after 3 blocks) and extends the file to 1,003 blocks. The backup utility again gets to the end-of-file and expands the 1,003 block file by 1,000 blocks, giving 2,003 blocks. When the database backup finishes, the file is truncated down to 1,005 blocks, but while the file is being written, 998 wasted blocks are allocated.

It is sometimes possible to estimate the backup file size by analyzing your database to determine how full each storage area is. For example, use the space utilization summary pages generated by VAX DBMS DBO/ANALYZE/NOSET command.

NOTE

This method does not always provide a valid estimate. For example, consider what happens if you have a storage area with 100 pages, and there is one record on each page: Every page will be backed up. If you take that same area and put all 100 records on one page, only that one page will be backed up. The storage area is just as full in both cases.

A simpler, and more reliable, method is to check the final size of the backup file the next time you perform a full database backup. If you set up your backups to run from a command procedure, be sure to consider future growth in your calculations.

 

 


ReloadExpert v1.1

New England Software Concept's ReloadExpert version 1.1
offers many new features at a new lower price!

SCI has enhanced its ReloadExpert optimizing reload script generator with the introduction of version 1.1. This new version offers enhanced functionality at a new lower price!

The Need for Reloads...

Periodically reloading your VAX DBMS database is needed to maintain optimal database performance.

Productivity Tool

ReloadExpert automates the reload process and ensures that your reload strategy is truly optimized and always up-to-date! Imagine creating a truly optimized VAX DBMS reload routine in a matter of minutes!

Minimize Time to Reload

Reloading a production database can be a time-consuming process. You cannot afford an inefficient reload strategy.

ReloadExpert minimizes the time required to reload your database by producing truly optimized reload scripts and by making optimal use of system resources. ReloadExpert recommends optimal SYSGEN and UAF values for reloading your database based on your system's resources.

Accurate Reload Scripts

Your business depends on the data in your database. An erroneous reload script can seriously damage the integrity of your data. With ReloadExpert you can rest easy knowing that every set-type in your database has been carefully analyzed to ensure that the proper relationships will be maintained during the reload process.

Automatic Restarts

Imagine creating a truly optimized VAX DBMS reload routine
in a matter of minutes!

Should your system fail during an unload or load processes, ReloadExpert can automatically restart the unload or load at the point of failure, thus saving you from restarting the reload from the beginning!

Complete Set of Tools

ReloadExpert generates everything you need to efficiently reload your database, including:

Load and Unload Format Language File (LFL/UFL); defines the record format of the unload files, and all set-significant information needed to maintain proper set relationships.

Tape Support

ReloadExpert now supports the use of tape for reloading your database. You can unload portions of your database directly to tape, and then use the same tape during the load process.

Parallel Unloads

An optimized reload routine means less down-time. ReloadExpert allows you to unload portions of your database in parallel, thus reducing the overall reload time.

Prolog and Epilog routines

You can now write your own routines that ReloadExpert will execute immediately before and/or after each unload and load sequence to customize your reload. You can write routines to mount tapes, backup portions of your database, alter the placement of the unload RMS files, and many other uses!

User Defined Keys

Some database schema definitions do not guarantee unique keys for owner-record occurrences. ReloadExpert allows you to augment the schema definition by specifying User Defined Key (UDK) files that ReloadExpert uses to generate the reload scripts. In fact, ReloadExpert will even generate a template UDK file that can be easily edited to meet the unique needs of your database.

DBO LOAD facility

ReloadExpert generates scripts for the DBO/UNLOAD and DBO/LOAD facility.

This provides the following advantages:

Licensing

The ReloadExpert distribution license is only $4,995. Additional nodes can be licensed with the right-to-copy license for only $2,495. These are one-time license fees, and are independent of node-size.

*VAX DBMS is a trademark of
Digital Equipment Corp.

*ReloadExpert is a trademark of
Software Concepts International

 

FOR MORE INFORMATION

New England
Software Concepts

Nashua, NH 03063
(603) 879-9022

 


 

  

Phone

Fax

E-Mail

(603) 879-9022

(603) 879-9023


Copyright © 1996 Software Concepts International Inc. All rights reserved.