HomeSearchSite MapContact Us

   RemoteDBA Services for CODASYL DBMS and Rdb Remote Management of OpenVMS Systems Preserving Mission Critical IT Applications Through Virtualization

DATABASE
ADMINISTRATION

 
 

Is Your Database Backup Strategy Really Effective?
TECHNICAL ARTICLES -> MAINTENANCE ARTICLES
Back ] Next ] Printer Friendly Version ]

Oracle Rdb and Oracle CODASYL DBMS provide backup and restore utilities that offer many different backup options. A well-constructed database backup and recovery strategy will include a combination of these functions selected to match the needs of your particular environment. There is no single backup strategy that works best for all configurations. The wrong strategy could even cost more than it is worth.

 

A proper database backup strategy should strive to achieve a balance between minimizing operational overhead and maximizing database integrity. Over the life of a database, you will perform many more backups than restores. Since you cannot predict when a disk device will fail, or when a database will become corrupt, you must continually backup your data as it changes. This article contains suggestions and recommendations on making frequent database backup operations easier and infrequent restore operations more reliable.

 

RULES OF A STRATEGY

A database backup strategy should be customized for your environment. It should take into consideration the system workload, usage schedule, importance of data, and hardware environment of the database. However, there are some guidelines that apply to all databases. Incorporating these guidelines into your strategy will ensure more reliable and more cost effective backups. Oracle strongly recommends the following guidelines:

 

  • Always use after-image journal (.AIJ) FILES.

  • Always maintain a log of your backup schedule.

  • Occasionally store backups in a separate location.

  • Do not use the VMS Backup Utility on a database.

  • Always spread your database across several disks.

  • Always consider your existing hardware configuration and system workload.

  • Determine the volatility and value of your data.

  • Verify the integrity of your database before each full backup.

  • Verify the integrity of your backup media after each backup.

 

ALWAYS USE AFTER-IMAGE (AIJ) FILES

Oralce strongly recommends the use of after-image journaling. As data modifications are made to your database, an .AIJ file captures the operations each transaction makes. This provides exceptional recovery capabilities because you will be able to restore your database to the same state it was in before the corruption or hardware failure occurred. Considering the balance between cost effectiveness and data reliability, .AIJ files typically cost 5-10% (or less) in overhead but provide essentially a continuous backup of your transactions.

 

You should not enable and disable after-image journaling at different times. To be effective, journaling must be used consistently. Leaving "holes" in your journal file defeats the purpose of this option. An .AIJ file should always be on a disk separate from all associated database files. If this disk is lost, you should restart the journaling and then immediately perform a full database backup. If the disk with the root file or storage area files is lost, you restore the lost files from backup and then use the .AIJ file to recover the most recent transactions. If either the root file and .AIJ files, or the storage area and .AIJ files were on the same disk, you will lose data if the disk failed. The database reference manuals contain additional information on backing up and recovering .AIJ files.

 
ALWAYS MAINTAIN A BACKUP SCHEDULE AND LOG

A well-designed database backup and recovery strategy is useless if the strategy is not publicized and practiced. Operational personnel need to be aware of the importance of the backup methodology, and they need to be able to quickly access backup media in the event database recovery is necessary. Accurate record-keeping and media storage are essential to a proper backup and recovery strategy. The backup schedule should include dates and times, type of backup and tape labels.

 
OCCASIONALLY STORE BACKUPS IN AN OFFSITE LOCATION

Backups are performed to facilitate database restoration, if your primary media fails, once the problem is corrected. There are many possible sources of failure of your primary media, and some of these sources must be considered in your backup strategy. Also, always remember that the data in your database is much more valuable than the physical media it is stored on.

 

If a site catastrophe destroys your disks, would your backup media still be safe? While it may be convenient, leaving your backup tapes on top of the tape drive is not a good idea. You should store you backup media in a separate location whether offsite or just another area of the building. Although offsite storage is the safest method, access to an offsite archive will take longer than one on-site, and there may be security issues about removing confidential data from your site. A compromise might be to send backups offsite only once a month. Traditionally, you send the previous month's or week's backup offsite, keeping the most recent backup in a safe location on-site.

 

Magnetic tapes are an inexpensive, reliable, and reusable backup media. The environmental parameters for tapes are often broad and sometimes can lead to abuse. The audio tapes stored in your car can develop some hiss, and the video tapes in your home can drop out some colors, but if your backup tapes are damaged in this way, they are useless. You must follow the manufacturer's instructions on temperature, humidity, magnetic fields, and tape life span. A specially constructed tape library or archive is the safest storage strategy.

 

DO NOT USE THE VMS BACKUP UTILITY ON A DATABASE

In the earliest versions of DBMS, the VMS Backup Utility was a legitimate alternative to the database Backup and Restore Utilities. Today, there are many reasons not to use the VMS backup utility. For example:

 

  • VMS backup does not consider database free space and wastes time and tape backing up unused space in the database.

  • VMS backup does not understand the relationships between files. Databases spread across several devices would need special attention to ensure that the correct database root file, .AIJ and storage area files are backed up safely and that they are synchronized.

  • VMS backup does not provide the means for a database to be backed up safely when the database is in use. You would need to shut down your database during backups to ensure that no transactions occurred during the backup operation; this is clearly unacceptable for 7x24 applications.

  • VMS backup may lock database files, thus preventing access to the database for large periods of time.

 

The multi-threaded database backup utility is designed specifically for Rdb and DBMS databases. We recommend that you always use the multi-threaded backup capabilities of the product when backing up a a DBMS database (Rdb always use the multithreaded backup) to tape because the utility:

 

  • Operates faster than VMS Backup because it has been specifically optimized for databases.

  • Uses less tape than VMS backup because it does not backup free space.

  • Understands file dependencies and relationships.

  • Can write to multiple tape devices simultaneously, reducing the total amount of backup time.

  • Allows backups to be accomplished without shutting down the database.

 

The ability of the database backup utility to write to multiple tape drives simultaneously depends on the degree of concurrency (i.e., the number of simultaneous writes) the number of drives, TCUs (controllers), HSC STI ports and how they have been cabled together. Simultaneous writing to multiple tape derives also depends on specific versions of Rdb and DBMS. The following table identifies what capabilities exist in the various versions of the database products.

 

DBMS Rdb Tape Drive Capability
V4.3

V4.1

Will give you the option of overriding the auto-configuration done by the backup utility.
V4.2 V4.0 Will treat all drives that appear on the same HSC as not capable of simultaneous writes.
V4.1 V3.1 Does not explicitly support TA9x drives, although the drives may work to some degree.

 

Multi-threaded backup uses the same error recovery and group redundancy as the VMS Backup utility. Considering all factors in the database environment, it is significantly more reliable than VMS Backup. Never use VMS file operations such as COPY, RENAME and DELETE on database files.

 
ALWAYS SPREAD YOUR DATABASE ACROSS SEVERAL DISKS

A backup strategy should provide guidelines on backups and restores of your database. In addition, your overall database strategy should include safeguards for protecting your data from becoming corrupt and needing a restoration. While you cannot predict hardware failures, you can minimize their effects.

 

By spreading your database across several disks, you can eliminate having a single point of failure even if your database is not large enough to require multiple disks. We strongly recommend that you should always put the .AIJ files on a separate disk from the rest of the database. In addition, balancing busy storage areas between different disks is often a good tuning suggestion, but it also makes sense from a safety point of view. The Rdb and DBMS database reference manuals provide additional information on backing up individual storage areas of a database.

 

Whether you backup your database to tapes or to disk, your backup schedule could overlap the routine VMS backups of your system (especially if your database files share their disks with other users.) Because VMS backups often run at a high priority, you should consider waiting until they are finished before starting a database backup. There will be less chance of mixing up tapes, less contention for the disks, and, if your system has multiple tape drives, you can allocate all of them to a multi-threaded database backup.

 
ALWAYS CONSIDER YOUR EXISTING HARDWARE CONFIGURATION & SYSTEM WORKLOAD

The decision to backup your database either to tape or to disk will be determined primarily by your hardware environment and by personal preference. Here are a few suggestions to consider:

 

  • Rather than backup to disk and then to tape, it is often faster and more reliable to backup directly to tape.

  • Multi-threaded backup to tape is usually faster than a single-threaded backup to disk. If you can allocate two tape drives, the backups complete in about half of the normal time. Using three drives completes the task in about a third of the time. In some configurations, the speed of multi-threaded backups is limited more by the VAX CI (Cluster Interconnect) port than by the speed of the tape drives.

  • Historically, disk media is more reliable and faster than tapes. However, multi-threaded backup uses redundancy blocks which are more reliable than the single verify pass you get with a disk.

  • Tapes are cheaper and easier to store than removable disk media. You can afford to keep more old backup versions around before you recycle the tapes. Try to maintain a minimum of three (3) versions of full database backups, since a single past version is not sufficient for a reliable backup strategy.

  • If you backup to non-removable disk media, you can probably only keep one past set of backups, which is also insufficient for a reliable backup strategy.

 
DETERMINE THE VOLATILITY AND VALUE OF YOUR DATA

The volatility of your database refers to how often data changes. If you can determine what data is changing, and how often, you can create a strategy based on backing up the most frequently changed storage areas more often. For example, the data in an inventory area could be updated several times an hour, an employee field could be updated at the end of every shift, and a list of suppliers might only change every few months. Determining and mapping this workload will help determine the needs of your site.

 

The time required to complete incremental backups, and especially storage area incremental backups, is a good indication of changes to a database. Consider that a percentage of data in a database changes daily. Calculate this percentage and perform a full backup when 10% of a database has changed. While it is difficult to calculate this figure, it is much simpler to observe that when incremental backups take longer to complete each day, a greater percentage of a database has been changed since the last full backup. In fact, because of the extra calculations involved, an incremental backup could take longer than a full backup. A large incremental restore can take much longer than a full restore. Through observation, you should be able to determine the break-even point for performing a full backup rather than an incremental backup.

 

In addition to the frequency of backups, a backup strategy should also consider the differing daily usage of a database. For example, a typical backup schedule might include incremental backups every night and a full backup on Friday night. However, if an examination of your usage shows that most changes are made on Wednesdays, they you should consider moving the full backup to Wednesday night. Similarly, if your database is accessed by users in multiple time zones, you might consider starting backups an hour or two later when the activity level has quieted down. Scheduling backups around user activities causes less impact on the users and results in more efficient backups.

 
VERIFY THE INTEGRITY OF YOUR DATABASE BEFORE EACH FULL BACKUP

A backup strategy is useless if the database being backed up is corrupt to begin with. Rdb and DBMS contain online utilities to verify the correctness of a database; these tools should be used prior to starting the database backup operation. No database backup operation should be allowed to proceed if the database has not been verified as consistent in all respects.

 
VERIFY THE INTEGRITY OF YOUR BACKUP MEDIA AFTER EACH BACKUP

One of the most common mistakes made is not verifying the integrity of the database backup media; at least, not until the backup is needed to be restored and then it is too late. After a database is backed up, the backup media should be verified. Verification ensures:

 

  • that the backup strategy works as intended.

  • that the backup media is verifiable.

  • that the backup media is restorable.

 

There are several methods available to verify the integrity of the database backup media. However, only one method is guaranteed to accurately reflect the integrity of the backup media; the actual restoration or recovery of a database. Anything else is merely verifying the media, not the contents of the media.

 
SPEEDING UP BACKUPS

For backup to tape, the multi-threaded backup is optimized to proceed at whatever speed the hardware configuration will allow. Allocating more tape drives decreases backup times proportionally. You can also save some operator time by formatting all of your tapes ahead of time.

 

For backing up a database directly to disk, use the single-threaded backup utility, and set up RMS to have a large extend size. This reduces both fragmentation and the need to repeatedly expand the backup file. For example, RMS can allocate 50,000 blocks much faster than it can allocate 5,000 blocks 100 times. This extend time becomes significant when compared to the amount of time for a simple write operation. You want to minimize the number of times the file extends while not setting it too large for your available disk space. If you can accurately calculate the final size of the resulting backup file, you can use the SET RMS_DEFAULT command in DCL to optimize the default extend size.

 

For example, assume that a backup file is 1,005 blocks. You cannot directly specify the initial size of the file, but it is typically quite small (typically 3 blocks in most VMS systems). If you set the default extend to 1,000 blocks, the backup utility starts writing, gets to the end-of-file (after 3 blocks) and extends the file to 1,003 blocks. The backup utility again gets to the end-of-file and expands the 1,003 block file by 1,000 blocks, giving 2,003 blocks. When the database backup finishes, the file is truncated down to 1,005 blocks, but while the file is being written, 998 wasted blocks are allocated.

 

It is sometimes possible to estimate the backup file size by analyzing your database to determine how full each storage area is. For example, use the space utilization summary pages generated by DBO/ANALYZE/NOSET command.

 

A simpler, and more reliable, method is to check the final size of the backup file the next time you perform a full database backup. If you set up your backups to run from a command procedure, be sure to consider future growth in your calculations.

 

**Note:  This article was originally submitted by Digital for publication for SCI Newsletter, DBAdvisor, Issue #2 1992.

 

Top


Was this article helpful? 
This information and more was originally published in SCI's technical newsletter, DBAdvisor.
Past issues are available online; you may also fill out a subscription form.

If you have a technical question about this article or about any other 
CODASYL DBMS or Rdb topic, then ask our experts.

How would you rate this article?

5 (Highest)
4
3
2

0 (Did not apply)
Comments about this article or topic suggestions for new articles

Copyright © 2008 Software Concepts International
All Rights Reserved