IBM Support

The Abuse of Backup Copies as Archives

Technical Blog Post


Abstract

The Abuse of Backup Copies as Archives

Body

I've talked to several customers who have taken up the bad habit of keeping their backup copiesfor several years for "compliance reasons".

In my post last year [Lost In Translation], I talked about the different meanings of archive:

In explaining the word "archive" we came up with two separate Japanese words. One was "katazukeru", and the other was "shimau". If you are clearing the dinner plates from the table after your meal, for example, it could be done for two reasons. Both words mean "to put away", but the motivation that drives this activity changes the word usage. The first reason, katazukeru, is because the table is important, you need the table to be empty or less cluttered to use it for something else, perhaps play some card game, work on arts and craft, or pay your bills. The second reason, shimau, is because the plates are important, perhaps they are your best tableware, used only for holidays or special occasions only, and you don't want to risk having them broken. As it turns out, IBM supports both senses of the word archive. We offer "space management" when the space on the table, (or disk or database), is more important, so older low-access data can be moved off to less expensive disk or tape. We also offer "data retention" where the data itself is valuable, and must be kept on WORM or non-erasable, non-rewriteable storage to meet business or government regulatory compliance.

The process of archiving your data from primary disk to alternate storage media can satisfy both motivations.

IBM offers software specifically to help with this archival process.For email archive, IBM offers [IBM CommonStore] for Lotus Domino and MicrosoftExchange. For database archive, including support for various ERP and CRM applications, IBM offers [IBM Optim] from the acquisition of Princeton Softech.

The problems occur when companies, under the excuse of simplification or consolidation, feel they can just usetheir backups as archives. They are taking daily backups of their email repositories and databases, and keepingthese for seven to ten years. But what happens when their legal e-discovery team needs to find all emails or database records related to a particular situation, an employee, client or account? Good luck! Most backupsare not indexed for this purpose, so storage admins are stuck restoring many different backups to temporary storage and combing through the files in hopes to find the right data.

Backups are intended for operational recovery of data that is lost or corrupted as a result of hardware failures, application defects, or human error. Disk mirroring or remote replication might help with hardware failures, but any logical deletion or corruption of data is immediately duplicated, so it is not a complete solution. FlashCopy or Snapshot point-in-time copies are useful to go back a short time to recover from logical failures, but since they are usually on the same hardware as the original copies, may not protect against hardware failures. And then there's tape, and while many people malign tape as a backup storage choice, 71 percent of customers send backups to tape, according to a 2007 Forrester Research report.

Backups often aren't viable unless restored to the same hardware platform, with the same operating system and application software to make sense of the ones and zeros. For this reason, people typically only keep two to five backup versions, for no more than 30 days, to support operational recovery scenarios. If you make updatesto your hardware, OS or application software, be sure to remember to take fresh new backups, as the old backupsmay no longer apply.

Archives are different. Often, these are copies that have been "hardened" or "fossilized" so that they make sense even if the original hardware, OS or application software is unavailable. They might be indexed so that they can be searched, so that you only have to retrieve exactly the data you are looking for. Finally, they are often stored with "rendering tools" that are able to display the data using your standard web browser, eliminating the need to have a fully working application environment.

Take any backup you might have from five years ago and try to retrieve the information. Can you do it? This might be a real eye-opener. You might have inherited this backup-as-also-archive approach from someone else, and are trying to figure out what to do differently that makes more sense. Call IBM, we can help.

technorati tags: , , , , , , , , , , , , , , , , , ,

[{"Business Unit":{"code":"BU054","label":"Systems w\/TPS"},"Product":{"code":"HW206","label":"Storage Systems"},"Component":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"","Edition":"","Line of Business":{"code":"LOB26","label":"Storage"}}]

UID

ibm16161691