NetApp Article Sample
Deduplication is only as reliable as the underlying hardware and software. In fact, although it may not be immediately obvious, deduplication makes reliability even more crucial.
For example, suppose that you run a fairly standard backup schedule with nightly incrementals and weekly full backups. Now suppose that you create a file at the beginning of the month and then make no changes to it. With traditional backups, you’ll have four copies of the file at the end of the month: one for each week’s full backup. If you need to restore the file at that point, the probability is high that you’ll be able to restore at least one of the four copies, even if your backup medium is not very reliable.
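To put rough numbers on that intuition, here is a minimal sketch. It assumes each copy fails independently with the same probability, which is a simplification (real media failures are often correlated). The function name and the 10% failure rate are made up for illustration.

```python
def restore_probability(copy_failure_rate: float, copies: int) -> float:
    """Chance that at least one of `copies` independent copies is readable."""
    # All copies must fail for the restore to fail.
    return 1.0 - copy_failure_rate ** copies


# Hypothetical, deliberately poor medium: each copy is unreadable 10% of the time.
four_copies = restore_probability(0.10, 4)
one_copy = restore_probability(0.10, 1)

print(f"four copies: {four_copies:.4f}")  # four copies: 0.9999
print(f"one copy:    {one_copy:.4f}")     # one copy:    0.9000
```

Under these assumptions, four copies on a mediocre medium restore far more reliably than a single copy on the same medium.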
But when you bring deduplication into the picture, at the end of the month you have only one physical copy of the file (the copy created by the first full backup) plus three sets of pointers to the same file blocks. If that one copy is damaged, every backup that points to it is affected. Over the course of a year, you could have hundreds of backups that reference most of the same data blocks. Based on this simple example, it should be clear that you want your deduplicated backups stored on resilient hardware with good RAID protection.
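The shared-block effect can be sketched in a few lines of Python. This is a toy model, not how any particular product implements deduplication: the `DedupStore` class, the 4 KB fixed block size, and the SHA-256 fingerprints are all assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for this toy model


class DedupStore:
    """Toy block store: each unique block is kept once, backups hold pointers."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block bytes (the only physical copy)
        self.backups = {}  # backup name -> ordered list of fingerprints

    def backup(self, name, data):
        pointers = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fingerprint, block)  # store only if new
            pointers.append(fingerprint)
        self.backups[name] = pointers

    def restore(self, name):
        parts = []
        for fingerprint in self.backups[name]:
            block = self.blocks.get(fingerprint)
            if block is None:
                raise RuntimeError(f"{name}: block {fingerprint[:8]} is missing")
            parts.append(block)
        return b"".join(parts)


# One unchanged file made of four distinct blocks, backed up weekly for a month.
file_data = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
store = DedupStore()
for week in range(1, 5):
    store.backup(f"week{week}", file_data)

unique_blocks = len(store.blocks)  # 4 physical blocks, not 16

# Simulate losing a single physical block.
del store.blocks[store.backups["week1"][0]]

failed = []
for name in store.backups:
    try:
        store.restore(name)
    except RuntimeError:
        failed.append(name)

print(unique_blocks, failed)
```

In this sketch, four weekly backups occupy only four physical blocks, and the loss of one block makes all four backups unrestorable. That is the reason the protection underneath the block store matters so much.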