When I explain how Veeam backups work, people often ask how data is moved for incremental backups and how incremental and differential backups differ. In this post, I’ll explain the differences, show what Veeam does and explain why the options are what they are today.
First, there are no differential backups with Veeam. The differential backup is something of a carry-over from the disk-to-tape era: a full backup is paired with a single backup that holds all the changes made since that full backup. A restore then needs only the full backup tape and a second tape holding the differential backup. This limits tape changes during a restore (otherwise multiple tapes would be required), which is another relic of the disk-to-tape era. The figure below visualizes a differential backup:
With a differential backup, a restore point can be reached from just two tape sets: the one holding the full backup and the one holding the differential backup. These may be the same tape or a pool of tapes, and implementations varied widely, but this was a common arrangement when tape was the first place backups landed. The design avoided swapping multiple tapes (say, a week’s worth) to do a single restore.
Now that disk-to-disk backup is the generally accepted practice for the first stage of a backup, the industry and Veeam have aligned on incremental backups for a number of reasons. Veeam specifically offers two incremental backup methods: forward incremental and reverse incremental. Both methods leverage an important technique called changed block tracking (CBT), which Veeam uses for both VMware vSphere and Microsoft Hyper-V backups.
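To make the contrast concrete, here is a minimal Python sketch of how much data each approach would move per day. It is a toy model, not Veeam code: the function name `blocks_to_transfer`, the block IDs and the sample week are all made up for illustration, and a real product would get the changed blocks from something like CBT rather than from a hand-written list.

```python
def blocks_to_transfer(daily_changes, method):
    """Count the blocks copied on each day after the full backup.

    daily_changes: list of sets, the block IDs modified on each day.
    method: "incremental" (changes since the previous backup) or
            "differential" (all changes since the last full backup).
    """
    transferred = []
    since_full = set()  # every block changed since the full backup
    for changed in daily_changes:
        since_full |= changed
        if method == "incremental":
            transferred.append(len(changed))
        elif method == "differential":
            transferred.append(len(since_full))
        else:
            raise ValueError(f"unknown method: {method}")
    return transferred


# Hypothetical week: block IDs changed on each day after the full backup.
week = [{1, 2}, {2, 3}, {4}, {1, 5, 6}]

incremental = blocks_to_transfer(week, "incremental")
differential = blocks_to_transfer(week, "differential")
print("incremental: ", incremental)
print("differential:", differential)
```

In this model the differential transfer can only grow until the next full backup, while the incremental transfer tracks each day's change rate. The trade-off is on the restore side: the differential restore needs the full backup plus one file, while the incremental restore needs the full backup plus every increment up to the chosen day.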
Forward incremental is Veeam’s default method for writing backups to disk. After the full backup has been taken (Sunday in the example below), each subsequent run transfers only the virtual machine’s new, unique blocks to a separate file, the .vib file. This repeats for each restore point created, shown as daily backups in the figure below. One additional technique is the synthetic full backup, which reduces stress on primary storage while still producing a complete restore point in one file. A synthetic full backup is created by reading the full backup in the .vbk file together with the incremental backups, building a new full backup file from contents already in the backup repository. The forward incremental backup is shown below:
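The forward incremental chain and the synthetic full can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not how Veeam stores data: a disk is a dictionary of block ID to contents, the changed block IDs are passed in by hand (standing in for CBT), deleted blocks are not modeled, and the helper names `take_full`, `take_incremental` and `synthetic_full` are hypothetical.

```python
def take_full(disk):
    """Full backup (.vbk in this model): a copy of every block."""
    return dict(disk)


def take_incremental(disk, changed_blocks):
    """Incremental backup (.vib in this model): only the changed blocks."""
    return {block: disk[block] for block in changed_blocks}


def synthetic_full(vbk, vibs):
    """Build a full backup from files already in the repository.

    The production disk is never read here, which is why the synthetic
    full puts no extra load on primary storage.
    """
    merged = dict(vbk)
    for vib in vibs:  # apply increments oldest first
        merged.update(vib)
    return merged


# Sunday: full backup.
disk = {0: "a0", 1: "b0", 2: "c0"}
vbk = take_full(disk)
vibs = []

# Monday: block 1 changes.
disk[1] = "b1"
vibs.append(take_incremental(disk, {1}))

# Tuesday: block 2 changes and block 3 is new.
disk[2] = "c2"
disk[3] = "d2"
vibs.append(take_incremental(disk, {2, 3}))

monday = synthetic_full(vbk, vibs[:1])  # restore point for Monday
new_full = synthetic_full(vbk, vibs)    # synthetic full as of Tuesday
print(monday)
print(new_full)
```

Note that restoring any day means reading the full backup plus every increment up to that day, so the chain only ever grows forward from the .vbk file.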
Veeam recommends the reverse incremental backup for general-purpose storage, such as a SAN or direct-attached storage devices holding your backups. With this method, the latest restore point is always a full backup file, and the previous restore points are represented by incremental files. Each run copies the VM’s unique blocks and injects them into the full backup. The blocks they replace, which represent the previous restore point, are written to a reverse incremental file (.vrb); one such file exists for each earlier restore point in the backup chain. The diagram below shows this process:
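The inject-and-roll-back idea can be sketched the same way. Again, this is a toy model rather than Veeam’s implementation: blocks live in a dictionary, the changed block IDs are supplied by hand, `None` is used as a marker for a block that did not exist in the previous restore point, and the function names `reverse_incremental` and `restore_previous` are hypothetical.

```python
def reverse_incremental(full, disk, changed_blocks):
    """Inject changed blocks into the full backup in place.

    Returns the rollback file (.vrb in this model): the old contents of
    every block that was overwritten, or None if the block is new.
    """
    vrb = {}
    for block in changed_blocks:
        vrb[block] = full.get(block)  # save what is about to be replaced
        full[block] = disk[block]     # full backup now holds the latest data
    return vrb


def restore_previous(full, vrbs_newest_first, steps_back):
    """Rebuild an older restore point by undoing the newest changes first."""
    state = dict(full)
    for vrb in vrbs_newest_first[:steps_back]:
        for block, old in vrb.items():
            if old is None:
                state.pop(block, None)  # block did not exist back then
            else:
                state[block] = old
    return state


# Sunday: full backup.
disk = {0: "a0", 1: "b0"}
full = dict(disk)

# Monday: block 1 changes.
disk[1] = "b1"
vrb_mon = reverse_incremental(full, disk, {1})

# Tuesday: block 0 changes and block 2 is new.
disk[0] = "a2"
disk[2] = "c2"
vrb_tue = reverse_incremental(full, disk, {0, 2})

vrbs = [vrb_tue, vrb_mon]  # newest first
monday = restore_previous(full, vrbs, 1)
sunday = restore_previous(full, vrbs, 2)
print(full)
print(monday)
print(sunday)
```

The latest restore point is read straight from the full backup file with no merging at all, and only older restore points need the .vrb files applied. The cost is that every backup run modifies the full backup file on the repository.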
To bring this all into perspective, Veeam Backup & Replication v7 introduced tape support, allowing users to write backups to tape with either a backup-to-tape job or a file-to-tape job. How files are moved to tape is explained in the Help Center. The full backup and its associated restore points are kept intact when written to tape. However, Veeam Backup & Replication v9 introduced GFS (grandfather-father-son) retention on tape, which reduces the number of tapes required to meet weekly, monthly, quarterly or annual retention requirements.
For today’s Availability requirements, the incremental approach is by far preferred. For one, the transfer for each restore point (in any Veeam example) is only the unique blocks changed since the last backup. In the differential approach, systems with high change rates would transfer all new data since the last full backup every time, which can increase backup times and stress on primary storage. What factors go into your decision process to select a backup method? Share your comments below.