Implementing Cloud Object Storage Solutions – Part Two

Disclaimer:
Due to the pace at which the cloud changes, it is important to note that this article was originally published on August 25, 2021, and as a result, things may have changed by the time you review it.

Welcome back for part two of my Cloud Object Storage Deep Dive! If you missed it, in part one we looked at the characteristics that make up the cloud object storage offerings of the three main cloud providers (Microsoft Azure, Amazon Web Services/AWS and Google Cloud Platform/GCP).

Whilst this content aligns with the public cloud object storage that Veeam currently supports (AWS, Azure, GCP and IBM Cloud), keep in mind that Veeam currently lists 26 Veeam Ready – Object storage platforms from its partners. These are often on-premises deployments, but much of the same logic applies there too, so don’t overlook the on-premises object storage options.

Now that we understand what to look for from our cloud providers and the features they have available, we’ll look at what information we need to gather to build our own cost and performance estimates to implement object storage ourselves.

What are our important metrics?

Backup size

Primarily, you’ll be backing up data to the cloud and keeping a static copy for your required retention policy, so it’s important to understand both your full backup requirements and the size of your incremental backups. This information can be gathered efficiently from the Veeam Backup & Replication console: select Home, choose the Backups section, then right-click the backup job you’d like to copy or move to the cloud and choose Properties. You’ll see a screen similar to the image below:

Veeam backup properties window

The key information provided is the original “Data Size” (the size of the source virtual machine), the backup size, the deduplication ratio achieved, and the compression ratio achieved. When a job contains multiple VMs and multiple restore points, you can click through each job run for each VM to gather these metrics.
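To show how these figures translate into an object storage footprint, here’s a minimal sketch that estimates the data you’d send to the cloud. The source size, data reduction ratio, change rate and retention values are illustrative assumptions only, so substitute the numbers from your own job properties.

```python
# Rough footprint estimate from the job properties values.
# All figures below are illustrative assumptions.
source_size_gb = 2048        # "Data Size" reported for the job
reduction_ratio = 0.5        # combined dedup/compression (backup size / data size)
daily_change_rate = 0.05     # fraction of the source that changes per day
restore_points = 30          # daily restore points to keep

full_backup_gb = source_size_gb * reduction_ratio
incremental_gb = source_size_gb * daily_change_rate * reduction_ratio
total_footprint_gb = full_backup_gb + incremental_gb * (restore_points - 1)

print(f"Full: {full_backup_gb:.0f} GB, daily incremental: {incremental_gb:.0f} GB")
print(f"Estimated footprint in object storage: {total_footprint_gb:.0f} GB")
```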

Block size

Now that we know the size of our backups, full and incremental, we’re done, right? Wrong! Whilst this is a key piece of data, we also need to know the structure of the data: the data block size. Within a Veeam Backup & Replication job, under the “Storage > Advanced > Storage” section, you can define a data block size from the “Storage Optimization” section.

Advanced Settings’ Storage tab

This block size relates to the size of the blocks that Veeam reads from your VM at a time. Full details of how these blocks are handled are outside the scope of this article, but the options available and their relative block sizes are listed below:

Type                           Block Size
Local Target (Large Blocks)    4096 KB
Local Target                   1024 KB
LAN Target                     512 KB
WAN Target                     256 KB

These block sizes are important, and choosing one comes with a trade-off: each block is stored as one object and results in an API call. By utilising smaller blocks such as “WAN Target,” you’ll reduce your data footprint, but the increased number of API calls may impact the economic model. More information can be found here. Therefore, each block size will have its own characteristics when data is put into object storage.
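As a rough illustration of that trade-off, the sketch below approximates the number of objects (and therefore PUT requests) created when offloading a given amount of data at each block size. The data volume and per-request price are placeholder assumptions, and in practice Veeam compresses each block before upload, so treat this as showing the relative difference between the options rather than exact figures.

```python
# Approximate object count and PUT request cost per block size option.
# Data volume and request price are placeholder assumptions.
offloaded_data_gb = 1024
put_price_per_1000 = 0.005   # example write-request price; varies by provider/region

block_sizes_kb = {
    "Local Target (Large Blocks)": 4096,
    "Local Target": 1024,
    "LAN Target": 512,
    "WAN Target": 256,
}

for name, block_kb in block_sizes_kb.items():
    objects = offloaded_data_gb * 1024 * 1024 / block_kb   # one object per block
    cost = objects / 1000 * put_price_per_1000
    print(f"{name:<28} ~{objects:>12,.0f} objects  ~${cost:,.2f} in PUT requests")
```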

Data tiers

Veeam supports using the cloud as an archive tier and gives you the option to store GFS backups on an archive tier (currently for Azure and AWS). As discussed in part one, though, archive tiers tend to come with minimum retention requirements from the cloud providers, and there are financial penalties for deleting data before that minimum has elapsed. Veeam has safeguards you can put in place to prevent this. When configuring your archive tier, you define at what age backups should be migrated to it, with the option to only process backups that, based on your GFS retention policy, will meet the minimums specified by your cloud provider.

Storage settings section of the Scale-out Backup Repository Archive Tier

The ability to offload backups to the archive tier regardless of compliance with the cloud provider’s minimum retention policy was a strongly requested community feature: offloading data to the archive tier, deleting it before the required minimum and paying the penalty was often still cheaper than keeping the data in the capacity tier for the required length of time. Be sure to check the numbers before you disable this setting, though. More information on the capacity tier is here. More information on the archive tier is here.
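To see why that can work out, here’s a back-of-the-envelope comparison between keeping data in the capacity tier and archiving it with an early deletion penalty. Every price, size and retention period below is a placeholder assumption rather than a published rate.

```python
# Capacity tier vs archive tier with an early deletion penalty.
# All prices, sizes and periods are placeholder assumptions.
data_gb = 5120
capacity_price_gb_month = 0.02   # capacity tier storage price
archive_price_gb_month = 0.002   # archive tier storage price
archive_minimum_days = 180       # provider's minimum retention for the archive class
actual_retention_days = 90       # how long the GFS point is really needed

capacity_cost = data_gb * capacity_price_gb_month * (actual_retention_days / 30)

# Early deletion penalties typically bill as if the data stayed for the minimum period.
billable_days = max(actual_retention_days, archive_minimum_days)
archive_cost = data_gb * archive_price_gb_month * (billable_days / 30)

print(f"Capacity tier for {actual_retention_days} days: ${capacity_cost:,.2f}")
print(f"Archive tier incl. penalty:   ${archive_cost:,.2f}")
```

With these placeholder figures the archive tier still comes out cheaper despite the penalty, which is exactly the calculation worth running before disabling the safeguard.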

How do we calculate our costs based on these metrics?

To create realistic estimates, let’s put this information to work. The best way to do that is to look at the different object storage scenarios and the metrics involved in each operation.

Scenario one: Backing up data to object storage

When we back up data to object storage, we primarily incur costs for the API requests used to write each block as an object (PUT operations) and for the storage capacity those objects consume, billed per GB per month.
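Putting those two components together, here’s a minimal monthly cost sketch for this scenario; the volumes, block size and prices are placeholder assumptions.

```python
# Scenario one sketch: monthly cost of landing backups in object storage.
# Volumes, block size and prices are placeholder assumptions.
uploaded_gb_this_month = 1500      # fulls + incrementals offloaded this month
stored_gb = 6000                   # total data at rest in the bucket
storage_price_gb_month = 0.02
put_price_per_1000 = 0.005
block_kb = 1024                    # "Local Target" block size

put_requests = uploaded_gb_this_month * 1024 * 1024 / block_kb
monthly_cost = (stored_gb * storage_price_gb_month
                + put_requests / 1000 * put_price_per_1000)
print(f"~{put_requests:,.0f} PUT requests, estimated monthly cost ${monthly_cost:,.2f}")
```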

Scenario two: Restoring data from object storage

By recovering data, we primarily incur costs for the API requests used to read each object back (GET operations) and for the data egress out of the cloud provider’s network, plus any retrieval charges if the data sits in an archive tier.

Veeam attempts to reduce the cost of data retrieval from object storage by fetching any blocks that exist locally from local storage, even if the required backup isn’t stored locally in its entirety.
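To illustrate how much that local fetching can help, the sketch below estimates a restore cost when a fraction of the required blocks already exist on local storage. The restore size, block size, prices and local fraction are all placeholder assumptions.

```python
# Scenario two sketch: restore cost when some blocks can be read locally.
# All figures are placeholder assumptions.
restore_size_gb = 500
fraction_available_locally = 0.3   # blocks Veeam can fetch from local storage
block_kb = 1024
get_price_per_1000 = 0.0004
egress_price_per_gb = 0.09

gb_from_cloud = restore_size_gb * (1 - fraction_available_locally)
get_requests = gb_from_cloud * 1024 * 1024 / block_kb
restore_cost = (get_requests / 1000 * get_price_per_1000
                + gb_from_cloud * egress_price_per_gb)
print(f"~{get_requests:,.0f} GET requests and ~{gb_from_cloud:.0f} GB of egress, "
      f"estimated restore cost ${restore_cost:,.2f}")
```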

Scenario three: Retention policy processing

Eventually, your backups will age to the point that they’re no longer required. When this happens, it’s still possible to incur costs, most notably early deletion penalties if data is removed from a storage class before the cloud provider’s minimum retention period has elapsed.
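A quick check like the one below can flag whether a retention run will trigger such a charge; the minimum period and archive price are placeholder assumptions.

```python
# Scenario three sketch: estimate an early deletion charge when retention
# removes data before the storage class minimum. Values are placeholder assumptions.
archive_minimum_days = 180
archive_price_gb_month = 0.002

def early_deletion_charge(size_gb: float, age_days: int) -> float:
    """Charge for the unmet portion of the minimum retention period."""
    remaining_days = max(archive_minimum_days - age_days, 0)
    return size_gb * archive_price_gb_month * (remaining_days / 30)

print(f"Deleting 1 TB after 60 days: ~${early_deletion_charge(1024, 60):,.2f} penalty")
```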

Conclusion

We’ve now seen how settings within Veeam can impact the efficiency of using object storage and how different operations can incur costs. In part three, we’ll run some benchmarks to provide real figures that highlight the impact these choices can have on the performance and cost of using object storage.

 
