Kanister Simplifies Application-level Data Operations on Kubernetes 

IT departments have many choices for infrastructure and application deployment. Organizations choose containers for their portability, scalability and deployment speed, among other benefits. Adopting a cloud-native approach adds increased agility, lower capital expenditure (CAPEX), and improved scalability and reliability.

For teams working with Kubernetes, ensuring that all data, particularly application data, is protected can be a tricky proposition. With each Kubernetes release, users have seen improvements in support for running stateful workloads. However, on its own, Kubernetes lacks robust application data management capabilities.

The lifecycles and workflows behind cloud-native applications can be complex. Kubernetes currently allows admins to manage data in several ways: storage-centric snapshots, storage-centric snapshots combined with hooks or APIs into the application, or leveraging data services directly.

Admins may take a storage-centric approach or a data-centric approach using tools such as mysqldump or pg_dump. Briefly, each has its own pros and cons: storage snapshots are fast and application-agnostic but on their own capture only crash-consistent state, while data-centric dumps are application-consistent but slower and tied to a specific tool.
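For context, the storage-centric path is typically expressed through the CSI snapshot API. Here is a minimal sketch, assuming a CSI driver and the external-snapshotter CRDs are installed; the class and claim names are illustrative, not from the article:

```yaml
# Hypothetical storage-centric snapshot of a PostgreSQL data volume.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: pg-data-snapshot
  namespace: databases
spec:
  volumeSnapshotClassName: csi-snapclass   # illustrative snapshot class name
  source:
    persistentVolumeClaimName: pg-data     # PVC holding the database files
```

A snapshot like this is only crash-consistent unless the application is quiesced first, which is exactly the gap the hook-based and data-centric approaches try to close.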

Let’s talk a little about the application-centric approach to data operations in Kubernetes. 

An Application-centric Approach with Kanister

For DevOps teams using Kubernetes, Kanister is an open-source project that allows domain experts to capture application-specific data management tasks in blueprints that can be easily shared and extended. First posted on GitHub almost three years ago, Kanister takes care of the tedious details around application data management on Kubernetes and presents a homogeneous operational experience across applications at scale.

Kanister comprises four primary components:

  1. The Kanister controller: an operator, following the Kubernetes operator pattern, that manages Blueprints, ActionSets and Profiles.
  2. Blueprints: custom resources used to define workflows for operations such as backup, restore or delete. Essentially, they provide the hooks into the data service(s); a minimal sketch follows this list.
  3. ActionSets: custom resources used to execute a specific action from a specific Blueprint.
  4. Profiles: custom resources that determine the destination for backups or the source for restores (e.g., Amazon S3, Azure Blob Storage or another target).
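To make this concrete, here is a minimal sketch of a Blueprint with a single backup action. The overall structure (actions, phases and a Kanister function such as KubeTask) follows the Kanister custom resources, but the image, command and names below are illustrative rather than taken from an official Blueprint, and the phase pipes its dump through kando, a helper tool introduced below:

```yaml
# Hypothetical Blueprint: a "backup" action with one phase that dumps
# a MySQL database and streams the dump to the Profile's object store.
apiVersion: cr.kanister.io/v1alpha1
kind: Blueprint
metadata:
  name: mysql-blueprint
  namespace: kanister
actions:
  backup:
    phases:
      - func: KubeTask            # Kanister function that runs a short-lived pod
        name: dumpToObjectStore
        args:
          image: ghcr.io/kanisterio/mysql-sidecar:0.92.0  # illustrative tag
          command:
            - bash
            - -o
            - pipefail
            - -c
            - |
              # Credential wiring (e.g., templating the password from a
              # Secret) is omitted for brevity.
              mysqldump --single-transaction --all-databases \
                -h mysql.mysql.svc.cluster.local -u root \
                | kando location push --profile '{{ toJson .Profile }}' --path backup.sql -
```

Because the database-specific logic lives in the Blueprint, the same operational workflow (create an ActionSet, point it at a Profile) stays identical across very different data services.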

(A diagram at 10:05 in the video discussed below shows the flow between these components across several iterations.)

Kanister offers two additional command-line tools, Kanctl and Kando. Kanctl can be used to create ActionSets and Profiles, while Kando helps move data between the container and the object store.
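As an illustration of how the pieces fit together, here is a minimal sketch of an ActionSet and the Profile it references. Kanctl can generate resources like these; the field layout follows Kanister's v1alpha1 custom resources, while all names (mysql-blueprint, s3-profile and so on) are assumptions carried over from the Blueprint sketch above:

```yaml
# Hypothetical ActionSet: runs the "backup" action from the Blueprint
# above against a MySQL StatefulSet.
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  name: mysql-backup-1
  namespace: kanister
spec:
  actions:
    - name: backup                 # action defined in the Blueprint
      blueprint: mysql-blueprint   # which Blueprint to execute
      object:                      # workload the action targets
        kind: StatefulSet
        name: mysql
        namespace: mysql
      profile:                     # where the backup data should land
        name: s3-profile
        namespace: kanister
---
# Hypothetical Profile: an S3-compatible bucket plus the Secret that
# holds its credentials.
apiVersion: cr.kanister.io/v1alpha1
kind: Profile
metadata:
  name: s3-profile
  namespace: kanister
location:
  type: s3Compliant
  bucket: my-backup-bucket
  region: us-east-1
credential:
  type: keyPair
  keyPair:
    idField: access_key_id
    secretField: secret_access_key
    secret:
      apiVersion: v1
      kind: Secret
      name: s3-creds
      namespace: kanister
skipSSLVerify: false
```

When the controller sees the ActionSet, it resolves the referenced Blueprint and Profile and runs the backup phases; inside those phases, kando pushes the data to the bucket the Profile describes.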

In this informative video, Kasten by Veeam’s Pavan Navarathna discusses data management challenges in Kubernetes and demonstrates Kanister with real data. Users familiar with YAML can easily jump in and try it themselves. The demo was staged in conjunction with the Data on Kubernetes Community (DoKC), “an openly governed group of curious and experienced practitioners, taking inspiration from the CNCF and Apache Software Foundation.” DoKC’s goal is to “assist in the emergence and development of techniques for the use of Kubernetes for data.”

In addition to Kanister’s current capabilities, the team is planning to add a guide for writing Blueprints. A number of Blueprints already exist for popular databases, but for users whose database doesn’t yet have one, the guide should help them write their own. Also in the works are plans to add file storage as a backup destination and to add encryption, compression and deduplication for the data being moved.

Interested in trying Kanister? Visit Kanister.io to learn more, or download or fork the project on GitHub.
