When the piano was invented, musicians noticed that it had become much easier to produce a sound with the instrument. But pianists were expected to play many more notes than on earlier instruments; the complexity had simply moved elsewhere.
Kubernetes makes application deployment easier, yet application features require an increasing number of components and requirements. A good example of this is the IBM Blockchain Platform network, in which trust in transactions can be distributed across organizations.
Four organizations (R1, R2, R3 and R4) have jointly agreed to set up and operate a Hyperledger Fabric network.
Without going into all the details, this application is complex to back up and restore. It creates:
- At least 11 PVCs that must be backed up and restored in the right order.
- More than 12 deployments if we include the client application.
- Multiple secrets, configmaps, service accounts, services and routes.
- Multiple IBP custom resources managed by an operator that creates the deployment and PVCs.
IBM provides an IBP console that helps with deploying peers, orderers and CA nodes on the network. It also helps to manage identities and channels, and to visualize the transactions on the ledger. This UI is very handy, but it constantly creates new PVCs, deployments, services and routes as the network grows, all of which must be backed up regularly. Additionally, the operator itself is managed by the Operator Lifecycle Manager on OpenShift, which adds even more complexity.
Without a tool such as Kasten K10 by Veeam, protecting and restoring such applications is challenging. In this post, we’ll demonstrate the power of Kasten K10 using this scenario as an example.
Assumptions
In this tutorial, we assume that you have installed Kasten on OpenShift. (If that’s not the case, read this blog post.)
My base domain on OpenShift will be michael-1.aws.kasten.io.
We will restore the application without the Operator Lifecycle Manager, to demonstrate that restoration is also possible on a cluster where OLM does not exist (for example, vanilla Kubernetes).
Step 1: Create a Blockchain Network
The first step is to create the blockchain namespace:
oc new-project my-blockchain
Next, to install the blockchain operator, we browse to the Red Hat Marketplace and log in or create a new account. Then, we register the cluster with the Red Hat Marketplace.
Now, we’ll go to the Operator Hub. In the search bar, type “blockchain” to load the blockchain tile.
Step 2: Apply the Security Context Constraint
The next step is to apply the following security context constraints object (you can also copy and save it to the local system as ibp-scc.yaml and apply it from there):
cat <<EOF | oc apply -f -
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegeEscalation: true
allowPrivilegedContainer: true
allowedCapabilities:
- NET_BIND_SERVICE
- CHOWN
- DAC_OVERRIDE
- SETGID
- SETUID
- FOWNER
apiVersion: security.openshift.io/v1
defaultAddCapabilities: []
fsGroup:
  type: RunAsAny
groups:
- system:serviceaccounts:my-blockchain
kind: SecurityContextConstraints
metadata:
  name: my-blockchain
readOnlyRootFilesystem: false
requiredDropCapabilities: []
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
- "*"
EOF
Then, we run the following command to grant the constraint to all service accounts in the project:
oc adm policy add-scc-to-user my-blockchain system:serviceaccounts:my-blockchain
When the command is successful, we will see a response that is similar to the following example:
securitycontextconstraints.security.openshift.io/my-blockchain created
scc "blockchain-project" added to: ["system:serviceaccounts:my-blockchain"]
Step 3: Deploy the IBM Blockchain Platform console
There are four resource types listed under “Provided APIs”:
- IBP CA (Advanced users): Deploys an instance of an IBM Blockchain Platform CA.
- IBP Console: The IBM Blockchain Platform console UI, or “console”, is an award-winning user interface for building your blockchain network.
- IBP Orderer (Advanced users): Deploys an instance of an IBM Blockchain Platform ordering service.
- IBP Peer (Advanced users): Deploys an instance of an IBM Blockchain Platform peer.
Step 4: Create the Console Instance from the IBPConsole Tile
From the IBPConsole tile, we create an instance using the following custom resource:
apiVersion: ibp.com/v1beta1
kind: IBPConsole
metadata:
  name: ibpconsole
  namespace: my-blockchain
  labels:
    app.kubernetes.io/name: "ibp"
    app.kubernetes.io/instance: "ibp"
    app.kubernetes.io/managed-by: "ibm-ibp"
spec:
  email: michael@kasten.io
  password: ultrasecurepassword
  imagePullSecrets:
    - regcred
  registryURL: cp.icr.io/cp
  license:
    accept: true
  networkinfo:
    domain: apps.michael-1.aws.kasten.io
  storage:
    console:
      class: ''
      size: 5Gi
  serviceAccountName: ibm-blockchain
  version: 2.5.1
See https://cloud.ibm.com/docs/blockchain-sw-251?topic=blockchain-sw-251-deploy-ocp-rhm#console-deploy-ocp-rhm-advanced for more advanced options.
Step 5: Verify the Console Installation and Log In
oc get deployment -n my-blockchain
NAME READY UP-TO-DATE AVAILABLE AGE
ibp-operator 1/1 1 1 34m
ibpconsole 1/1 1 1 16m
All deployments are in the ready state. Next, we check the pods and the routes:
oc get po
NAME READY STATUS RESTARTS AGE
ibp-operator-7f896d6644-sh2mx 1/1 Running 0 36m
ibpconsole-6f94ddc6f9-t9khx 4/4 Running 0 18m
oc get route
NAME                 HOST/PORT                                                       PATH   SERVICES     PORT      TERMINATION   WILDCARD
ibpconsole-console   my-blockchain-ibpconsole-console.apps.michael-1.aws.kasten.io          ibpconsole   optools   passthrough   None
ibpconsole-proxy     my-blockchain-ibpconsole-proxy.apps.michael-1.aws.kasten.io            ibpconsole   optools   passthrough   None
We connect to https://my-blockchain-ibpconsole-console.apps.michael-1.aws.kasten.io and log in with the email and password defined in the IBPConsole resource (michael@kasten.io and ultrasecurepassword). We then change the password as requested by the UI and reconnect:
Step 6: Create a Blockchain Network
We will create a minimal network for testing purposes following this tutorial.
At this point, we can run backup and recovery on the first two blocks:
But it’s more interesting to have transactions to make sure we are able to retrieve them upon restore.
We can use this tutorial to create a basic Node.js smart contract.
Here’s a summary of the steps:
- First we install the VS Code extension for the IBM Blockchain Platform.
- Next, we follow the basic tutorial in VS Code to create a smart contract in Node.js called js-contract.
- Then we export the contract in .cds format, which is compatible with the 1.4-compatible channel created in the previous steps.
- Finally, we install and instantiate the js-contract on channel 1.
All these operations create some transactions on the ledger that we can use now to test the restoration process:
It could also be interesting to connect from VS Code to the platform and create some transactions, but this is beyond the scope of this blog post.
What’s Available for Backup and Restore?
At this point, we have these pods:
oc get po
NAME READY STATUS RESTARTS AGE
chaincode-execution-8d91fc09-df12-4c91-8b1b-6239b7947e23 1/1 Running 0 64m
ibp-operator-7f896d6644-sh2mx 1/1 Running 0 22h
ibpconsole-6f94ddc6f9-t9khx 4/4 Running 0 22h
orderingserviceca-b64685fc9-2q6dl 1/1 Running 0 15h
orderingservicenode1-f7bb5f9f-2w29b 2/2 Running 0 14h
org1ca-7fd68c8cc-ggvpq 1/1 Running 0 16h
peerorg1-58b48f49f9-mwp2j 4/4 Running 0 15h
We also have these deployments:
oc get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
ibp-operator 1/1 1 1 22h
ibpconsole 1/1 1 1 22h
orderingserviceca 1/1 1 1 15h
orderingservicenode1 1/1 1 1 14h
org1ca 1/1 1 1 16h
peerorg1 1/1 1 1 15h
Note that the chaincode-execution pod is not linked to any deployment.
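A quick way to confirm this is to inspect the pod’s ownerReferences; the output is empty when no controller (such as a ReplicaSet) owns the pod. A small sketch, using the pod name from the listing above:
# Show the owner references of the chaincode execution pod (empty means no controller owns it)
oc get pod chaincode-execution-8d91fc09-df12-4c91-8b1b-6239b7947e23 -o jsonpath='{.metadata.ownerReferences}'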
Here are the PVCs…
oc get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
ibpconsole-pvc Bound pvc-07638d8f-1e9d-4fb8-a4d7-5dc3a7a05840 5Gi RWO gp2-csi 22h
orderingserviceca-pvc Bound pvc-b71c1051-d538-4eff-9e0f-937497125dec 20Gi RWO gp2-csi 15h
orderingservicenode1-pvc Bound pvc-3748a0cd-995e-4701-bbb9-484634345b5f 100Gi RWO gp2-csi 14h
org1ca-pvc Bound pvc-d46a10b3-f9d7-4bbb-8a71-d4574aa05baa 20Gi RWO gp2-csi 16h
peerorg1-pvc Bound pvc-e5575337-064f-4e0c-8e8f-1745655b8529 100Gi RWO gp2-csi 15h
peerorg1-statedb-pvc Bound pvc-e18e0783-8608-4668-9640-7f411726c6bd 10Gi RWO gp2-csi 15h
…and the corresponding IBP custom resources managed by the operator:
oc get ibpca,ibpconsole,ibporderer,ibppeer
NAME READY
ibpca.ibp.com/orderingserviceca
ibpca.ibp.com/org1ca
NAME READY
ibpconsole.ibp.com/ibpconsole
NAME READY
ibporderer.ibp.com/orderingservice
ibporderer.ibp.com/orderingservicenode1
NAME READY
ibppeer.ibp.com/peerorg1
Here are the Operator Lifecycle Manager elements provided to install the global blockchain application:
oc get operatorgroup,subscription,csv
NAME AGE
operatorgroup.operators.coreos.com/my-blockchain-jh4jr 23h
NAME PACKAGE SOURCE CHANNEL
subscription.operators.coreos.com/ibm-blockchain ibm-blockchain ibm-operator-catalog v2.5
NAME DISPLAY VERSION REPLACES PHASE
clusterserviceversion.operators.coreos.com/ibm-blockchain.v2.5.1 IBM Blockchain 2.5.1 Succeeded
The Operator Lifecycle Manager created the clusterrole and clusterrolebinding during install, to let the ibp-operator manage pods, deployments and IBP custom resources. These need to be captured at the cluster resource scope:
oc get clusterrole | grep ibp
ibpcas.ibp.com-v1beta1-admin 2021-03-29T14:30:16Z
ibpcas.ibp.com-v1beta1-crdview 2021-03-29T14:30:17Z
ibpcas.ibp.com-v1beta1-edit 2021-03-29T14:30:16Z
ibpcas.ibp.com-v1beta1-view 2021-03-29T14:30:16Z
ibpconsoles.ibp.com-v1beta1-admin 2021-03-29T14:30:17Z
ibpconsoles.ibp.com-v1beta1-crdview 2021-03-29T14:30:17Z
ibpconsoles.ibp.com-v1beta1-edit 2021-03-29T14:30:17Z
ibpconsoles.ibp.com-v1beta1-view 2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-admin 2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-crdview 2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-edit 2021-03-29T14:30:17Z
ibporderers.ibp.com-v1beta1-view 2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-admin 2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-crdview 2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-edit 2021-03-29T14:30:17Z
ibppeers.ibp.com-v1beta1-view 2021-03-29T14:30:17Z
oc get clusterrole | grep ibm
ibm-blockchain.v2.5.1-6bdc5f6d8
oc get clusterrolebinding | grep ibm
ibm-blockchain.v2.5.1-6bdc5f6d8
Restoration Constraints
- Topology constraint
To achieve high resiliency, the blockchain operator spreads the pods across zones according to a precise topology. When needed, the operator adds a zone label to a PVC that records the zone where the volume was actually provisioned:
oc get pvc -L zone
NAME STATUS VOLUME ZONE
ibpconsole-pvc Bound pvc-a08bd96f-bc41-432f-ac96-26c4850b81ce
orderingserviceca-pvc Bound pvc-f3ca0d28-012a-4a5a-a793-474045bac3ce eu-west-1a
orderingservicenode1-pvc Bound pvc-161921ca-3fcf-490a-b49b-5ab88ee6bc41
org1ca-pvc Bound pvc-308f93b1-eb45-439a-8808-1e3d79550b46 eu-west-1a
peerorg1-pvc Bound pvc-d0318054-af5f-44a4-b49c-7f1cd5d4e56e eu-west-1b
peerorg1-statedb-pvc Bound pvc-2ac27a9c-a571-488a-a28e-d38f90054a42 eu-west-1b
To reinforce this, an affinity rule is also set up on the deployment itself:
oc get deploy peerorg1 -o jsonpath='{.spec.template.spec.affinity}' | jq
{
  "nodeAffinity": {
    "requiredDuringSchedulingIgnoredDuringExecution": {
      "nodeSelectorTerms": [
        {
          "matchExpressions": [
            {
              "key": "topology.kubernetes.io/zone",
              "operator": "In",
              "values": [
                "eu-west-1b"
              ]
            },
            {
              "key": "topology.kubernetes.io/region",
              "operator": "In",
              "values": [
                "eu-west-1"
              ]
            }
          ]
        },
        {
          "matchExpressions": [
            {
              "key": "failure-domain.beta.kubernetes.io/zone",
              "operator": "In",
              "values": [
                "eu-west-1b"
              ]
            },
            {
              "key": "failure-domain.beta.kubernetes.io/region",
              "operator": "In",
              "values": [
                "eu-west-1"
              ]
            }
          ]
        }
      ]
    }
  },
  "podAntiAffinity": {
    "preferredDuringSchedulingIgnoredDuringExecution": [
      {
        "podAffinityTerm": {
          "labelSelector": {
            "matchExpressions": [
              {
                "key": "orgname",
                "operator": "In",
                "values": [
                  "org1msp"
                ]
              }
            ]
          },
          "topologyKey": "kubernetes.io/hostname"
        },
        "weight": 100
      }
    ]
  }
}
That means that when restoring the PVC, we must make sure we restore it to the right zone.
- Data Consistency Constraint
To understand the data consistency constraint, we must first understand the transaction flow:
Every peer pod holds two PVCs:
- The ledger PVC (100Gi): the materialized blockchain, made of validated blocks and the transactions they contain.
- The CouchDB PVC (10Gi): the state database of the ledger, which also holds transactions that are not yet validated, as well as transactions that are validated but not yet committed to the ledger.
It’s important to make sure that the CouchDB PVC does not hold validated transactions that the ledger does not have. IBM recommends taking a snapshot of the CouchDB PVC at 3:00 a.m. and a snapshot of the ledger PVC at 3:05 a.m.
Every orderer pod also has a ledger PVC. Contrary to a public blockchain, which uses proof of work to validate the chain, a private blockchain uses orderers to order the validated blocks.
We must be sure that peers do not have validated transactions that the orderer does not have. This time, we are not talking about consistency inside a single pod, but consistency across the whole network. IBM recommends taking a snapshot of the orderer PVC at 5:00 a.m. (two hours after the snapshot of the peers).
There are also the CA pods and the IBP console pod, each with a single PVC. These should be backed up each time the network topology changes. To keep things simple, let’s back them up at 5:00 a.m. along with the orderer.
Here’s a summary of the scheduled backups:
| PVC | Snapshot time |
| --- | --- |
| Peer CouchDB PVC | 3:00 a.m. |
| Peer ledger PVC | 3:05 a.m. |
| Orderer ledger PVC | 5:00 a.m. |
| CA and IBPConsole PVCs | 5:00 a.m. |
Implementing the Backup Strategy
We can implement the backup strategy easily by creating:
- One daily policy that backs up the entire namespace at 3:00 a.m. and 5:00 a.m.
- One daily policy that backs up the entire namespace at 3:05 a.m.
- One weekly cluster resource policy (to capture clusterrole and clusterrolebinding).
With this schedule, we have everything we need to restore the application consistently.
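As an illustration, here is roughly what the 3:05 a.m. daily policy could look like as a Kasten K10 Policy custom resource. This is a minimal sketch: it assumes K10 is installed in the default kasten-io namespace, the policy name and the 7-day retention are our own choices, and the field names should be verified against the K10 documentation for your version.
apiVersion: config.kio.kasten.io/v1alpha1
kind: Policy
metadata:
  name: my-blockchain-daily-0305   # hypothetical policy name
  namespace: kasten-io             # assumes the default K10 install namespace
spec:
  comment: Daily backup of the my-blockchain namespace at 3:05 a.m.
  frequency: '@daily'
  subFrequency:
    hours: [3]
    minutes: [5]
  retention:
    daily: 7                       # example retention, adjust as needed
  actions:
    - action: backup
  selector:
    matchExpressions:
      - key: k10.kasten.io/appNamespace
        operator: In
        values:
          - my-blockchain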
Restoring the Application After a Disaster
Let’s remove the my-blockchain application to emulate the disaster:
oc delete ns my-blockchain
Step 1: Recreate the PVC from the Different Restore Points
First, we’ll recreate the empty namespace:
oc create ns my-blockchain
We’ll use the partial restore capability of Kasten K10 to pick the different PVCs from the various restore points:
| Restore point | PVC |
| --- | --- |
| 3:00 a.m. | peerorg1-pvc |
| 3:05 a.m. | peerorg1-statedb-pvc |
| 5:00 a.m. | orderingservicenode1-pvc |
| 5:00 a.m. | The console PVC and the rest of the CA PVCs: ibpconsole-pvc, orderingserviceca-pvc, org1ca-pvc |
This image shows how to restore only the peerorg1-pvc from the 3:00 a.m. restore point:
Step 2: Re-Create the PVCs in the Right Zone
Before launching the restore, we need to apply a transform to make sure each PVC is recreated in the right zone. When the blockchain operator pins a PVC to a specific zone, it adds a “zone” label to the PVC and an affinity constraint to the deployment that uses it:
oc get pvc peerorg1-pvc -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels:
    ...
    zone: eu-west-1b
  name: peerorg1-pvc
...
oc get deploy peerorg1 -o yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
  name: peerorg1
  namespace: my-blockchain
  ...
spec:
  ...
  template:
    ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values:
                - eu-west-1b
              - key: topology.kubernetes.io/region
                operator: In
                values:
                - eu-west-1
            - matchExpressions:
              - key: failure-domain.beta.kubernetes.io/zone
                operator: In
                values:
                - eu-west-1b
              - key: failure-domain.beta.kubernetes.io/region
                operator: In
                values:
                - eu-west-1
...
We will use this label to recreate each PVC in the right zone. To do that, we first create a storage class per zone:
oc get sc eu-west-1a -o yaml
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: topology.ebs.csi.aws.com/zone
    values:
    - eu-west-1a
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: eu-west-1a
parameters:
  encrypted: "true"
  type: gp2
provisioner: ebs.csi.aws.com
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
...
We’ll use a transform to copy the zone label into the storageClassName field of the PVC spec:
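Expressed in YAML, this is essentially a JSON-patch “copy” operation from the zone label to spec.storageClassName on each restored PVC. A minimal sketch, assuming the transform is attached to the restore; the transform name is ours, and the exact wrapper structure and resource names should be verified against the K10 transform documentation:
transforms:
  - name: zone-to-storageclass      # hypothetical transform name
    subject:
      resource: persistentvolumeclaims
    json:
      # Copy the value of the zone label into the PVC's storageClassName
      - op: copy
        from: /metadata/labels/zone
        path: /spec/storageClassName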
Then, we’ll apply this transform for each of the three restore points.
Step 3: Restore the Rest of the namespace (with Some Exclusions)
Now we have the namespace with only the PVCs. We must bring back the rest of the namespace, but without:
- The PVCs that we already got back
- The chaincode pods (they will be recreated when the smart contract is invoked)
- The configmap ibp-operator-lock
- The elements of the Operator Lifecycle Manager:
- ClusterServiceVersion
- Subscription
- OperatorGroup
- InstallPlan
We also need to scale down all the deployments. This is because Kasten K10 will restore the CRD after all deployments are up and running. However, if those deployments rely on this CRD to work, it’s best to scale down everything so that Kasten K10 considers everything successful and moves to CRD restoration. We’ll scale up afterwards.
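One way to do this is to add another transform to the restore that forces every restored deployment to zero replicas; we then scale them back up in the final step. A sketch along the same lines as the PVC transform above, with names of our own choosing:
transforms:
  - name: scale-deployments-to-zero   # hypothetical transform name
    subject:
      resource: deployments
    json:
      # Restore every deployment with zero replicas so the restore does not wait on pods
      - op: replace
        path: /spec/replicas
        value: 0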
Step 4: Restore the clusterrole and clusterrolebinding from the Cluster Restore Point
When we deleted the namespace, the Operator Lifecycle Manager automatically deleted the clusterrole and clusterrolebinding created for the IBM Blockchain operator:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: "2021-03-31T23:09:32Z"
  labels:
    olm.owner: ibm-blockchain.v2.5.1
    olm.owner.kind: ClusterServiceVersion
    olm.owner.namespace: my-blockchain
    operators.coreos.com/ibm-blockchain.my-blockchain: ""
  name: ibm-blockchain.v2.5.1-6bdc5f6d8
rules:
- apiGroups:
  - apiextensions.k8s.io
  resources:
  - persistentvolumeclaims
  - persistentvolumes
  verbs:
  - '*'
...
The olm.owner label causes OLM to remove this object when the ClusterServiceVersion is removed, and we have decided to restore without the OLM objects. To make our restore stick, we need to remove the labels section.
We retrieve the clusterrole and clusterrolebinding from the cluster restore point and apply a transform to remove the labels section:
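The transform itself can again be a JSON-patch operation, this time removing the whole labels section from the restored objects. A sketch under the same assumptions as the earlier transforms (names are ours; verify the structure against the K10 documentation):
transforms:
  - name: drop-olm-owner-labels            # hypothetical transform name
    subject:
      resource: clusterroles
    json:
      # Remove the metadata.labels section so OLM does not garbage-collect the object
      - op: remove
        path: /metadata/labels
  - name: drop-olm-owner-labels-bindings   # hypothetical transform name
    subject:
      resource: clusterrolebindings
    json:
      - op: remove
        path: /metadata/labels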
If we were restoring into a new cluster, we would also have to restore the other my-blockchain clusterroles and clusterrolebindings that OLM created when it deployed the operator:
Step 5: Restart the Deployments
We can now restart the deployments and check that everything is recovered and that all the pods are back:
for dep in $(oc get deploy -o name); do oc scale $dep --replicas=1; done
oc get pods
NAME READY STATUS RESTARTS AGE
ibp-operator-7dd4bfb76f-s89rx 1/1 Running 0 2d8h
ibpconsole-cbf76d57d-2xwx9 4/4 Running 0 2d8h
orderingserviceca-78fb95747c-dvk4k 1/1 Running 0 2d8h
orderingservicenode1-646dd846bc-94fwm 2/2 Running 0 2d8h
org1ca-75845db8bf-ts6tf 1/1 Running 0 2d8h
peerorg1-7758946784-kqcrp 4/4 Running 0 2d8h
The best way to check that your data is recovered is to check the block transactions:
We can verify that we got back all the blocks and transactions. We were also able to restart the whole blockchain application while respecting the data consistency and topology constraints.
Conclusion
In this article, we’ve shown how Kasten K10 makes a complex backup and restore process much easier. Kasten K10 has all the features needed to make your backup and restore possible, as long as you understand how your application works.
Try Kasten K10 for free today.