Restore snapshots
There are multiple ways to back up and restore a tenant cluster. vCluster provides a built-in method to create and restore snapshots using its CLI.
External databases such as MySQL or PostgreSQL running outside the vCluster namespace require a separate restore procedure. Refer to the relevant database documentation.
A vCluster snapshot includes:
- Backing store data (for example, etcd or SQLite)
- vCluster Helm release information
- vCluster configuration (for example, vcluster.yaml)
For tenant clusters with private nodes, you may need to follow additional steps.
Restore existing tenant cluster from a snapshot
Restoring from a snapshot pauses the vCluster, scales down all workload pods to 0, and launches a temporary restore pod. Once the restore completes, vCluster resumes and scales all workload pods back up. This process results in temporary downtime while the restore is in progress.
If a vcluster restore command fails, the process stops. Retry the restore so the tenant cluster isn't left in an inconsistent or broken state.
Restore a vCluster using the following commands. Each storage backend accepts the snapshot URL with different credential options.
Restore tenant clusters from OCI
Local credentials:
vcluster restore my-vcluster "oci://ghcr.io/my-user/my-repo:my-tag"
URL credentials:
export OCI_USERNAME=my-username
export OCI_PASSWORD=$(echo -n "my-password" | base64)
vcluster restore my-vcluster "oci://ghcr.io/my-user/my-repo:my-tag?username=$OCI_USERNAME&password=$OCI_PASSWORD&skip-client-credentials=true"
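The URL with inline credentials can be assembled in a small script. A minimal sketch, reusing the placeholder registry path and credentials from above (note the password is base64-encoded before being placed in the URL):

```shell
# Sketch: build the OCI snapshot URL with inline credentials.
# my-username, my-password, and the ghcr.io path are placeholders.
OCI_USERNAME=my-username
# The password must be base64-encoded before it is embedded in the URL.
OCI_PASSWORD=$(printf '%s' "my-password" | base64)
SNAPSHOT_URL="oci://ghcr.io/my-user/my-repo:my-tag?username=${OCI_USERNAME}&password=${OCI_PASSWORD}&skip-client-credentials=true"
echo "$SNAPSHOT_URL"
# vcluster restore my-vcluster "$SNAPSHOT_URL"
```

Quoting the URL matters: the `&` characters would otherwise be interpreted by the shell.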
Restore tenant clusters from S3
vcluster restore my-vcluster "s3://my-s3-bucket/my-snapshot-key"
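The S3 URL can likewise be composed from its parts. A sketch with placeholder bucket and key names, assuming credentials are picked up from the standard AWS sources (environment variables or the AWS config file):

```shell
# Sketch: compose the S3 snapshot URL from bucket and key (placeholders).
BUCKET=my-s3-bucket
KEY=my-snapshot-key
SNAPSHOT_URL="s3://${BUCKET}/${KEY}"
echo "$SNAPSHOT_URL"
# vcluster restore my-vcluster "$SNAPSHOT_URL"
```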
Restore tenant clusters from Azure Blob
Azure CLI:
vcluster restore my-vcluster "https://myaccount.blob.core.windows.net/my-container/my-snapshot.tar.gz" \
--azure-subscription-id my-subscription-id \
--azure-resource-group my-resource-group
SAS token:
vcluster restore my-vcluster "https://myaccount.blob.core.windows.net/my-container/my-snapshot.tar.gz?sv=2022-11-02&ss=b&srt=co&sp=rwlacuptfx&se=2027-01-01T00:00:00Z&sig=YOUR_SAS_SIGNATURE"
Storage account key:
export AZURE_STORAGE_KEY=my-storage-account-key
vcluster restore my-vcluster "https://myaccount.blob.core.windows.net/my-container/my-snapshot.tar.gz"
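For the SAS token variant, the token is just a query string appended to the blob URL. A sketch with the placeholder account, container, and signature values used above:

```shell
# Sketch: append a SAS token (placeholder values) to the blob URL.
ACCOUNT=myaccount
CONTAINER=my-container
BLOB=my-snapshot.tar.gz
SAS_TOKEN="sv=2022-11-02&ss=b&srt=co&sp=rwlacuptfx&se=2027-01-01T00:00:00Z&sig=YOUR_SAS_SIGNATURE"
SNAPSHOT_URL="https://${ACCOUNT}.blob.core.windows.net/${CONTAINER}/${BLOB}?${SAS_TOKEN}"
echo "$SNAPSHOT_URL"
# vcluster restore my-vcluster "$SNAPSHOT_URL"
```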
Restore tenant clusters from container
vcluster restore my-vcluster "container:///data/my-snapshot.tar.gz"
When restoring your tenant cluster, you can also restore CSI-provisioned PVCs by specifying the --restore-volumes flag:
# Restore from a local PVC snapshot (if using embedded storage).
vcluster restore my-vcluster "container:///data/my-snapshot.tar.gz" --restore-volumes
See Volume snapshots for more details.
Restore a standalone vCluster
For a standalone vCluster running as a systemd service, use the --standalone flag instead of a cluster name. This flag can't be combined with --driver docker. The CLI automatically stops the vCluster service, restores the backing store data from the snapshot, and restarts the service.
vcluster restore --standalone "container:///var/lib/vcluster/my-snapshot.tar.gz"
You can also restore from any external storage backend:
vcluster restore --standalone "oci://ghcr.io/my-user/my-repo:my-tag"
Restoring to a different machine leaves stale node details in etcd. To make the snapshot available on another node, use an external storage backend (OCI, S3, or Azure Blob Storage), and plan to remove the stale node resources manually after the restore.
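The stale node cleanup can be scripted once you know which node objects belong to the original machine. A sketch with placeholder node names; the echo prints the command instead of running it:

```shell
# Sketch: remove stale node objects after restoring on a different machine.
# Node names are placeholders; find the real ones with `kubectl get nodes`.
STALE_NODES="old-node-1 old-node-2"
for node in $STALE_NODES; do
  echo "kubectl delete node $node"   # drop the echo to actually delete
done
```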
Restore a standalone vCluster in HA mode
Restoring a standalone vCluster configured with multiple etcd peers (HA mode) requires manual intervention. Shut down all nodes except the restore node, then run the restore on that node.
The following steps assume a three-node setup where you run the restore on node A:
- Stop the vCluster service on all nodes except the restore node:
# Run on nodes B and C
systemctl stop vcluster
- Run the restore on node A:
vcluster restore --standalone "snapshot-url"
- Delete the stale node objects for the stopped nodes from the restored cluster:
kubectl delete node <node-B> <node-C>
- On nodes B and C, move the old vCluster state aside so they rejoin cleanly:
# Run on nodes B and C
mv /var/lib/vcluster /var/lib/vcluster-bcp
- On node A, create a join token:
vcluster token create --control-plane
- Use the join script output to rejoin nodes B and C one at a time. Wait for each node to reach Ready before joining the next so that etcd quorum is rebuilt sequentially.
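The wait between joins can be scripted from node A. A sketch with placeholder node names (node-b, node-c); the echo prints the wait command you would run after joining each node:

```shell
# Sketch: rejoin peers one at a time, waiting for Ready between joins.
# Node names are placeholders for nodes B and C.
for node in node-b node-c; do
  # Run the join script on $node first, then from node A:
  echo "kubectl wait --for=condition=Ready node/$node --timeout=10m"
done
```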
Clone a tenant cluster to a new tenant cluster
You can use snapshots to clone an existing tenant cluster by creating a new tenant cluster from a snapshot. Creating a tenant cluster from a snapshot also restores all workloads from the snapshot.
If the restore fails while using vcluster create, vCluster automatically deletes the new tenant cluster.
# Create a new tenant cluster from an OCI snapshot (uses local credentials).
vcluster create my-vcluster --restore oci://ghcr.io/my-user/my-repo:my-tag
vCluster certificates change when you create a tenant cluster with a new name or namespace. This is expected, as tenant clusters shouldn't share certificates.
Migrate and override vCluster configuration options to create a new tenant cluster
Some configuration options can't be changed on an existing tenant cluster; for example, the backing store can't be changed. To change these options, migrate the tenant cluster by creating a new one from a snapshot and applying the updated configuration.
Creating a new tenant cluster from a snapshot also restores all workloads from the snapshot.
If the restore fails while using vcluster create, vCluster automatically deletes the new tenant cluster.
# Upgrade an existing vCluster by restoring from a snapshot and applying a new vcluster.yaml.
# Configuration options in the vcluster.yaml override the options from the snapshot.
vcluster create my-vcluster --upgrade -f vcluster.yaml --restore oci://ghcr.io/my-user/my-repo:my-tag
vCluster certificates change when you create a tenant cluster with a new name or namespace. This is expected, as tenant clusters shouldn't share certificates.
Supported migration options
vCluster supports migration paths based on your setup. The following migration options are available for Kubernetes distributions and backing stores.
Change the backing store
Change your data store to improve efficiency, scalability, and Kubernetes compatibility. You can migrate between the following data stores:
- Embedded database (SQLite) -> Embedded database (etcd)
- Embedded database (SQLite) -> External database
vCluster overrides all other configuration options, similar to upgrading a tenant cluster and applying changes.
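The backing store migration above can be sketched as a restore into a new tenant cluster with an updated vcluster.yaml. The keys below assume the current vcluster.yaml schema for embedded etcd, and the snapshot URL is a placeholder:

```shell
# Sketch: migrate from embedded SQLite to embedded etcd by restoring into
# a tenant cluster whose vcluster.yaml enables the new backing store.
cat > vcluster.yaml <<'EOF'
controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true
EOF
# vcluster create my-vcluster --upgrade -f vcluster.yaml --restore oci://ghcr.io/my-user/my-repo:my-tag
```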
Limitations​
Snapshot and restore operations for tenant clusters have the following limitations:
Sleeping tenant clusters
- Snapshots require a running vCluster control plane and don't work with sleeping tenant clusters.
Tenant clusters using the k0s distro
- Use the --pod-exec flag to take a snapshot of a k0s tenant cluster.
- k0s tenant clusters don't support restore or clone operations. Migrate them to k8s instead.
Tenant clusters using an external database
- Tenant clusters with an external database handle backup and restore outside of vCluster. A database administrator must back up or restore the external database according to the database documentation. Avoid using the vCluster CLI backup and restore commands for clusters with an external database.
vcluster.yaml configuration
- Although the vcluster.yaml configuration is backed up, it isn't automatically restored. After restoring a tenant cluster, you must manually reapply your vcluster.yaml configuration.
Standalone: restore to a new node
- If you restore a snapshot on a different node, the standalone instance's etcd contains node details from the original machine. You must manually clean up stale node resources after restore (see Use snapshots with private nodes).
Standalone: tenant cluster workloads aren't restored
- A standalone snapshot only restores the standalone instance's own etcd data. Each tenant cluster running on the standalone instance has its own etcd, so workloads deployed inside tenant clusters aren't restored by a standalone snapshot.
Use snapshots with private nodes
The snapshot also includes node resources. Restoring to a different set of nodes requires manual steps for tenant clusters using private nodes. With host nodes, vCluster automatically updates node information when nodes change between snapshot and restore.
Nodes removed between snapshot and restore
When nodes exist in the snapshot but don't exist in the current tenant cluster, manually delete those nodes from the restored tenant cluster.
The node shows up in kubectl get nodes but doesn't physically exist in the tenant cluster.
Run kubectl delete node [name] for each removed node.
export NODE_NAME=my-node
kubectl delete node $NODE_NAME
Nodes added between snapshot and restore
When nodes joined the cluster after the snapshot, they don't exist in the restored tenant cluster and aren't listed in kubectl get nodes.
Re-join each private node with the --force-join flag.
export VCLUSTER_NAME=my-vcluster
# Connect to your vcluster
vcluster connect $VCLUSTER_NAME
# Create a token
vcluster token create --expires=1h
Append the --force-join flag to the output command before running it on the worker node.
curl -sfLk "https://vcluster-endpoint/node/join?token=token" | sh -s -- --force-join