Version: v4.10 Stable

Restore the Kine Database from a Snapshot

Info

This feature is available from Platform version v4.8.0.

This runbook covers restoring the shared Kine database from an RDS snapshot to a new instance and updating both platform regions to use it. Use this procedure for disaster recovery, database migration (for example, enabling IAM authentication), or point-in-time recovery.

Important

Both platform regions must be scaled down before switching the data source to prevent split-brain writes to the old and new databases.

Configure your values

This runbook references AWS resource IDs, cluster context ARNs, and file names specific to your deployment. The example commands use placeholder values; replace them with your own values for the following variables wherever they appear:

ACCOUNT_ID

AWS_REGION

DB_SG_ID

FIRST_REGION_CONTEXT

SECOND_REGION_CONTEXT

SNAPSHOT_NAME

NEW_DB_INSTANCE_ID

FIRST_REGION_VALUES_FILE

SECOND_REGION_VALUES_FILE

CHART_VERSION

REPLICA_COUNT
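One way to keep these values consistent across the steps is to define them once as shell variables. The following is a minimal sketch, not part of the platform tooling: the values are the placeholder examples used on this page, the second region (eu-west-1) is taken from the example context ARNs, and SECOND_AWS_REGION is a hypothetical helper variable that is not one of the page variables.

```shell
# Placeholder values copied from the examples on this page; replace with your own.
ACCOUNT_ID="123456789012"
AWS_REGION="us-east-1"            # region of the RDS instance and the first cluster
SECOND_AWS_REGION="eu-west-1"     # hypothetical helper, not a page variable
DB_SG_ID="sg-xxxxxxxxx"
SNAPSHOT_NAME="kine-backup-YYYY-MM-DD"
NEW_DB_INSTANCE_ID="mariadb-multi-region-restored"
CHART_VERSION="4.8.0"
REPLICA_COUNT=3

# Cluster context ARNs follow the pattern used in the example commands.
FIRST_REGION_CONTEXT="arn:aws:eks:${AWS_REGION}:${ACCOUNT_ID}:cluster/platform-multi-region-${AWS_REGION}"
SECOND_REGION_CONTEXT="arn:aws:eks:${SECOND_AWS_REGION}:${ACCOUNT_ID}:cluster/platform-multi-region-${SECOND_AWS_REGION}"

# Values file names follow the pattern used in the upgrade commands.
FIRST_REGION_VALUES_FILE="platform-${AWS_REGION}-values.yaml"
SECOND_REGION_VALUES_FILE="platform-${SECOND_AWS_REGION}-values.yaml"
```

With these set, a command such as the Step 3 scale-down can reference "$FIRST_REGION_CONTEXT" instead of the literal ARN.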

Step 1 - Create a snapshot of the current database

Skip this step if you already have a snapshot to restore from.

In the command below, replace mariadb-multi-region with the identifier of your current DB instance (CURRENT_DB_INSTANCE_ID) and kine-backup-YYYY-MM-DD with your snapshot name:

aws rds create-db-snapshot \
--db-instance-identifier mariadb-multi-region \
--db-snapshot-identifier kine-backup-YYYY-MM-DD \
--region us-east-1

Wait for the snapshot to become available:

aws rds wait db-snapshot-available \
--db-snapshot-identifier kine-backup-YYYY-MM-DD \
--region us-east-1
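The two commands above can be combined so that the YYYY-MM-DD part of the snapshot name is filled in from the current date. This is a sketch under assumptions: the function names snapshot_name and create_kine_snapshot are hypothetical, and the kine-backup- prefix is simply the one used in the examples.

```shell
# Build a dated snapshot identifier in the form kine-backup-YYYY-MM-DD.
snapshot_name() {
  printf 'kine-backup-%s\n' "$(date +%Y-%m-%d)"
}

# Create the snapshot, block until it is available, then print its name.
# $1 = current DB instance identifier, $2 = AWS region.
create_kine_snapshot() {
  name="$(snapshot_name)"
  # Send the CLI's JSON response to stderr so stdout carries only the name.
  aws rds create-db-snapshot \
    --db-instance-identifier "$1" \
    --db-snapshot-identifier "$name" \
    --region "$2" >&2 &&
  aws rds wait db-snapshot-available \
    --db-snapshot-identifier "$name" \
    --region "$2" &&
  printf '%s\n' "$name"
}
```

For example, SNAPSHOT_NAME="$(create_kine_snapshot mariadb-multi-region us-east-1)" captures the generated name for use in Step 2.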

Step 2 - Restore the snapshot to a new RDS instance

Restore the snapshot to a new instance in the database VPC. Use the same DB subnet group and security group as the original setup. Include --enable-iam-database-authentication only if the new instance should use IAM authentication; otherwise omit it.

aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier mariadb-multi-region-restored \
--db-snapshot-identifier kine-backup-YYYY-MM-DD \
--db-instance-class db.t3.medium \
--db-subnet-group-name multi-region-db-subnet \
--vpc-security-group-ids sg-xxxxxxxxx \
--no-publicly-accessible \
--enable-iam-database-authentication \
--region us-east-1

Wait for the new instance to become available:

aws rds wait db-instance-available \
--db-instance-identifier mariadb-multi-region-restored \
--region us-east-1
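Because IAM authentication is optional, the restore command can be wrapped so the flag is chosen from a parameter. The sketch below assumes a hypothetical function name, restore_kine_snapshot, and reuses the instance class and subnet group from the example above; adjust both to match your original instance.

```shell
# Restore a snapshot to a new instance, enabling IAM auth only when requested.
# $1 = snapshot name, $2 = new instance ID, $3 = security group ID,
# $4 = AWS region, $5 = "true" to enable IAM database authentication.
restore_kine_snapshot() {
  if [ "$5" = "true" ]; then
    iam_flag="--enable-iam-database-authentication"
  else
    iam_flag="--no-enable-iam-database-authentication"
  fi
  aws rds restore-db-instance-from-db-snapshot \
    --db-instance-identifier "$2" \
    --db-snapshot-identifier "$1" \
    --db-instance-class db.t3.medium \
    --db-subnet-group-name multi-region-db-subnet \
    --vpc-security-group-ids "$3" \
    --no-publicly-accessible \
    "$iam_flag" \
    --region "$4"
}
```

For example, restore_kine_snapshot kine-backup-YYYY-MM-DD mariadb-multi-region-restored sg-xxxxxxxxx us-east-1 true reproduces the command shown in this step.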

Note the new endpoint:

aws rds describe-db-instances \
--db-instance-identifier mariadb-multi-region-restored \
--query 'DBInstances[0].Endpoint.Address' \
--output text \
--region us-east-1

If using IAM authentication, note the DbiResourceId of the new instance and update the RDSIAMAuthKine IAM policy to include it. Without this, platform pods fail with Access denied for user 'kine' errors because the IAM rds-db:connect permission is scoped to a specific RDS instance resource ID.

aws rds describe-db-instances \
--db-instance-identifier mariadb-multi-region-restored \
--query 'DBInstances[0].DbiResourceId' \
--output text \
--region us-east-1

Add the new resource ID to the policy's Resource array. In the command below, replace db-OLDXXXXXXXXXXXXXXXXXXXXXXXXXX and db-NEWXXXXXXXXXXXXXXXXXXXXXXXXXX with the resource IDs of the old and new instances (OLD_DBI_RESOURCE_ID and NEW_DBI_RESOURCE_ID):

aws iam create-policy-version \
--policy-arn arn:aws:iam::123456789012:policy/RDSIAMAuthKine \
--set-as-default \
--policy-document '{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "rds-db:connect",
      "Resource": [
        "arn:aws:rds-db:us-east-1:123456789012:dbuser:db-OLDXXXXXXXXXXXXXXXXXXXXXXXXXX/kine",
        "arn:aws:rds-db:us-east-1:123456789012:dbuser:db-NEWXXXXXXXXXXXXXXXXXXXXXXXXXX/kine"
      ]
    }
  ]
}'
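To avoid hand-editing the ARNs, the policy document can be generated from the two resource IDs. This is an illustrative sketch; kine_policy_document is a hypothetical helper name, and the document it prints has the same shape as the example above.

```shell
# Print an IAM policy document that allows rds-db:connect for the kine user
# on both the old and the new instance.
# $1 = AWS region, $2 = account ID, $3 = old DbiResourceId, $4 = new DbiResourceId.
kine_policy_document() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "rds-db:connect",
      "Resource": [
        "arn:aws:rds-db:$1:$2:dbuser:$3/kine",
        "arn:aws:rds-db:$1:$2:dbuser:$4/kine"
      ]
    }
  ]
}
EOF
}
```

The output can then be passed to the command above with --policy-document "$(kine_policy_document us-east-1 123456789012 "$OLD_DBI_RESOURCE_ID" "$NEW_DBI_RESOURCE_ID")".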

Step 3 - Scale down both regions

Scale both platform deployments to zero to stop all writes to the old database.

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-multi-region-us-east-1 \
scale deployment -n vcluster-platform loft --replicas=0

kubectl --context arn:aws:eks:eu-west-1:123456789012:cluster/platform-multi-region-eu-west-1 \
scale deployment -n vcluster-platform loft --replicas=0

Wait for all pods to stop. The --watch flag keeps each command running, so press Ctrl+C once no pods remain:

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-multi-region-us-east-1 \
get pods -n vcluster-platform -l app=loft --watch

kubectl --context arn:aws:eks:eu-west-1:123456789012:cluster/platform-multi-region-eu-west-1 \
get pods -n vcluster-platform -l app=loft --watch
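As a non-interactive alternative to --watch, kubectl wait can block until the pods are deleted. The sketch below assumes a hypothetical function name and an arbitrary 120-second timeout. Depending on the kubectl version, wait may report an error when the pods are already gone, so the function does not treat that as fatal and lists whatever pods remain afterwards.

```shell
# For each kubectl context given as an argument, wait for the platform pods
# to be deleted, then list any pods that remain.
wait_for_platform_pods_gone() {
  for ctx in "$@"; do
    # The 120s timeout is an assumed value; tune it for your environment.
    kubectl --context "$ctx" wait pod -n vcluster-platform -l app=loft \
      --for=delete --timeout=120s
    # An empty list here means the region is fully stopped.
    kubectl --context "$ctx" get pods -n vcluster-platform -l app=loft
  done
}
```

Call it with both region contexts, for example wait_for_platform_pods_gone "$FIRST_REGION_CONTEXT" "$SECOND_REGION_CONTEXT", and confirm that neither region lists any pods before continuing.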

Step 4 - Update the values files

Update the dataSource in both region values files to point to the new RDS endpoint (NEW_DATABASE_URL). Replace the example host below with the endpoint address noted in Step 2:

config:
  database:
    dataSource: "mysql://kine@tcp(mariadb-multi-region-restored.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com:3306)/kine"
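The dataSource string can be derived from the endpoint address returned in Step 2, and each values file can be checked for it before upgrading. This is a sketch with hypothetical helper names; it assumes the kine user, port 3306, and the kine database name shown in the example above.

```shell
# Format a Kine dataSource string for a given RDS endpoint address.
# $1 = endpoint address as returned by describe-db-instances.
kine_data_source() {
  printf 'mysql://kine@tcp(%s:3306)/kine\n' "$1"
}

# Succeed only if the values file contains the given endpoint address.
# $1 = values file path, $2 = endpoint address.
values_file_uses_endpoint() {
  grep -qF -- "$2" "$1"
}
```

For example, run values_file_uses_endpoint platform-us-east-1-values.yaml "$ENDPOINT" for each region's file and stop if either check fails, since a region left pointing at the old database would reintroduce the split-brain risk.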

Step 5 - Upgrade both regions

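Apply the updated values files to both regions using either the vCluster CLI (the first pair of commands below) or Helm (the second pair). In the Helm commands, replace 4.8.0 with your chart version (CHART_VERSION).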

vcluster platform start \
--namespace vcluster-platform \
--kube-context arn:aws:eks:us-east-1:123456789012:cluster/platform-multi-region-us-east-1 \
--values platform-us-east-1-values.yaml \
--upgrade \
--no-tunnel

vcluster platform start \
--namespace vcluster-platform \
--kube-context arn:aws:eks:eu-west-1:123456789012:cluster/platform-multi-region-eu-west-1 \
--values platform-eu-west-1-values.yaml \
--upgrade \
--no-tunnel

helm upgrade loft vcluster-platform --install --create-namespace --repository-config='' \
--namespace vcluster-platform \
--repo "https://charts.loft.sh/" \
--version 4.8.0 \
--kube-context arn:aws:eks:us-east-1:123456789012:cluster/platform-multi-region-us-east-1 \
-f platform-us-east-1-values.yaml \
--server-side=true --force-conflicts

helm upgrade loft vcluster-platform --install --create-namespace --repository-config='' \
--namespace vcluster-platform \
--repo "https://charts.loft.sh/" \
--version 4.8.0 \
--kube-context arn:aws:eks:eu-west-1:123456789012:cluster/platform-multi-region-eu-west-1 \
-f platform-eu-west-1-values.yaml \
--server-side=true --force-conflicts

Step 6 - Scale up both regions

The upgrade keeps replicas at zero because it doesn't override the manual scale-down. Scale both regions back up to your usual replica count (REPLICA_COUNT, 3 in the examples below).

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-multi-region-us-east-1 \
scale deployment -n vcluster-platform loft --replicas=3

kubectl --context arn:aws:eks:eu-west-1:123456789012:cluster/platform-multi-region-eu-west-1 \
scale deployment -n vcluster-platform loft --replicas=3

Wait for all pods to become ready:

kubectl --context arn:aws:eks:us-east-1:123456789012:cluster/platform-multi-region-us-east-1 \
rollout status deployment/loft -n vcluster-platform

kubectl --context arn:aws:eks:eu-west-1:123456789012:cluster/platform-multi-region-eu-west-1 \
rollout status deployment/loft -n vcluster-platform

Step 7 - Verify the restore

Confirm the platform is healthy on both regions:

for CTX in arn:aws:eks:us-east-1:123456789012:cluster/platform-multi-region-us-east-1 arn:aws:eks:eu-west-1:123456789012:cluster/platform-multi-region-eu-west-1; do
echo "=== $CTX ==="
kubectl --context "$CTX" get pods -n vcluster-platform -l app=loft
echo
done
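For a scripted check, the number of ready replicas can be compared against the expected count instead of reading the pod list by eye. The function name platform_ready is hypothetical; the sketch relies on the deployment's status.readyReplicas field, which is omitted when no replicas are ready and is therefore treated as zero.

```shell
# Succeed only if the loft deployment in the given context reports the
# expected number of ready replicas.
# $1 = kubectl context, $2 = expected replica count.
platform_ready() {
  ready="$(kubectl --context "$1" get deployment loft -n vcluster-platform \
    -o jsonpath='{.status.readyReplicas}')"
  # An empty result means the field is absent, which counts as zero ready.
  [ "${ready:-0}" -eq "$2" ]
}
```

For example, platform_ready "$FIRST_REGION_CONTEXT" 3 && platform_ready "$SECOND_REGION_CONTEXT" 3 succeeds only when both regions are fully ready.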

Verify the platform UI is accessible through the shared DNS domain and that both Route 53 health checks return healthy.
