Etcd deployment built to run on Fly.
fly launch
Note: Before running the command above, ensure the mount destination within fly.toml is set to the /data directory. If this is not set correctly, the deploy will fail.
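One way to confirm the mount destination before launching is a small pre-flight check. This is a minimal sketch, not part of this project: the function name check_mount_destination is hypothetical, and it assumes the destination key sits on its own line in fly.toml with a double-quoted value, as in a typical mounts section.

```shell
# Hypothetical pre-flight check: succeed only if the config file mounts a volume at /data.
# Assumes a line of the form: destination = "/data"
check_mount_destination() {
  config="${1:-fly.toml}"
  if grep -Eq '^[[:space:]]*destination[[:space:]]*=[[:space:]]*"/data/?"' "$config"; then
    echo "ok: $config mounts a volume at /data"
  else
    echo "error: mount destination in $config is not /data" >&2
    return 1
  fi
}
```

Running check_mount_destination with no argument checks fly.toml in the current directory; a non-zero exit status means the destination should be fixed before launching.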
While it is technically possible to scale your Etcd app to multiple members simultaneously, it's recommended to scale in increments of one until you've reached your target cluster size.
When scaling, monitor your logs for errors and ensure your cluster is healthy before performing any subsequent scaling operations.
fly machines clone <machine-id>
This clone command is preferred over fly scale count N, as it enforces unique zones for volume placement. Newly provisioned members will automatically join an existing cluster.
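The one-at-a-time approach can be sketched as a loop that clones a member, waits, and checks the cluster before cloning again. This is an illustration under stated assumptions rather than a supported workflow: it assumes the commands run from the app directory so fly resolves the app from fly.toml, that flyadmin endpoint status exits non-zero when the cluster is unhealthy, and that a fixed pause is enough for a new member to join. The function name scale_up_by_one and the SETTLE_SECONDS variable are hypothetical.

```shell
# Sketch: add members one at a time, checking the cluster between clones.
# Assumptions (not confirmed by this project):
#   - "flyadmin endpoint status" exits non-zero when the cluster is unhealthy
#   - SETTLE_SECONDS (hypothetical) is long enough for a new member to join
scale_up_by_one() {
  source_machine="$1"   # ID of an existing member machine to clone
  new_members="$2"      # number of members to add
  added=0
  while [ "$added" -lt "$new_members" ]; do
    fly machines clone "$source_machine" || return 1
    sleep "${SETTLE_SECONDS:-30}"
    # Run the status helper on a member machine over SSH.
    if ! fly ssh console -C "flyadmin endpoint status"; then
      echo "cluster check failed after $((added + 1)) clone(s); stopping" >&2
      return 1
    fi
    added=$((added + 1))
  done
}
```

For example, scale_up_by_one <machine-id> 2 would take a single-member cluster to three members, stopping early if a check fails. Reviewing the logs between clones is still worthwhile, since the status check alone may not surface every error.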
- Identify the id and name of the member you want to remove.
SSH into one of the member machines and use these helper commands:
# View endpoint status
flyadmin endpoint status
# List all members
flyadmin member list
- If the member is the leader, transfer leadership.
etcdctl move-leader <target-member-id>
- Stop the Machine (easier in a separate terminal session)
fly machine stop <machine-id>
- Remove the member from the cluster
flyadmin member remove <member-id>
- Clone an existing member
fly machine clone <machine-id>
If the following secrets are set, automatic backups will be performed and uploaded to S3:
Static credentials:
AWS_SECRET_ACCESS_KEY
AWS_ACCESS_KEY_ID
AWS_REGION
OIDC-based auth:
AWS_ROLE_ARN
AWS_REGION
Optional environment variables:
S3_BUCKET (default: fly-etcd-backups)
BACKUP_INTERVAL (default: "1h")
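For the static-credentials option, the secrets can be set with fly secrets set. The sketch below is one way to do that while guarding against a missing value. It assumes the three values are already set in the local shell and that the command runs from the app directory so fly resolves the app from fly.toml; the function name set_backup_secrets is hypothetical.

```shell
# Sketch: push static AWS credentials to the app as Fly secrets.
# Assumption: the three variables are already set in the calling shell.
set_backup_secrets() {
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ] || [ -z "${AWS_REGION:-}" ]; then
    echo "error: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_REGION must all be set" >&2
    return 1
  fi
  # A single call sets all three secrets together.
  fly secrets set \
    AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
    AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
    AWS_REGION="$AWS_REGION"
}
```

The OIDC-based option would follow the same pattern with AWS_ROLE_ARN and AWS_REGION instead.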
flyadmin backup list
flyadmin backup create
- Scale cluster down to a single member
# Stop a non-leader Machine
fly m stop <machine-id>
# Remove the associated member from the cluster
flyadmin member remove <member-id>
- Select which backup you'd like to restore
List the backups with flyadmin backup list and identify the ID/version you'd like to restore from.
- Initiate the restore
WARNING: This will erase existing data
flyadmin b restore <backup-id>
- Restart the machine
fly m restart <machine-id>
- Scale Etcd cluster back up to 3 nodes
fly m clone <machine-id>
- Verify cluster status
flyadmin endpoint status
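Freshly cloned members may take a moment to join, so a single status check right after scaling can be premature. The sketch below retries the check a few times. It is meant to run on a member machine where flyadmin is available, and it assumes flyadmin endpoint status exits non-zero while the cluster is not yet healthy; the function name wait_for_cluster and its two parameters are hypothetical.

```shell
# Sketch: retry the status check until it succeeds or attempts run out.
# Assumption: "flyadmin endpoint status" exits non-zero on an unhealthy cluster.
wait_for_cluster() {
  attempts="${1:-10}"   # maximum number of checks
  delay="${2:-5}"       # seconds to wait between checks
  try=1
  while [ "$try" -le "$attempts" ]; do
    if flyadmin endpoint status; then
      return 0
    fi
    echo "attempt $try/$attempts failed; retrying in ${delay}s" >&2
    sleep "$delay"
    try=$((try + 1))
  done
  echo "cluster did not report healthy after $attempts attempts" >&2
  return 1
}
```

Calling wait_for_cluster with no arguments checks up to 10 times, 5 seconds apart.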