Add script to restore ownerReferences #833

22 changes: 17 additions & 5 deletions docs/Restore.md
@@ -1,26 +1,26 @@
# Restore Procedures

## Prerequisites
Performing a restore assumes the following (verification commands are shown after this list):
- The OADP operator is installed and configured correctly.
- You are utilising the same namespace structure as the backup.
- The operator is currently disabled, but installed on the cluster. It needs to be enabled shortly after the restore process is started so that persistent volumes are claimed.
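
A quick way to sanity-check these assumptions before starting (a minimal sketch, assuming OADP was installed into its default `openshift-adp` namespace and the operator subscription lives in `openshift-operators`):

```sh
# OADP is installed and has a DataProtectionApplication configured
oc get dataprotectionapplication -n openshift-adp

# The RHTAS operator is installed on the cluster
oc get csv -n openshift-operators | grep rhtas
```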

## Disable operator
If the operator is installed and you wish to perform a restore, use the following command to scale down the operator deployment.

```sh
oc scale deploy rhtas-operator-controller-manager --replicas=0 -n openshift-operators
```
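
Optionally, confirm the scale-down took effect; the deployment should report `0/0` ready replicas:

```sh
# Expect READY 0/0 once the operator has been scaled down
oc get deploy rhtas-operator-controller-manager -n openshift-operators
```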

Once restore operations are running, you can reactivate the operator by scaling its deployment back up; without the operator enabled, persistent volumes are not claimed.

```sh
oc scale deploy rhtas-operator-controller-manager --replicas=1 -n openshift-operators
```
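
Once the operator is running again, you can confirm that persistent volumes are being claimed by watching the PVCs in the namespace being restored (the namespace below is a placeholder for your own):

```sh
# Watch until the restored PVCs report the Bound status
oc get pvc -n <securesign-namespace> -w
```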

## Cluster restore
If the cluster you are performing the restore action on is the same cluster as the original backup, the following Restore Example should suffice.

```sh
cat << EOF > ./RestoreExample.yaml
@@ -40,6 +40,7 @@ spec:
- fulcio.rhtas.redhat.com
- rekor.rhtas.redhat.com
- tuf.rhtas.redhat.com
- timestampauthority.rhtas.redhat.com
excludedResources:
- pod
- deployment
@@ -62,6 +63,17 @@ EOF
oc apply -f RestoreExample.yaml
```
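
To follow the progress of the restore, query the Restore resource created above (a sketch, assuming OADP's default `openshift-adp` namespace; substitute the name used in your Restore resource):

```sh
# Reports the current phase, e.g. InProgress or Completed
oc get restore <restore-name> -n openshift-adp -o jsonpath='{.status.phase}{"\n"}'
```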

If the restore is done on a different cluster, a few more steps need to be taken. First, delete the secret for the Trillian DB, which will be recreated by the operator, and restart the pod:

```sh
oc delete secret securesign-sample-trillian-db-tls
oc delete pod trillian-db-xxx
```
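
The `trillian-db-xxx` suffix varies between installations; one way to resolve and delete the actual pod without knowing its full name (a sketch that matches on the pod name rather than labels):

```sh
# Find the Trillian DB pod by name prefix and delete it so it restarts with the recreated secret
oc get pods -o name | grep trillian-db | xargs oc delete
```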

After the restore process is finished and all the pods are running, run the [restoreOwnerReferences.sh](../hack/restoreOwnerReferences.sh) script to recreate the ownerReferences, which are lost on a new cluster because the owner has a new UID.
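
For example, with `oc` already pointed at the namespace containing the restored Securesign instance, a run followed by a spot-check might look like this (the verification command is only an illustration):

```sh
# Recreate ownerReferences on the restored custom resources
./hack/restoreOwnerReferences.sh

# Spot-check: the restored resources should again reference their Securesign owner
oc get fulcio -o jsonpath='{.items[*].metadata.ownerReferences[*].name}{"\n"}'
```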

## Cross Provider Restore
To perform a restore on a cluster using different storage classes, create a YAML file based upon the following:

86 changes: 86 additions & 0 deletions hack/restoreOwnerReferences.sh
@@ -0,0 +1,86 @@
#!/bin/bash
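# Recreates ownerReferences on RHTAS custom resources after a restore to a new
# cluster, where the original references are lost because the owning Securesign
# resource has a new UID. Run this from the namespace that contains the restored
# Securesign instance, once the restore has finished and all pods are running.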

# List of resources to check
RESOURCES=("Fulcio" "Rekor" "Trillian" "TimestampAuthority" "CTlog" "Tuf")


function validate_owner() {
    local RESOURCE=$1
    local ITEM=$2
    local OWNER_NAME=$3

    # Check all the labels exist and are the same
    LABELS=("app.kubernetes.io/instance" "app.kubernetes.io/part-of" "velero.io/backup-name" "velero.io/restore-name")
    for LABEL in "${LABELS[@]}"; do
        PARENT_LABEL=$(oc get Securesign "$OWNER_NAME" -o json | jq -r ".metadata.labels[\"$LABEL\"]")
        CHILD_LABEL=$(oc get "$RESOURCE" "$ITEM" -o json | jq -r ".metadata.labels[\"$LABEL\"]")

        if [[ -z "$CHILD_LABEL" || "$CHILD_LABEL" == "null" ]]; then
            echo "    $LABEL label missing in $RESOURCE"
            return 1
        elif [[ -z "$PARENT_LABEL" || "$PARENT_LABEL" == "null" ]]; then
            echo "    $LABEL label missing in Securesign"
            return 1
        elif [[ "$CHILD_LABEL" != "$PARENT_LABEL" ]]; then
            echo "    $LABEL labels not matching: $CHILD_LABEL != $PARENT_LABEL"
            return 1
        fi
    done

    return 0
}


for RESOURCE in "${RESOURCES[@]}"; do
    echo "Checking $RESOURCE ..."

    # Get all resources missing ownerReferences
    MISSING_REFS=$(oc get "$RESOURCE" -o json | jq -r '.items[] | select(.metadata.ownerReferences == null) | .metadata.name')

    for ITEM in $MISSING_REFS; do
        echo "  Missing ownerReferences in $RESOURCE/$ITEM"

        # Find the expected owner based on labels
        OWNER_NAME=$(oc get "$RESOURCE" "$ITEM" -o json | jq -r '.metadata.labels["app.kubernetes.io/name"]')

        if [[ -z "$OWNER_NAME" || "$OWNER_NAME" == "null" ]]; then
            echo "  Skipping $RESOURCE/$ITEM: name not found in labels"
            continue
        fi

        if ! validate_owner "$RESOURCE" "$ITEM" "$OWNER_NAME"; then
            echo "  Skipping ..."
            continue
        fi

        # Try to get the owner's UID from Securesign
        OWNER_UID=$(oc get Securesign "$OWNER_NAME" -o jsonpath='{.metadata.uid}' 2>/dev/null)

        if [[ -z "$OWNER_UID" || "$OWNER_UID" == "null" ]]; then
            echo "  Failed to find Securesign/$OWNER_NAME UID, skipping ..."
            continue
        fi

        echo "  Found owner: Securesign/$OWNER_NAME (UID: $OWNER_UID)"

        # Patch the object with the restored ownerReference
        oc patch "$RESOURCE" "$ITEM" --type='merge' -p "{
            \"metadata\": {
                \"ownerReferences\": [
                    {
                        \"apiVersion\": \"rhtas.redhat.com/v1alpha1\",
                        \"kind\": \"Securesign\",
                        \"name\": \"$OWNER_NAME\",
                        \"uid\": \"$OWNER_UID\",
                        \"controller\": true,
                        \"blockOwnerDeletion\": true
                    }
                ]
            }
        }"

        echo "Restored ownerReferences for $RESOURCE/$ITEM"
    done
done

echo "Done"