This article describes how to recover from a CrashLoop state that the LINSTOR controller pod might enter after you upgrade a LINSTOR deployment within Kubernetes. This can happen when your deployment has a large number of LINSTOR resources and uses the k8s back-end database.
Symptoms
After upgrading LINSTOR, you might find the linstor-controller pod in a CrashLoop state because the run-migration container failed. The container logs will show that the container tried to create a backup of the database but could not store the backup in the k8s back end. The error will be similar to “Request entity too large”.
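To inspect the failed container, a command along the following lines might help. This is a minimal sketch, not an official procedure: the function name show_migration_logs is hypothetical, the pod name is a value that you supply (for example, from the output of kubectl get pods), and the sketch assumes that your current kubectl context already points at the namespace where LINSTOR is deployed.

```shell
# Hypothetical helper: print the logs of the run-migration container.
# $1: name of the linstor-controller pod (look it up with: kubectl get pods)
show_migration_logs() {
  kubectl logs "$1" --container run-migration
}
```

Call it as show_migration_logs followed by the controller pod name. If the container has already restarted, adding the --previous option to the kubectl logs command shows the output of the earlier, failed run.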
Resolution
To resolve this situation, take the following steps:
- Manually create an external backup of the database by entering the following commands. They create one YAML file for the custom resource definitions (crds.yaml) and one YAML file for each LINSTOR resource type:
kubectl api-resources --api-group internal.linstor.linbit.com -oname | xargs kubectl get crds -oyaml > crds.yaml
kubectl api-resources --api-group internal.linstor.linbit.com -oname | xargs -I {} sh -c "kubectl get {} -oyaml > {}.yaml"
- Create a secret to indicate to the run-migration container that it is safe to continue:
kubectl create secret generic linstor-backup-for-<linstor-controller-pod-name>
- Wait until the run-migration container restarts.
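Because the secret tells the run-migration container that a backup exists, it can be worth confirming that the backup files are not empty before creating it. The following is a minimal sketch under stated assumptions, not part of the official procedure: the function names are hypothetical, the backup files are assumed to be in the current directory, and the pod name is a value that you supply. A file can end up empty if the kubectl get command failed, because the shell redirection creates the file anyway.

```shell
# Return success only if every file given as an argument exists and is not empty.
backups_look_complete() {
  for f in "$@"; do
    [ -s "$f" ] || { echo "missing or empty backup file: $f" >&2; return 1; }
  done
}

# Build the secret name used in the steps above: linstor-backup-for-<pod name>.
backup_secret_name() {
  printf 'linstor-backup-for-%s' "$1"
}

# Create the secret for the given controller pod, but only if crds.yaml
# and all other YAML files in the current directory are non-empty.
# $1: name of the linstor-controller pod
create_backup_secret() {
  backups_look_complete crds.yaml ./*.yaml || return 1
  kubectl create secret generic "$(backup_secret_name "$1")"
}
```

Call it as create_backup_secret followed by the controller pod name. The size check only catches empty files. It does not validate the YAML content.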
Verifying successful resolution
To verify that these steps have resolved the LINSTOR controller pod CrashLoop state, watch the rollout status of the linstor-controller deployment and confirm that the pod is up and running by entering the following command:
kubectl rollout status deploy/linstor-controller -w
Written by MW, 2024/10/23.
Edited by MAT, 2024/10/30.