Using the "helm.sh/resource-policy" annotation causes automated rollbacks to consistently fail the first time they are attempted #13142

Open
gibsondan opened this issue Jun 27, 2024 · 1 comment
Labels
bug Categorizes issue or PR as related to a bug.

Comments

@gibsondan

We have a secret defined as follows, with the "helm.sh/resource-policy" annotation set to keep so that Helm leaves it in the cluster even when it would normally delete it:

apiVersion: v1
kind: Secret
metadata:
  name: my-secret-{{ .Values.version }}
  labels:
    app.kubernetes.io/managed-by: {{ .Release.Service }}
  annotations:
    "helm.sh/resource-policy": keep

We are finding that the presence of this annotation reliably prevents automated rollbacks from working. If a command like

helm upgrade our-cloud . --atomic --install --debug --timeout 5m --values ./values.yaml

times out and the --atomic flag triggers a rollback, the rollback fails 100% of the time with an error about being unable to find the secret. We are certain that the secret does in fact exist in the cluster: because of the annotation, the previous Helm deploy never removed it, so it may no longer be in Helm's known list of secrets, but it is still there. The debug log looks like this:

wait.go:50: [debug] beginning wait for 43 resources with timeout of 5s
...
upgrade.go:476: [debug] warning: Upgrade "our-cloud" failed: context deadline exceeded
upgrade.go:494: [debug] Upgrade failed and atomic is set, rolling back to last successful release
history.go:56: [debug] getting history for release our-cloud
rollback.go:65: [debug] preparing rollback of our-cloud
rollback.go:131: [debug] rolling back our-cloud (current: v6292, target: v6291)
rollback.go:72: [debug] creating rolled back release for our-cloud
rollback.go:78: [debug] performing rollback of our-cloud
client.go:393: [debug] checking 43 resources for changes
...
rollback.go:195: [debug] warning: Rollback "our-cloud" failed: no Secret with the name "my-secret-780bdef1-9ba1f795" found
Error: UPGRADE FAILED: an error occurred while rolling back the release. original upgrade error: context deadline exceeded: no Secret with the name "my-secret-780bdef1-9ba1f795" found
helm.go:84: [debug] no Secret with the name "my-secret-780bdef1-9ba1f795" found

This happens 100% of the time if the "helm.sh/resource-policy" annotation is set on that secret, and never happens if we remove the annotation. Interestingly, it also goes away if we then attempt a manual rollback to the same revision that the automated rollback targeted. So it's only the first rollback attempt to a given revision that fails with this "secret not found" error.

Similarly, if we disable --atomic but keep --wait and --timeout so that the upgrade fails without automatically rolling back, the first manual rollback that we attempt to the previous revision will fail with the same "secret not found" error, but then rolling back to that revision again will succeed, without us making any changes.
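Based only on the behavior described above (first rollback fails, second rollback to the same revision succeeds), a stopgap could be sketched as a small retry wrapper around `helm rollback`. This is a workaround sketch, not a fix for the underlying bug. The names `rollback_with_retry`, `run_command` and `Runner` are made up for illustration, and the choice of two attempts is an assumption drawn from this report.

```python
import subprocess
from typing import Callable, Sequence

# Type of the command runner: takes an argv list, returns the exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code (output goes to the terminal)."""
    return subprocess.run(list(argv)).returncode


def rollback_with_retry(release: str, revision: int, attempts: int = 2,
                        run: Runner = run_command) -> bool:
    """Try `helm rollback` up to `attempts` times; return True on success.

    Workaround sketch only: it relies on the observation in this issue that
    a second rollback to the same revision succeeds after the first fails.
    """
    argv = ["helm", "rollback", release, str(revision), "--wait"]
    for attempt in range(1, attempts + 1):
        if run(argv) == 0:
            return True
        print(f"rollback attempt {attempt} of {attempts} failed")
    return False
```

With the release from the log above, this would be called as `rollback_with_retry("our-cloud", 6291)`. The `run` parameter exists so the retry logic can be exercised without a cluster.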

It is possible that this is the same underlying issue as #12436: maybe the annotation causes Helm to treat this secret similarly to a resource that was created manually.
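One way to narrow this down might be to check which stored revisions actually list the secret named in the error, using the output of `helm get manifest`. The sketch below is a rough diagnostic aid under stated assumptions: `secret_in_manifest` and `manifest_for` are hypothetical helper names, and the manifest check is a naive line-based match rather than a real YAML parse.

```python
import subprocess


def secret_in_manifest(manifest: str, name: str) -> bool:
    """Return True if a Secret called `name` appears in rendered manifest text.

    Naive line-based check (no YAML parser): splits on `---` document
    separators and looks for `kind: Secret` plus a matching `name:` line.
    """
    docs, current = [], []
    for line in manifest.splitlines():
        if line.strip() == "---":
            docs.append(current)
            current = []
        else:
            current.append(line.rstrip())
    docs.append(current)
    for doc in docs:
        stripped = [entry.strip() for entry in doc]
        if "kind: Secret" in doc and f"name: {name}" in stripped:
            return True
    return False


def manifest_for(release: str, revision: int) -> str:
    """Fetch the stored manifest for one revision via `helm get manifest`."""
    result = subprocess.run(
        ["helm", "get", "manifest", release, "--revision", str(revision)],
        check=True, capture_output=True, text=True,
    )
    return result.stdout
```

For the log above, comparing `secret_in_manifest(manifest_for("our-cloud", 6291), "my-secret-780bdef1-9ba1f795")` with the same call for revision 6292 would show which revision Helm believes owns the secret, and `kubectl get secret my-secret-780bdef1-9ba1f795` would confirm that it exists in the cluster.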

Output of helm version: 3.15.2

Output of kubectl version:

Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.4-eks-036c24b

Cloud Provider/Platform (AKS, GKE, Minikube etc.):

mattfarina added the bug label on Jul 1, 2024
@mattfarina
Collaborator

I haven't tried to reproduce it yet. I've looked over the code quickly and I don't see what would be causing this. Marking it as a bug. The next step is to try to reproduce it.
