Skip to content

Troubleshooting📜

Big Bang can take a long time to run. After making changes, it could take 10-15 minutes to take effect. Use the sync.sh script to speed this up.

Big Bang is configured to retry failed package installations and upgrades. Before concluding you have a failure, make sure you allow Big Bang to attempt to resolve dependencies and retry.

Iron Bank Authentication📜

Symptom Cause Resolution
Despite entering correct credentials, get unauthorized: authentication required from Iron Bank. Using a non-robot account with an expired token. Login with the non-robot account manually at registry1.dso.mil, then retry. For production, contact the Iron Bank team to obtain a robot account and update pull credentials to use it in your environment.

Flux Install📜

Helpful debugging commands:

# Get the status
kubectl get pods -n flux-system

# Get the logs
kubectl get events -n flux-system
Symptom Cause Resolution
Install script timed and pods are still pulling the image Slow connection to docker registry Adjust --timeout value in flux install to wait longer
Pod status is ImagePullBackOff or ErrImagePull Bad registry, version, or credentials Fix the --registry, --version, or --image-pull secret options or use the ./scripts/install_flux.sh script for pulling from Iron Bank

Git Repository📜

Helpful debugging commands:

# Get the status
kubectl get gitrepositories -A

# Get the logs
kubectl get events --field-selector involvedObject.kind=GitRepository -A
Symptom Cause Resolution
unable to clone ... error: authentication required Pull credentials for Git invalid or not provided Add credentials to a Secret and reference it in GitRepository.spec.secretRef.name. If possible, encrypt the secret and include it in the Kustomization deployment for your environment.
auth secret error: Secret ... not found GitRepository is trying to use credentials but cannot find the Secret Make sure the secret exists and is in the same namespace as the GitRepository resource. If possible, encrypt the secret and include it in the Kustomization deployment for your environment.
unable to clone ... error: repository not found Invalid Git url Fix url for Git repository and redeploy
unable to clone ... error: couldn't find remote ref Invalid branch or tag Fix branch or tag for Git repository and redeploy

ConfigMap or Secrets📜

Symptom Cause Resolution
ConfigMap or Secret does not exist GitRepository or Kustomization failed. Namespace was incorrect. Use GitRepository and Kustomization sections to troubleshoot. Use kubectl get secrets,configmaps -A to verify resource was not in the wrong Namespace.

Helm Release📜

Helpful debugging commands:

# Get the status
kubectl get hr -A

# Get the logs
kubectl get events --field-selector involvedObject.kind=HelmRelease -A

# Describe the HelmRelease to get more information
kubectl describe hr <NAME> -n bigbang

# Get all logs/events for a specific HelmRelease object
flux logs --kind=HelmRelease --namespace bigbang --name <NAME>
Symptom Cause Resolution
Reconciliation in Progress This is normal and indicates flux is currently applying updates Wait
dependency ... is not ready This is normal and indicates flux is currently waiting on another resource to complete Wait
Error: YAML parse error on ... Syntax error in helm chart Use helm template to narrow down the problem. Fix it and commit to Git
Helm install failed: failed to create resource ... unable to create new content in namespace because it is being terminated This seems to happen when a re-deploy of Big Bang occurs to early after a Big Bang delete. Try to remove the namespace using kubectl get ns <stuck namespace> -o json | jq '.spec.finalizers = []' | kubectl replace --raw "/api/v1/namespaces/$NS/finalize" -f. If this does not work, a cluster restart may be necessary.
Error: failed to download ... Path to Helm chart is incorrect Find the HelmRelease configuration and update spec.path to the correct path of the helm chart
Helm uninstall failed: uninstall: Release not loaded: ____: release: not found Helm install failed because of an error and a rollback/uninstall is attempted but release has not been installed. Describe the HelmRelease in question or use flux to get the logs to get more info abut why it failed to install.
reconciliation failed: Helm rollback failed: an error occurred while cleaning up resources. original rollback error: no XXXX with the name "XXXX" found: unable to cleanup resources: object not found, skipping delete This error happens when an upgrade fails and flux attempts a rollback but there are templates that have been renamed/removed. Describe the HelmRelease in question or use flux to get the logs to get more info abut why exactly the upgrade failed.

Kustomization📜

Helpful debugging commands:

# Get the status
kubectl get kustomizations -A

# Get the logs
kubectl get events --field-selector involvedObject.kind=Kustomization -A
Symptom Cause Resolution
kustomization path not found spec.path in Kustomization resource in is incorrect Fix spec.path and redeploy
Source not found spec.sourceRef in Kustomization resource is incorrect Fix spec.sourceRef to point to repository resource and redeploy
decryption secret error: Secret ... not found SOPS private key secret is missing or misconfigured Check decryption settings in the Kustomization resource to make sure secretRef is pointing to the correct secret. Make sure the Secret holding the private key is deployed in the cluster.
kustomize build failed: json: unknown field There is a syntax error with the kustomization files. Use kustomize build on the <env> folder or base folder to narrow down the problem. Fix the error and push to Git.
evalsymlink failure ... no such file or directory A reference to a file in kustomization.yaml is incorrect Use kustomize build on the <env> folder or base folder to narrow down the problem. Fix the error and push to Git.
Error: accumulating resources ... A reference to a base is incorrect Use kustomize build on the <env> folder or base folder to narrow down the problem.Review the bases: section for correct paths to find the error. Fix the error and push to Git.
Error fetchingref: fatal: couldn't find remote ref ... The branch, tag, or sha used for a remote base is incorrect Use kustomize build on the <env> folder or base folder to narrow down the problem. It is likely the remote reference to the Big Bang’s Kustomize in the base folder. Review the bases: section for correct paths to find the error. Fix the error and push to Git.
Error: merging from generator ... Kustomize is trying to merge with a resource that is non-existent. This is usually due to naming the merging ConfigMap or Secret incorrectly compared to a base ConfigMap or Secret. Use kustomize build on the <env> folder or base folder to narrow down the problem. Look for the keyword merge in the kustomization.yaml files and verify the name is correctly set.

Packages📜

Helpful debugging commands:

# Get the status
kubectl get deployments,po -n <namespace of package>

# Get the logs
kubectl get events --field-selector involvedObject.kind=Deployment -n <namespace of package>
kubectl get events --field-selector involvedObject.kind=Pod -n <namespace of package>

Last update: 2024-07-30 by Michael Martin