Common errors
Search for the string "error" in the build output. The sentence after it usually indicates the problem.
Usually you will see something like this in the log:
Waiting for deployment "something-web" rollout to finish: 1 out of 2 new replicas have been updated...
# other lines
Error: Command failed with exit code 1: sh -c kubectl rollout status -f /tmp/tmp-116-4Nbfe65kW3Nt/something-web-Deployment.yaml --timeout=300s
Step #5 - "Deploy": error: timed out waiting for the condition
Clues
- The last line states that the 5th step (the deployment step) timed out.
- The line above it shows the command that ran, which also contains --timeout=300s.
- The first line in the example shows that the rollout is still waiting: only one of the two new replicas has been updated.
Verdict
Taken together, these clues mean that your pod could not be started within 300 seconds. Technically, this means that your liveness and/or readiness probes (by default, your /healthcheck endpoint) did not return 200 in the given time frame.
The typical causes of this issue
- The container cannot start because of a missing dependency, faulty code, etc.
- A new environment variable that the code uses was not set.
- An external dependency is not reachable (database, Redis, etc.).
For more in-depth information, see the official documentation on the pod lifecycle.
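While the rollout is still waiting, the events and logs of the new pods usually show which of these causes applies. The following is a minimal sketch, assuming you have kubectl access to the cluster; the helper name inspect_rollout and its namespace argument are illustrative and not part of the build tooling.

```shell
# Hypothetical helper: print the usual diagnostics for a deployment
# whose rollout is stuck or has timed out.
# Usage: inspect_rollout <deployment-name> [namespace]
inspect_rollout() {
  deployment="$1"
  namespace="${2:-default}"

  # Rollout state and recent events; probe failures are listed under Events.
  kubectl describe deployment "$deployment" --namespace "$namespace"

  # Pod list: look for CrashLoopBackOff, ImagePullBackOff, or pods that never become READY.
  kubectl get pods --namespace "$namespace"

  # Last 100 log lines from one pod of the deployment.
  kubectl logs "deployment/$deployment" --namespace "$namespace" --tail=100
}
```

For the example log above, this could be called as `inspect_rollout something-web <your-namespace>`. Because a failed deployment is rolled back, the new pods may already be gone by the time the build has finished, so it is most useful while the rollout is still in progress.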
When a deployment fails, we always roll back any changes made by the current deployment.
Check and ensure that your application can shut down gracefully in response to a shutdown signal such as SIGTERM.
By default, the generated configuration contains liveness and readiness checks that assume your application has a /healthcheck endpoint, as per the Emarsys standards.
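Before deploying, it can help to confirm that the running application really answers on /healthcheck with a 200. A minimal sketch, assuming curl is available and the application is reachable at a base URL you supply; the function name check_health and the port in the usage note are illustrative.

```shell
# Hypothetical helper: succeeds only if <base-url>/healthcheck answers with HTTP 200.
# Usage: check_health <base-url>
check_health() {
  # Print only the HTTP status code, discard the body, and give up after 5 seconds.
  status="$(curl --silent --output /dev/null --write-out '%{http_code}' --max-time 5 "$1/healthcheck")"
  [ "$status" = "200" ]
}
```

For a locally started container this might look like `check_health http://localhost:8080 && echo healthy`, where the port is whatever your application listens on.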
This scenario occurs when you run a one-off pod manually with kubectl run and the image does not have a user specified. Kubernetes then considers the pod to want root privileges, which goes against our cluster policies.
The solution is to either specify the user in the Dockerfile or, if that is not possible, use the override flag below with the kubectl run command to add the necessary user.
--overrides='{
"spec": {
"securityContext": {
"runAsUser": 1000,
"fsGroup": 1000
}
}
}'
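To show where the flag fits in a complete command, here is a minimal sketch of a one-off pod started with this override. The wrapper name run_as_user, the pod name, and the image are placeholders chosen for illustration; the user and group IDs are the ones from the override above.

```shell
# Hypothetical wrapper: start a one-off, interactive pod with an explicit non-root user.
# Usage: run_as_user <pod-name> <image> [command...]
run_as_user() {
  name="$1"
  image="$2"
  shift 2

  # --restart=Never creates a bare pod; --rm deletes it again when the session ends.
  # The overrides value must be valid JSON (no trailing commas).
  kubectl run "$name" --image="$image" --restart=Never --rm -it \
    --overrides='{"spec":{"securityContext":{"runAsUser":1000,"fsGroup":1000}}}' \
    -- "$@"
}
```

A call such as `run_as_user debug-shell <your-image> sh` would then open a shell in the pod as user 1000.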
See also Non-numeric user issue
First, perform the steps to fix the ${NPM_TOKEN} issue below and check if the token is properly picked up by npm.
If it is not, it's likely that the .npmrc in the build doesn't properly contain this environment variable. You can fix this in several ways:
- Adding a .npmrc file to the root of your repository with the following line:
//registry.npmjs.org/:_authToken=${NPM_TOKEN}
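The .npmrc line has to reach the file with the ${NPM_TOKEN} placeholder intact, so that npm substitutes the value at run time instead of the secret being written into the repository. A minimal sketch, assuming it is run from the repository root and that overwriting an existing .npmrc is acceptable:

```shell
# Single quotes keep ${NPM_TOKEN} literal, so the token itself never lands in the file.
printf '%s\n' '//registry.npmjs.org/:_authToken=${NPM_TOKEN}' > .npmrc

# Check that the variable is set in this environment without printing the secret.
if [ -n "${NPM_TOKEN:-}" ]; then
  echo "NPM_TOKEN is set"
else
  echo "NPM_TOKEN is missing" >&2
fi
```

With the variable set, running `npm whoami` is a quick way to confirm that npm authenticates against the registry with the token.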