Troubleshooting resources
apiVersion: v1
kind: Pod
metadata:
  name: netshoot-unprivileged-meshed
  namespace: cloud-platform
  labels:
    app: netshoot-unprivileged-meshed
    istio.io/rev: default
spec:
  containers:
  - name: netshoot-unprivileged
    resources:
      limits:
        cpu: 50m
        memory: 50Mi
      requests:
        cpu: 50m
        memory: 50Mi
    image: eu.gcr.io/ems-gap-images/netshoot-unprivileged:latest
    command: ["/bin/sleep", "3650d"]
    imagePullPolicy: IfNotPresent
    securityContext:
      runAsUser: 1000
  restartPolicy: Always
- look at node resource usage (including bandwidth)
  - overall node resource trend
  - per-pod resource trend
- calico-node is running and not being throttled
  - check logs for obvious errors
- netd is running and not being throttled
  - check logs for obvious errors
- kube-dns is running and not being throttled
  - check logs for obvious errors
- check basic communication
  - from the node (one-shot pod or another running workload) (telnet/curl/nc -vz)
  - to the node (one-shot pod or another running workload) (telnet/curl/nc -vz)
  - symptoms
    - connection reset
      - indicates that the other side is cutting an existing connection
      - if sporadic, rule out a keepalive timeout
      - if reproducible
    - connection refused
      - indicates that the other side is not even accepting the connection
    - no route to host
      - likely a mistake on your side; check that the source/dst IPs belong to active resources
    - timeout
      - indicates packets dropped at the firewall level, or bad routing
      - rule out network policies
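
The symptoms in the checklist map fairly directly onto socket errors, so a small probe can name them. What follows is a minimal Python sketch, not part of the original notes: the `probe` function name, the 3-second default timeout, and the command-line usage are assumptions made for illustration. It makes one TCP connection attempt, much like `nc -vz`, and returns the matching symptom. A connection reset normally appears on an already established connection, so a connect-only probe like this one will rarely report it.

```python
import errno
import socket
import sys


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and return the symptom name, if any."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        # The peer answered but nothing is accepting on that port.
        return "connection refused"
    except ConnectionResetError:
        # The peer cut the connection (rare during connect itself).
        return "connection reset"
    except (socket.timeout, TimeoutError):
        # No answer at all: dropped packets, bad routing, or a network policy.
        return "timeout"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "no route to host"
        return f"other error: {exc}"


if __name__ == "__main__":
    # Hypothetical usage: python probe.py <host> <port>
    print(probe(sys.argv[1], int(sys.argv[2])))
```

Run it from any pod or host that has Python 3 available, once in each direction, and compare the result with the symptom list above.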