While testing Istio recently, I kept finding that pods with an injected sidecar would, after some time, end up in the Init:CrashLoopBackOff state. For example:
    [root@sh-saas-k8s1-master-dev-01 ~]# kubectl get pod --all-namespaces -o wide | grep 'Init'
    public-ops-tomcat-dev  public-ops-dubbo-demo-web-tomcat-dev-79f758dcf-64qwr  0/2  Init:CrashLoopBackOff  7  21h  10.253.3.166  10.12.97.23  <none>  <none>
My Kubernetes version is 1.14.10 and my Istio version is 1.5.1.
The logs of the istio-init container show the following error:
    [root@sh-saas-k8s1-master-dev-01 ~]# kubectl logs -n public-ops-tomcat-dev public-ops-dubbo-demo-web-tomcat-dev-79f758dcf-64qwr istio-init
    Environment:
    ------------
    ENVOY_PORT=
    INBOUND_CAPTURE_PORT=
    ISTIO_INBOUND_INTERCEPTION_MODE=
    ISTIO_INBOUND_TPROXY_MARK=
    ISTIO_INBOUND_TPROXY_ROUTE_TABLE=
    ISTIO_INBOUND_PORTS=
    ISTIO_LOCAL_EXCLUDE_PORTS=
    ISTIO_SERVICE_CIDR=
    ISTIO_SERVICE_EXCLUDE_CIDR=

    Variables:
    ----------
    PROXY_PORT=15001
    PROXY_INBOUND_CAPTURE_PORT=15006
    PROXY_UID=1337
    PROXY_GID=1337
    INBOUND_INTERCEPTION_MODE=REDIRECT
    INBOUND_TPROXY_MARK=1337
    INBOUND_TPROXY_ROUTE_TABLE=133
    INBOUND_PORTS_INCLUDE=*
    INBOUND_PORTS_EXCLUDE=15090,15020
    OUTBOUND_IP_RANGES_INCLUDE=*
    OUTBOUND_IP_RANGES_EXCLUDE=
    OUTBOUND_PORTS_EXCLUDE=
    KUBEVIRT_INTERFACES=
    ENABLE_INBOUND_IPV6=false

    Writing following contents to rules file: /tmp/iptables-rules-1588923880490327697.txt562915423
    * nat
    -N ISTIO_REDIRECT
    -N ISTIO_IN_REDIRECT
    -N ISTIO_INBOUND
    -N ISTIO_OUTPUT
    -A ISTIO_REDIRECT -p tcp -j REDIRECT --to-port 15001
    -A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-port 15006
    -A PREROUTING -p tcp -j ISTIO_INBOUND
    -A ISTIO_INBOUND -p tcp --dport 22 -j RETURN
    -A ISTIO_INBOUND -p tcp --dport 15090 -j RETURN
    -A ISTIO_INBOUND -p tcp --dport 15020 -j RETURN
    -A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
    -A OUTPUT -p tcp -j ISTIO_OUTPUT
    -A ISTIO_OUTPUT -o lo -s 127.0.0.6/32 -j RETURN
    -A ISTIO_OUTPUT -o lo ! -d 127.0.0.1/32 -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
    -A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT -o lo ! -d 127.0.0.1/32 -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
    -A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
    -A ISTIO_OUTPUT -j ISTIO_REDIRECT
    COMMIT
    iptables-restore --noflush /tmp/iptables-rules-1588923880490327697.txt562915423
    iptables-restore: line 2 failed
    iptables-save
    # Generated by iptables-save v1.6.1 on Fri May 8 07:44:40 2020
    *mangle
    :PREROUTING ACCEPT [643414:2344563772]
    :INPUT ACCEPT [643414:2344563772]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [616124:4267707048]
    :POSTROUTING ACCEPT [616124:4267707048]
    COMMIT
    # Completed on Fri May 8 07:44:40 2020
    # Generated by iptables-save v1.6.1 on Fri May 8 07:44:40 2020
    *raw
    :PREROUTING ACCEPT [643414:2344563772]
    :OUTPUT ACCEPT [616124:4267707048]
    COMMIT
    # Completed on Fri May 8 07:44:40 2020
    # Generated by iptables-save v1.6.1 on Fri May 8 07:44:40 2020
    *nat
    :PREROUTING ACCEPT [38474:2000648]
    :INPUT ACCEPT [40999:2131948]
    :OUTPUT ACCEPT [7987:560379]
    :POSTROUTING ACCEPT [8763:600731]
    :ISTIO_INBOUND - [0:0]
    :ISTIO_IN_REDIRECT - [0:0]
    :ISTIO_OUTPUT - [0:0]
    :ISTIO_REDIRECT - [0:0]
    -A PREROUTING -p tcp -j ISTIO_INBOUND
    -A OUTPUT -p tcp -j ISTIO_OUTPUT
    -A ISTIO_INBOUND -p tcp -m tcp --dport 22 -j RETURN
    -A ISTIO_INBOUND -p tcp -m tcp --dport 15090 -j RETURN
    -A ISTIO_INBOUND -p tcp -m tcp --dport 15020 -j RETURN
    -A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
    -A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
    -A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
    -A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
    -A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
    -A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
    -A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
    -A ISTIO_OUTPUT -j ISTIO_REDIRECT
    -A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001
    COMMIT
    # Completed on Fri May 8 07:44:40 2020
    # Generated by iptables-save v1.6.1 on Fri May 8 07:44:40 2020
    *filter
    :INPUT ACCEPT [643414:2344563772]
    :FORWARD ACCEPT [0:0]
    :OUTPUT ACCEPT [616124:4267707048]
    COMMIT
    # Completed on Fri May 8 07:44:40 2020
    panic: exit status 1

    goroutine 1 [running]:
    istio.io/istio/tools/istio-iptables/pkg/dependencies.(*RealDependencies).RunOrFail(0xd819c0, 0x9739b8, 0x10, 0xc00000cbc0, 0x2, 0x2)
        istio.io/istio@/tools/istio-iptables/pkg/dependencies/implementation.go:44 +0x96
    istio.io/istio/tools/istio-iptables/pkg/cmd.(*IptablesConfigurator).executeIptablesRestoreCommand(0xc000109d30, 0x7faeecd9a001, 0x0, 0x0)
        istio.io/istio@/tools/istio-iptables/pkg/cmd/run.go:474 +0x3aa
    istio.io/istio/tools/istio-iptables/pkg/cmd.(*IptablesConfigurator).executeCommands(0xc000109d30)
        istio.io/istio@/tools/istio-iptables/pkg/cmd/run.go:481 +0x45
    istio.io/istio/tools/istio-iptables/pkg/cmd.(*IptablesConfigurator).run(0xc000109d30)
        istio.io/istio@/tools/istio-iptables/pkg/cmd/run.go:428 +0x24e2
    istio.io/istio/tools/istio-iptables/pkg/cmd.glob..func1(0xd5c740, 0xc0000ee700, 0x0, 0x10)
        istio.io/istio@/tools/istio-iptables/pkg/cmd/root.go:56 +0x14e
    github.com/spf13/cobra.(*Command).execute(0xd5c740, 0xc00001e130, 0x10, 0x11, 0xd5c740, 0xc00001e130)
        github.com/spf13/cobra@v0.0.5/command.go:830 +0x2aa
    github.com/spf13/cobra.(*Command).ExecuteC(0xd5c740, 0x40574f, 0xc00009e058, 0x0)
        github.com/spf13/cobra@v0.0.5/command.go:914 +0x2fb
    github.com/spf13/cobra.(*Command).Execute(...)
        github.com/spf13/cobra@v0.0.5/command.go:864
    istio.io/istio/tools/istio-iptables/pkg/cmd.Execute()
        istio.io/istio@/tools/istio-iptables/pkg/cmd/root.go:284 +0x2d
    main.main()
        istio.io/istio@/tools/istio-iptables/main.go:22 +0x20
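Judging from the error, the istio-init container was restarted, and the iptables step failed on that second run. The iptables-save dump in the log shows that the ISTIO_* chains already exist in the nat table, which would explain why line 2 of the rules file (-N ISTIO_REDIRECT) fails. This is odd: normally an init container runs to completion once and is not executed again.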
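The "chains already exist" reading of the log can be checked directly. The following is a minimal sketch, not part of the original troubleshooting: the helper name `existing_istio_chains` is hypothetical, and it only assumes that chain declarations in an iptables-save dump look like `:ISTIO_REDIRECT - [0:0]`, as they do in the log above.

```shell
#!/bin/sh
# Hypothetical helper: list the ISTIO_* chains declared in an
# iptables-save dump read from stdin. Chain declarations start with
# a colon, e.g. ":ISTIO_REDIRECT - [0:0]".
existing_istio_chains() {
    grep -o '^:ISTIO_[A-Z_]*' | cut -c2- | sort -u
}

# Usage inside the pod's network namespace (requires root):
#   iptables-save -t nat | existing_istio_chains
```

If the helper prints any chain names before istio-init runs, a rules file that begins with `-N` for those same chains can be expected to fail under `iptables-restore --noflush`.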
Deleting the pod and letting a new one be created in its place works: the new pod starts normally.
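A Pod restart causes its Init containers to be re-executed. The main reasons are:
- The user updates the PodSpec so that the Init container image changes. A change to an application container image only restarts the application container.
- The Pod infrastructure container is restarted. This is uncommon, but someone with root access to the Node could do it.
- All containers in the Pod are terminated while restartPolicy is set to Always, which forces a restart, and the Init container's completion record has been lost due to garbage collection.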
Given these reasons, I suspected my scheduled Docker disk cleanup was the cause. The cleanup is scheduled like this:
    [root@sh-saas-k8s1-node-dev-03 ~]# crontab -l
    00 05 * * 4 docker system prune -a -f
So I manually ran docker system prune -a -f once more, and sure enough the istio-init container restarted and failed again.
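One way to confirm that it is the init container being restarted is to read its restart count from the pod status. This is a sketch under assumptions: the function name `init_restart_counts` is hypothetical, and the only things it relies on are `kubectl`'s jsonpath output and the `initContainerStatuses` field of the pod status.

```shell
#!/bin/sh
# Hypothetical helper: print each init container's name and restart
# count for one pod, one "name<TAB>count" pair per line.
# Usage: init_restart_counts <namespace> <pod>
init_restart_counts() {
    kubectl get pod "$2" -n "$1" \
        -o jsonpath='{range .status.initContainerStatuses[*]}{.name}{"\t"}{.restartCount}{"\n"}{end}'
}
```

For the pod shown earlier this would be called as `init_restart_counts public-ops-tomcat-dev public-ops-dubbo-demo-web-tomcat-dev-79f758dcf-64qwr`. A count that increases right after a prune run points at the cleanup job.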
Searching online, I found that issues had already been filed for this:
https://github.com/istio/istio/issues/19717
https://github.com/kubernetes/kubernetes/issues/67261
The root cause is that an init container exits as soon as it finishes, which means it remains on the node as a stopped container:
    [root@sh-saas-k8s1-node-dev-05 ~]# docker ps -a | grep init
    b23d4bfc0f52  82f719eb65c1  "istio-iptables -p 1…"  22 hours ago  Exited (0) 22 hours ago  k8s_istio-init_public-fe-zhan-node-dev-66dc985977-7n5rx_public-fe-node-dev_949d261c-904c-11ea-8278-5254001a47d3_0
    33036e9212de  82f719eb65c1  "istio-iptables -p 1…"  22 hours ago  Exited (0) 22 hours ago  k8s_istio-init_public-fe-zhan-client-v2-node-dev-cbc9586cc-4wn5v_public-fe-node-dev_76e18eb4-904c-11ea-8278-5254001a47d3_0
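Running docker system prune -a -f removes stopped containers. When kubelet finds that this container has been removed, it starts the init container again. As far as I can tell, this has not been fixed yet.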
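Before running a prune, it can help to see which istio-init containers on the node are exposed to it. The sketch below is an illustration rather than anything from the linked issues: the function name `list_exited_istio_init` is hypothetical, and it assumes the containers carry the `io.kubernetes.container.name=istio-init` label.

```shell
#!/bin/sh
# Hypothetical helper: list exited istio-init containers on this node.
# These are stopped containers, so `docker system prune` would delete them.
list_exited_istio_init() {
    docker ps -a \
        --filter "status=exited" \
        --filter "label=io.kubernetes.container.name=istio-init" \
        --format '{{.ID}} {{.Names}}'
}
```

Each line of output is a container ID followed by its name; an empty result suggests there is nothing on the node for the prune to disturb.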
It can be worked around in either of the following ways, though:
    docker system prune -af --volumes --filter "label!=io.kubernetes.container.name=istio-init"
    # or:
    docker image prune -af