Context
On a Kubernetes cluster installed with CRI-O as the container runtime, I want to check whether restarting the CRI-O systemd service affects running Pods.
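Experiment #1
Procedure
- On the Worker Node, run the following command:
  - sudo systemctl restart crio
- At the same time, on the Worker Node, use the following command to check whether the container's CREATED or STATE value changes:
  - sudo watch crictl ps
Hypothesis
- If the Pod was restarted, the CREATED or STATE value in crictl ps will change.
- If the Pod was not restarted, crictl ps will show no change.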
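The before/after comparison in this hypothesis can also be scripted instead of eyeballed. The following is a minimal sketch and not part of the original experiment: the helper names `container_ids` and `same_containers` are hypothetical, and it assumes the default table output of `crictl ps`, where the first row is a header and the first column is the container ID. It relies on the kubelet replacing a restarted container with a new one, so a restart would show up as a changed ID.

```shell
# Hypothetical helpers (not from the original post).

# Read `crictl ps` table output on stdin and print the sorted container IDs
# (first column), skipping the header row and blank lines.
container_ids() {
  awk 'NR > 1 && NF > 0 { print $1 }' | sort
}

# Succeed (exit 0) when two snapshots list exactly the same container IDs.
same_containers() {
  before_ids=$(printf '%s\n' "$1" | container_ids)
  after_ids=$(printf '%s\n' "$2" | container_ids)
  [ "$before_ids" = "$after_ids" ]
}

# Example usage on the Worker Node (commented out because it restarts CRI-O):
#   before=$(sudo crictl ps)
#   sudo systemctl restart crio
#   sleep 5   # assumption: long enough for the CRI-O socket to come back
#   after=$(sudo crictl ps)
#   same_containers "$before" "$after" && echo "no container was recreated"
```

Comparing IDs rather than the CREATED column avoids false positives, since CREATED is a relative time ("2 weeks ago") that drifts on its own.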
Result
- Before systemctl restart
$ sudo crictl ps | grep calico-node
afb1912fd2085 042163432abcec06b8077b24973b223a5f4cfdb35d85c3816f5d07a13d51afae 2 weeks ago Running calico-node 2 82cc5d7557702 calico-node-dtdxv
- After restart
$ sudo crictl ps | grep calico-node
afb1912fd2085 042163432abcec06b8077b24973b223a5f4cfdb35d85c3816f5d07a13d51afae 2 weeks ago Running calico-node 2 82cc5d7557702 calico-node-dtdxv
Neither the CREATED value nor the STATE value changed.
Conclusion: sudo systemctl restart crio does not restart the Pod!
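Experiment #2
Procedure
- On the Worker Node, run the following command:
  - sudo systemctl restart crio
- At the same time, on the Worker Node, use the following command to check whether the PIDs of the processes running the container change:
  - ps -aux | grep calico-node
Hypothesis
- If the Pod was restarted, the PIDs shown by ps -aux | grep calico-node will change.
- If the Pod was not restarted, ps -aux | grep calico-node will show no change.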
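One wrinkle with this check is that `ps -aux | grep calico-node` also lists the grep process itself, whose PID is different on every run. As a minimal sketch (the helper name `pids_from_ps` is hypothetical and not from the original post), the PID column can be extracted while dropping the grep line. It assumes the standard `ps aux` layout, where field 2 is the PID and field 11 is the start of the command.

```shell
# Hypothetical helper (not from the original post).

# Read `ps aux` lines on stdin and print the PID (field 2) of every line
# whose command (field 11) is not grep itself, sorted numerically.
pids_from_ps() {
  awk '$11 != "grep" { print $2 }' | sort -n
}

# Example usage on the Worker Node (commented out because it restarts CRI-O):
#   before=$(ps aux | grep calico-node | pids_from_ps)
#   sudo systemctl restart crio
#   after=$(ps aux | grep calico-node | pids_from_ps)
#   [ "$before" = "$after" ] && echo "PIDs unchanged"
```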
Result
- Before systemctl restart
$ ps -aux | grep calico-node
root 1838 0.0 0.0 2260 1152 ? Ss Apr09 0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
root 2043 0.6 0.5 2314580 90400 ? Sl Apr09 145:38 calico-node -felix
root 2046 0.0 0.4 2240080 75944 ? Sl Apr09 2:37 calico-node -confd
root 2047 0.0 0.4 1871420 71200 ? Sl Apr09 2:34 calico-node -status-reporter
root 2049 0.0 0.4 1945408 75668 ? Sl Apr09 3:01 calico-node -monitor-addresses
root 2054 0.0 0.4 1871420 78492 ? Sl Apr09 2:33 calico-node -allocate-tunnel-addrs
root 2066 0.0 0.4 1871164 71640 ? Sl Apr09 1:39 calico-node -monitor-token
rocky 1520189 0.0 0.0 6408 2432 pts/0 S+ 16:54 0:00 grep --color=auto calico-node
- After restart
$ ps -aux | grep calico-node
root 1838 0.0 0.0 2260 1152 ? Ss Apr09 0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
root 2043 0.6 0.5 2314580 90528 ? Sl Apr09 145:38 calico-node -felix
root 2046 0.0 0.4 2240080 75944 ? Sl Apr09 2:37 calico-node -confd
root 2047 0.0 0.4 1871420 71200 ? Sl Apr09 2:34 calico-node -status-reporter
root 2049 0.0 0.4 1945408 75668 ? Sl Apr09 3:01 calico-node -monitor-addresses
root 2054 0.0 0.4 1871420 78492 ? Sl Apr09 2:33 calico-node -allocate-tunnel-addrs
root 2066 0.0 0.4 1871164 71640 ? Sl Apr09 1:39 calico-node -monitor-token
rocky 1520314 0.0 0.0 6408 2432 pts/0 S+ 16:55 0:00 grep --color=auto calico-node
The PIDs did not change. (The only PID that differs belongs to the grep process itself.)
Conclusion: sudo systemctl restart crio does not restart the Pod!
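Experiment #3
Procedure
- On the Worker Node, run the following command:
  - sudo systemctl stop crio
- After 10, 20, and 30 seconds, on the Worker Node, use the following command to check whether the PIDs of the processes running the container change:
  - ps -aux | grep calico-node
Hypothesis
After 10, 20, and 30 seconds...
- If the Pod was restarted, the PIDs shown by ps -aux | grep calico-node will change.
- If the Pod was not restarted, ps -aux | grep calico-node will show no change.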
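The timed checks at 10, 20, and 30 seconds can be automated with a small loop. This is only a sketch under stated assumptions: the function name `sample_every` is hypothetical, and the printed elapsed time is nominal (interval times iteration), so it does not account for how long the sampled command itself takes.

```shell
# Hypothetical helper (not from the original post).

# Usage: sample_every <interval-seconds> <count> <command> [args...]
# Sleeps <interval-seconds>, prints a marker with the nominal elapsed time,
# then runs the command; repeats <count> times.
sample_every() {
  interval=$1
  count=$2
  shift 2
  i=1
  while [ "$i" -le "$count" ]; do
    sleep "$interval"
    printf '%s\n' "--- after $((i * interval))s ---"
    "$@"
    i=$((i + 1))
  done
}

# Example usage on the Worker Node (commented out because it stops CRI-O):
#   sudo systemctl stop crio
#   sample_every 10 3 sh -c 'ps aux | grep "[c]alico-node"'
```

The `[c]alico-node` pattern in the usage example is a common trick that keeps the grep process from matching itself.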
Result
- Before systemctl stop
$ ps -aux | grep calico-node
root 1838 0.0 0.0 2260 1152 ? Ss Apr09 0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
root 2043 0.6 0.5 2314580 90528 ? Sl Apr09 145:40 calico-node -felix
root 2046 0.0 0.4 2240080 75944 ? Sl Apr09 2:37 calico-node -confd
root 2047 0.0 0.4 1871420 71200 ? Sl Apr09 2:34 calico-node -status-reporter
root 2049 0.0 0.4 1945408 75668 ? Sl Apr09 3:01 calico-node -monitor-addresses
root 2054 0.0 0.4 1871420 78492 ? Sl Apr09 2:33 calico-node -allocate-tunnel-addrs
root 2066 0.0 0.4 1871164 71640 ? Sl Apr09 1:39 calico-node -monitor-token
rocky 1520614 0.0 0.0 6408 2432 pts/0 S+ 16:59 0:00 grep --color=auto calico-node
- After 10 seconds
root 1838 0.0 0.0 2260 1152 ? Ss Apr09 0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
root 2043 0.6 0.5 2314580 90528 ? Sl Apr09 145:41 calico-node -felix
root 2046 0.0 0.4 2240080 75944 ? Sl Apr09 2:37 calico-node -confd
root 2047 0.0 0.4 1871420 71200 ? Sl Apr09 2:34 calico-node -status-reporter
root 2049 0.0 0.4 1945408 75668 ? Sl Apr09 3:01 calico-node -monitor-addresses
root 2054 0.0 0.4 1871420 78492 ? Sl Apr09 2:33 calico-node -allocate-tunnel-addrs
root 2066 0.0 0.4 1871164 71640 ? Sl Apr09 1:39 calico-node -monitor-token
rocky 1520722 0.0 0.0 6408 2176 pts/0 S+ 17:00 0:00 grep calico-node
- After 20 seconds
root 1838 0.0 0.0 2260 1152 ? Ss Apr09 0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
root 2043 0.6 0.5 2314580 90528 ? Sl Apr09 145:41 calico-node -felix
root 2046 0.0 0.4 2240080 75944 ? Sl Apr09 2:37 calico-node -confd
root 2047 0.0 0.4 1871420 71200 ? Sl Apr09 2:34 calico-node -status-reporter
root 2049 0.0 0.4 1945408 75668 ? Sl Apr09 3:01 calico-node -monitor-addresses
root 2054 0.0 0.4 1871420 78492 ? Sl Apr09 2:33 calico-node -allocate-tunnel-addrs
root 2066 0.0 0.4 1871164 71640 ? Sl Apr09 1:39 calico-node -monitor-token
rocky 1520730 0.0 0.0 6408 2176 pts/0 S+ 17:00 0:00 grep calico-node
- After 30 seconds
root 1838 0.0 0.0 2260 1152 ? Ss Apr09 0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
root 2043 0.6 0.5 2314580 90528 ? Sl Apr09 145:41 calico-node -felix
root 2046 0.0 0.4 2240080 75944 ? Sl Apr09 2:37 calico-node -confd
root 2047 0.0 0.4 1871420 71200 ? Sl Apr09 2:34 calico-node -status-reporter
root 2049 0.0 0.4 1945408 75668 ? Sl Apr09 3:01 calico-node -monitor-addresses
root 2054 0.0 0.4 1871420 78492 ? Sl Apr09 2:33 calico-node -allocate-tunnel-addrs
root 2066 0.0 0.4 1871164 71640 ? Sl Apr09 1:39 calico-node -monitor-token
rocky 1520744 0.0 0.0 6408 2176 pts/0 S+ 17:01 0:00 grep calico-node
- After systemctl restart
$ ps -aux | grep calico-node
root 1838 0.0 0.0 2260 1152 ? Ss Apr09 0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
root 2043 0.6 0.5 2314580 90528 ? Sl Apr09 145:41 calico-node -felix
root 2046 0.0 0.4 2240080 75944 ? Sl Apr09 2:37 calico-node -confd
root 2047 0.0 0.4 1871420 71200 ? Sl Apr09 2:34 calico-node -status-reporter
root 2049 0.0 0.4 1945408 75668 ? Sl Apr09 3:01 calico-node -monitor-addresses
root 2054 0.0 0.4 1871420 78492 ? Sl Apr09 2:33 calico-node -allocate-tunnel-addrs
root 2066 0.0 0.4 1871164 71640 ? Sl Apr09 1:39 calico-node -monitor-token
rocky 1520855 0.0 0.0 6408 2432 pts/0 S+ 17:01 0:00 grep --color=auto calico-node
The PIDs did not change.
Conclusion: neither sudo systemctl stop crio nor the subsequent restart restarts the Pod!
Really? Is there truly zero impact on the service?
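Experiment #4
Procedure
- On the Worker Node, run the following command:
  - sudo systemctl restart crio
- Check whether a terminal session opened with the following command is terminated:
  - $ kubectl exec -it calico-node-dtdxv -n calico-system -- bash
Hypothesis
- If the Pod is affected, the terminal connection will be dropped.
- If the Pod is not affected at all, the terminal connection will stay open.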
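To make the moment of disconnection easier to capture than with an interactive shell, the session can be wrapped in a helper that reports the exit status and duration. This is a minimal sketch, not what the original experiment ran: the function name `run_and_report` is hypothetical, it assumes `date +%s` is supported on the host, and the usage example assumes a `sleep` binary exists in the calico-node image.

```shell
# Hypothetical helper (not from the original post).

# Usage: run_and_report <command> [args...]
# Runs the command in the foreground, then prints its exit status and how many
# whole seconds it ran. Returns the command's own exit status.
run_and_report() {
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "exit=$status elapsed=$((end - start))s"
  return "$status"
}

# Example usage (commented out; run it in one terminal, then restart CRI-O
# on the Worker Node from another terminal and watch when the command returns):
#   run_and_report kubectl exec calico-node-dtdxv -n calico-system -- sleep 600
```

If the session is cut when CRI-O restarts, the command should return early with a non-zero status instead of running for the full 600 seconds.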
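Result
- Before systemctl restart crio
[root@worker-node-01 /]# ...
- After systemctl restart crio
[root@worker-node-01 /]# ...
error: Internal error occurred: Internal error occurred: error executing command in container: context canceled
Conclusion: the Pod does not appear to be completely unaffected. The container processes keep running, but an open kubectl exec session is cut off when CRI-O restarts.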
Retrospective
That was a lot of fun.