kubernetes

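Does restarting CRI-O on a Node affect its Pods?

Context

On a Kubernetes cluster installed with CRI-O as its container runtime, I want to check whether restarting the CRI-O systemd service affects the Pods running on the node.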

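Experiment #1

Procedure

  1. Run the following command on the Worker Node
    • sudo systemctl restart crio
  2. At the same time, on the Worker Node, use the following command to check whether the container's CREATED or STATE value changes
    • sudo watch crictl ps

Hypothesis

  • If the Pod was restarted, the CREATED or STATE value in crictl ps will change.
  • If the Pod was not restarted, crictl ps will show no change.

Results

  • Before systemctl restart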
    $ sudo crictl ps | grep calico-node
    afb1912fd2085       042163432abcec06b8077b24973b223a5f4cfdb35d85c3816f5d07a13d51afae                                  2 weeks ago         Running             calico-node                 2                   82cc5d7557702       calico-node-dtdxv
  • After systemctl restart
    $ sudo crictl ps | grep calico-node
    afb1912fd2085       042163432abcec06b8077b24973b223a5f4cfdb35d85c3816f5d07a13d51afae                                  2 weeks ago         Running             calico-node                 2                   82cc5d7557702       calico-node-dtdxv

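Neither the CREATED value nor the STATE value changed. The container ID (afb1912fd2085) and the ATTEMPT count (2) are identical as well.

Conclusion: sudo systemctl restart crio does not restart the Pod!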
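
The same before/after comparison can be scripted rather than read off watch by eye. Below is a minimal sketch, assuming a POSIX shell with awk and the 8-column crictl ps layout shown above. container_fingerprint is a name made up for illustration, not a crictl feature. It reduces a crictl ps row to container ID, STATE, and ATTEMPT, the values that would differ if the container had been recreated.

```sh
#!/bin/sh
# container_fingerprint NAME: read `crictl ps` output on stdin and print
# "<container-id> <state> <attempt>" for the container called NAME.
# Assumption: the layout is CONTAINER IMAGE CREATED STATE NAME ATTEMPT "POD ID" POD.
# CREATED ("2 weeks ago") contains spaces, so the trailing columns are
# counted from the end of the line instead of from the start.
container_fingerprint() {
  awk -v name="$1" 'NF >= 8 && $(NF-3) == name { print $1, $(NF-4), $(NF-2) }'
}

# Usage on the worker node (not executed here):
#   before=$(sudo crictl ps | container_fingerprint calico-node)
#   sudo systemctl restart crio
#   after=$(sudo crictl ps | container_fingerprint calico-node)
#   [ "$before" = "$after" ] && echo "container unchanged" || echo "container recreated"
```

Comparing the ID and ATTEMPT avoids depending on CREATED, which crictl prints as a relative age.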

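Experiment #2

Procedure

  1. Run the following command on the Worker Node
    • sudo systemctl restart crio
  2. At the same time, on the Worker Node, use the following command to check whether the PIDs of the processes running the container change
    • ps -aux | grep calico-node

Hypothesis

  • If the Pod was restarted, the PIDs shown by ps -aux | grep calico-node will change.
  • If the Pod was not restarted, ps -aux | grep calico-node will show no change.

Results

  • Before systemctl restart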
    $ ps -aux | grep calico-node
    root        1838  0.0  0.0   2260  1152 ?        Ss   Apr09   0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
    root        2043  0.6  0.5 2314580 90400 ?       Sl   Apr09 145:38 calico-node -felix
    root        2046  0.0  0.4 2240080 75944 ?       Sl   Apr09   2:37 calico-node -confd
    root        2047  0.0  0.4 1871420 71200 ?       Sl   Apr09   2:34 calico-node -status-reporter
    root        2049  0.0  0.4 1945408 75668 ?       Sl   Apr09   3:01 calico-node -monitor-addresses
    root        2054  0.0  0.4 1871420 78492 ?       Sl   Apr09   2:33 calico-node -allocate-tunnel-addrs
    root        2066  0.0  0.4 1871164 71640 ?       Sl   Apr09   1:39 calico-node -monitor-token
    rocky    1520189  0.0  0.0   6408  2432 pts/0    S+   16:54   0:00 grep --color=auto calico-node
  • After systemctl restart
    $ ps -aux | grep calico-node
    root        1838  0.0  0.0   2260  1152 ?        Ss   Apr09   0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
    root        2043  0.6  0.5 2314580 90528 ?       Sl   Apr09 145:38 calico-node -felix
    root        2046  0.0  0.4 2240080 75944 ?       Sl   Apr09   2:37 calico-node -confd
    root        2047  0.0  0.4 1871420 71200 ?       Sl   Apr09   2:34 calico-node -status-reporter
    root        2049  0.0  0.4 1945408 75668 ?       Sl   Apr09   3:01 calico-node -monitor-addresses
    root        2054  0.0  0.4 1871420 78492 ?       Sl   Apr09   2:33 calico-node -allocate-tunnel-addrs
    root        2066  0.0  0.4 1871164 71640 ?       Sl   Apr09   1:39 calico-node -monitor-token
    rocky    1520314  0.0  0.0   6408  2432 pts/0    S+   16:55   0:00 grep --color=auto calico-node

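The PIDs did not change: conmon (1838) and every calico-node process (2043 through 2066) kept the same PID and the same Apr09 start date. Only the grep process itself differs.
Conclusion: sudo systemctl restart crio does not restart the Pod!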
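
Comparing two long ps listings by eye is error-prone, so here is a small sketch that boils a listing down to just the PIDs. It assumes a POSIX shell and the standard ps aux column order (COMMAND starts at column 11). pids_for is a hypothetical helper name, not an existing tool.

```sh
#!/bin/sh
# pids_for PATTERN: read `ps aux`-style output on stdin and print, on one line,
# the sorted PIDs of every process whose command line contains PATTERN.
# The grep helper that shows up in the listing is skipped.
pids_for() {
  awk -v pat="$1" '
    {
      cmd = ""
      for (i = 11; i <= NF; i++) cmd = cmd " " $i   # columns 11+ form COMMAND
      if (index(cmd, pat) && $11 != "grep") print $2
    }' | sort -n | paste -s -d ' ' -
}

# Usage on the worker node (not executed here). The snapshot is captured into a
# variable first so that the awk process started by pids_for is not in the listing.
#   snap=$(ps aux); before=$(printf '%s\n' "$snap" | pids_for calico-node)
#   sudo systemctl restart crio
#   snap=$(ps aux); after=$(printf '%s\n' "$snap" | pids_for calico-node)
#   [ "$before" = "$after" ] && echo "same PIDs" || echo "PIDs changed"
```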

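Experiment #3

Procedure

  1. Run the following command on the Worker Node
    • sudo systemctl stop crio
  2. After 10, 20, and 30 seconds, use the following command on the Worker Node to check whether the PIDs of the processes running the container change
    • ps -aux | grep calico-node
  3. Bring CRI-O back up and run the same check once more

Hypothesis

After 10, 20, and 30 seconds...

  • If the Pod was restarted, the PIDs shown by ps -aux | grep calico-node will change.
  • If the Pod was not restarted, ps -aux | grep calico-node will show no change.

Results

  • Before systemctl stop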
    $ ps -aux | grep calico-node
    root        1838  0.0  0.0   2260  1152 ?        Ss   Apr09   0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
    root        2043  0.6  0.5 2314580 90528 ?       Sl   Apr09 145:40 calico-node -felix
    root        2046  0.0  0.4 2240080 75944 ?       Sl   Apr09   2:37 calico-node -confd
    root        2047  0.0  0.4 1871420 71200 ?       Sl   Apr09   2:34 calico-node -status-reporter
    root        2049  0.0  0.4 1945408 75668 ?       Sl   Apr09   3:01 calico-node -monitor-addresses
    root        2054  0.0  0.4 1871420 78492 ?       Sl   Apr09   2:33 calico-node -allocate-tunnel-addrs
    root        2066  0.0  0.4 1871164 71640 ?       Sl   Apr09   1:39 calico-node -monitor-token
    rocky    1520614  0.0  0.0   6408  2432 pts/0    S+   16:59   0:00 grep --color=auto calico-node
  • After 10 seconds
    root        1838  0.0  0.0   2260  1152 ?        Ss   Apr09   0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
    root        2043  0.6  0.5 2314580 90528 ?       Sl   Apr09 145:41 calico-node -felix
    root        2046  0.0  0.4 2240080 75944 ?       Sl   Apr09   2:37 calico-node -confd
    root        2047  0.0  0.4 1871420 71200 ?       Sl   Apr09   2:34 calico-node -status-reporter
    root        2049  0.0  0.4 1945408 75668 ?       Sl   Apr09   3:01 calico-node -monitor-addresses
    root        2054  0.0  0.4 1871420 78492 ?       Sl   Apr09   2:33 calico-node -allocate-tunnel-addrs
    root        2066  0.0  0.4 1871164 71640 ?       Sl   Apr09   1:39 calico-node -monitor-token
    rocky    1520722  0.0  0.0   6408  2176 pts/0    S+   17:00   0:00 grep calico-node
  • After 20 seconds
    root        1838  0.0  0.0   2260  1152 ?        Ss   Apr09   0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
    root        2043  0.6  0.5 2314580 90528 ?       Sl   Apr09 145:41 calico-node -felix
    root        2046  0.0  0.4 2240080 75944 ?       Sl   Apr09   2:37 calico-node -confd
    root        2047  0.0  0.4 1871420 71200 ?       Sl   Apr09   2:34 calico-node -status-reporter
    root        2049  0.0  0.4 1945408 75668 ?       Sl   Apr09   3:01 calico-node -monitor-addresses
    root        2054  0.0  0.4 1871420 78492 ?       Sl   Apr09   2:33 calico-node -allocate-tunnel-addrs
    root        2066  0.0  0.4 1871164 71640 ?       Sl   Apr09   1:39 calico-node -monitor-token
    rocky    1520730  0.0  0.0   6408  2176 pts/0    S+   17:00   0:00 grep calico-node
  • After 30 seconds
    root        1838  0.0  0.0   2260  1152 ?        Ss   Apr09   0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
    root        2043  0.6  0.5 2314580 90528 ?       Sl   Apr09 145:41 calico-node -felix
    root        2046  0.0  0.4 2240080 75944 ?       Sl   Apr09   2:37 calico-node -confd
    root        2047  0.0  0.4 1871420 71200 ?       Sl   Apr09   2:34 calico-node -status-reporter
    root        2049  0.0  0.4 1945408 75668 ?       Sl   Apr09   3:01 calico-node -monitor-addresses
    root        2054  0.0  0.4 1871420 78492 ?       Sl   Apr09   2:33 calico-node -allocate-tunnel-addrs
    root        2066  0.0  0.4 1871164 71640 ?       Sl   Apr09   1:39 calico-node -monitor-token
    rocky    1520744  0.0  0.0   6408  2176 pts/0    S+   17:01   0:00 grep calico-node
  • After systemctl restart
    $ ps -aux | grep calico-node
    root        1838  0.0  0.0   2260  1152 ?        Ss   Apr09   0:01 /usr/libexec/crio/conmon -b /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -c afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f --exit-dir /var/run/crio/exits -l /var/log/pods/calico-system_calico-node-dtdxv_f64f12c5-b2fe-4591-bc01-2f1872e31db2/calico-node/2.log --log-level info -n k8s_calico-node_calico-node-dtdxv_calico-system_f64f12c5-b2fe-4591-bc01-2f1872e31db2_2 -P /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/conmon-pidfile -p /run/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata/pidfile --persist-dir /var/lib/containers/storage/overlay-containers/afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f/userdata -r /usr/libexec/crio/crun --runtime-arg --root=/run/crun --socket-dir-path /var/run/crio --syslog -u afb1912fd2085a64dd5606e530fe44fb4a7ae56f6d90f3044bbc799aa2e3736f -s
    root        2043  0.6  0.5 2314580 90528 ?       Sl   Apr09 145:41 calico-node -felix
    root        2046  0.0  0.4 2240080 75944 ?       Sl   Apr09   2:37 calico-node -confd
    root        2047  0.0  0.4 1871420 71200 ?       Sl   Apr09   2:34 calico-node -status-reporter
    root        2049  0.0  0.4 1945408 75668 ?       Sl   Apr09   3:01 calico-node -monitor-addresses
    root        2054  0.0  0.4 1871420 78492 ?       Sl   Apr09   2:33 calico-node -allocate-tunnel-addrs
    root        2066  0.0  0.4 1871164 71640 ?       Sl   Apr09   1:39 calico-node -monitor-token
    rocky    1520855  0.0  0.0   6408  2432 pts/0    S+   17:01   0:00 grep --color=auto calico-node

The PIDs did not change at any point.
Conclusion: neither sudo systemctl stop crio nor the subsequent restart restarts the Pod!
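
The 10/20/30-second checks can be automated with a small polling loop. This is only a sketch, assuming a POSIX shell. watch_for_change is a hypothetical helper, and the pgrep -x calico-node call in the usage comment assumes pgrep is installed on the node.

```sh
#!/bin/sh
# watch_for_change INTERVAL COUNT COMMAND...: run COMMAND once as a baseline,
# then COUNT more times, INTERVAL seconds apart. Prints one line per sample and
# exits with status 1 if any sample differs from the baseline, 0 otherwise.
# The body runs in a subshell so its variables do not leak into the caller.
watch_for_change() (
  interval=$1; count=$2; shift 2
  baseline=$("$@")
  changed=0; i=1
  while [ "$i" -le "$count" ]; do
    sleep "$interval"
    current=$("$@")
    if [ "$current" = "$baseline" ]; then
      echo "sample $i: unchanged"
    else
      echo "sample $i: CHANGED"
      changed=1
    fi
    i=$((i + 1))
  done
  exit "$changed"
)

# Usage on the worker node (not executed here): start the watcher in one
# terminal, then stop CRI-O from a second terminal right away.
#   watch_for_change 10 3 pgrep -x calico-node
#   sudo systemctl stop crio
```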

Really? Is there truly zero impact on the service?

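Experiment #4

Procedure

  1. Run the following command on the Worker Node
    • sudo systemctl restart crio
  2. Check whether a terminal session opened beforehand with the following command gets terminated
    • $ kubectl exec -it calico-node-dtdxv -n calico-system -- bash

Hypothesis

  • If the Pod is affected, the terminal connection will drop.
  • If the Pod is not affected at all, the terminal connection will stay open.

Results

  • Before systemctl restart crio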

    [root@worker-node-01 /]# ...
  • After systemctl restart crio

    [root@worker-node-01 /]# ...
    error: Internal error occurred: Internal error occurred: error executing command in container: context canceled

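Conclusion: the Pod does not appear to be entirely unaffected. The container processes keep running, but the kubectl exec session was cut, most likely because exec sessions are streamed through CRI-O itself.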
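
To put a rough number on that impact, one option is to keep issuing short kubectl exec calls while CRI-O restarts and count how many of them fail. A sketch under assumptions: count_failures is a hypothetical helper, and the usage comment assumes the container image ships a true binary.

```sh
#!/bin/sh
# count_failures COUNT INTERVAL COMMAND...: run COMMAND COUNT times, INTERVAL
# seconds apart, and print how many of the runs exited with a non-zero status.
# The body runs in a subshell so its variables do not leak into the caller.
count_failures() (
  count=$1; interval=$2; shift 2
  failures=0; i=0
  while [ "$i" -lt "$count" ]; do
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
    sleep "$interval"
  done
  echo "$failures"
)

# Usage (not executed here): start the loop, then restart CRI-O on the worker
# node while it is running.
#   count_failures 30 1 kubectl exec calico-node-dtdxv -n calico-system -- true
```

A result of 0 would suggest new exec calls were never rejected during the window; anything higher gives a rough count of the seconds during which exec was unavailable.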

Retrospective

That was a lot of fun.