Bug #2008644 “error from create_kubeconfig when kubectl snap is …” : Bugs : Kubernetes Control Plane Charm

After a recent “juju config k-c-p channel=1.27/edge”, I noticed an error in my non-leader k-c-p unit (log with context attached):

——
2023-02-17 23:07:23 INFO unit.kubernetes-control-plane/0.juju-log server.go:316 Writing kubeconfig file.
2023-02-17 23:07:23 ERROR unit.kubernetes-control-plane/0.juju-log server.go:316 Hook error:

FileNotFoundError: [Errno 2] No such file or directory: 'kubectl'
——

It looks like we were trying to build a kubeconfig in setup_non_leader_authentication when the kubectl snap was being refreshed. I say that because kubectl refreshed at the same time as the error on that system (23:07); from “snap info kubectl”:

——
commands:
  - kubectl
snap-id: ZgG2URycDgvxSVskfoZxn44uaRMw0iwe
tracking: 1.27/edge
refresh-date: 9 days ago, at 23:07 UTC
——

This is also the same time that snapd reported no snap updates were available:

——
/var/log/syslog.2.gz:Feb 17 23:07:02 juju-089b1c-5 systemd[2398569]: Listening on REST API socket for snapd user session agent.
/var/log/syslog.2.gz:Feb 17 23:07:26 juju-089b1c-5 snapd[38224]: storehelpers.go:769: cannot refresh snap "kubectl": snap has no updates available
/var/log/syslog.2.gz:Feb 17 23:08:46 juju-089b1c-5 systemd[2398569]: Closed REST API socket for snapd user session agent.
——

It appears that, even though there was no update, kubectl was unavailable at 23:07:23. Perhaps the ‘current’ link or snap mount goes away briefly during a refresh?

We should find a way to make kubectl calls from k-c-p more resilient.
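
One possible direction, as a minimal sketch only: retry the call when the kubectl executable is briefly missing. This assumes the failure is transient and that the charm invokes kubectl through Python's subprocess module (which the FileNotFoundError suggests). The helper name run_with_retry and its parameters are hypothetical and not part of the charm.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=5, delay=2.0, sleep=time.sleep):
    """Run cmd, retrying if the executable cannot be found.

    Hypothetical helper (not charm code). It only retries on
    FileNotFoundError, i.e. when the binary itself is missing, which is
    the symptom seen in this bug. Other failures propagate immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            # check=True raises CalledProcessError on a non-zero exit code.
            return subprocess.run(
                cmd, check=True, capture_output=True, text=True
            )
        except FileNotFoundError:
            # Give up after the final attempt and surface the original error.
            if attempt == attempts:
                raise
            sleep(delay)
```

A call such as run_with_retry(["kubectl", "version", "--client"]) would then tolerate a short window in which the snap is unavailable, while still failing loudly if kubectl never comes back. The attempt count and delay are guesses and would need tuning against how long the snap actually disappears.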
