How to fix an expired certificate in RancherOS
Problem
The Rancher web UI became inaccessible. The logs show that the certificate has expired:
time="2021-12-29T08:27:32.616638402Z" level=info msg="Waiting for master node startup: resource name may not be empty"
2021-12-29 08:27:32.985756 I | http: TLS handshake error from 127.0.0.1:35568: remote error: tls: bad certificate
time="2021-12-29T08:27:32.987398748Z" level=error msg="server https://127.0.0.1:6443/cacerts is not trusted: Get https://127.0.0.1:6443/cacerts: x509: certificate has expired or is not yet valid"
2021-12-29 08:27:32.987447 I | http: TLS handshake error from 127.0.0.1:35570: remote error: tls: bad certificate
time="2021-12-29T08:27:33.620623487Z" level=info msg="Waiting for master node startup: resource name may not be empty"
2021/12/29 08:27:34 [INFO] Waiting for server to become available: Get "https://127.0.0.1:6443/version?timeout=15m0s": x509: certificate has expired or is not yet valid: current time 2021-12-29T08:27:34Z is after 2021-12-28T06:35:39
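When the log is long, it can help to pull out just the expiry timestamp that the x509 error reports. The following is a minimal sketch, not part of the original procedure: `extract_cert_expiry` is a hypothetical helper name, and it assumes the error message has the same "current time ... is after ..." wording as the log shown here.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not a Rancher tool).
# Reads log text on stdin and prints each distinct timestamp that
# follows "is after" in an x509 expiry message.
extract_cert_expiry() {
    sed -n 's/.*certificate has expired or is not yet valid: current time [^ ]* is after \([0-9T:Z-]*\).*/\1/p' | sort -u
}

# Sample line taken from the log in this article.
printf '%s\n' '2021/12/29 08:27:34 [INFO] Waiting for server to become available: Get "https://127.0.0.1:6443/version?timeout=15m0s": x509: certificate has expired or is not yet valid: current time 2021-12-29T08:27:34Z is after 2021-12-28T06:35:39' | extract_cert_expiry
```

On the server itself, the same function could be fed from the container logs, for example `docker logs rancher 2>&1 | extract_cert_expiry`, assuming the container is named `rancher` as in the steps below.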
Solution
- SSH into the RancherOS server on AWS
$ ssh -i /path/to/secure/key.pem rancher@<rancher-domain>
- Check that the date and time on the server are correct
[rancher@<rancher-domain> ~]$ date
Fri Feb 11 17:32:28 UTC 2022
- If the date shown is incorrect, set the correct date and time with the following command
[rancher@<rancher-domain> ~]$ sudo date --set="2022-02-11 17:32:28"
- Ensure that the Rancher Docker container is running
[rancher@<rancher-domain> ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
480a9e0df709 rancher/rancher:latest "entrypoint.sh" 1 month ago Up 6 hours 0.0.0.0:80->80/tcp, 0.0.0.0:443->443/tcp rancher
- Log in to the rancher container (using the container name or ID shown in the docker ps output above)
[rancher@<rancher-domain> ~]$ docker exec -it rancher /bin/sh
- Delete the old certificates by running the following commands
sh-4.4# kubectl --insecure-skip-tls-verify -n kube-system delete secrets k3s-serving
sh-4.4# kubectl --insecure-skip-tls-verify delete secret serving-cert -n cattle-system
sh-4.4# rm -f /var/lib/rancher/k3s/server/tls/dynamic-cert.json
- Refresh the Rancher certificates
sh-4.4# curl --insecure -sfL https://localhost:8443/v3
- Exit the docker container
sh-4.4# exit
exit
- Restart the docker container
[rancher@<rancher-domain> ~]$ docker restart rancher
Once the container has restarted, the web interface should become available again.
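Rancher can take a little while to come back after the restart, so a small polling loop can save repeated manual checks. The following is a minimal sketch under stated assumptions: `wait_until` is a hypothetical helper name, and the URL, attempt count, and delay in the usage comment are placeholders to adapt.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it succeeds
# or the attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Discard the command's own output; only its exit status matters.
        if "$@" >/dev/null 2>&1; then
            echo "ready after $i attempt(s)"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "still unavailable after $attempts attempt(s)" >&2
    return 1
}

# Example (replace the placeholder with your Rancher host name):
#   wait_until 30 10 curl --insecure -sf "https://<rancher-domain>/"
```

The `--insecure` flag mirrors the curl call used in the steps above. Once a trusted certificate is being served, it can be dropped so that the check also validates the certificate.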