I ran into a kubelet error on our Kubernetes cluster. The error message looks like this:
skipping pod synchronization - [PLEG is not healthy: pleg was last seen active 3m6.527452257s ago; threshold is 3m0s]
If you use Rancher, you can also see the message in the Rancher UI.
data:image/s3,"s3://crabby-images/4417e/4417e807d0f96e49b886b3c700e48182d3b6737e" alt="how to fix PLEG problem in Kubernetes 1 image 1"
Solution
According to the IBM documentation, this issue is caused by slow interaction between the kubelet and Docker. The suggested solution is to increase the housekeeping interval. The kubelet evaluates eviction thresholds based on its configured housekeeping-interval, which defaults to 10s.
Step
We use RKE to deploy Kubernetes, so all components run as containers. Use docker inspect kubelet to see the kubelet configuration. In the screenshot below, there is no housekeeping-interval argument.
If your kubelet runs as a systemd service instead, you can modify /etc/systemd/system/kubelet.service
data:image/s3,"s3://crabby-images/16adb/16adb0fdc55897ea49cb61cde1bb627a1669917b" alt="how to fix PLEG problem in Kubernetes 2 image"
To update the kubelet, add the housekeeping-interval argument to the kubelet service section of cluster.yml
data:image/s3,"s3://crabby-images/536a8/536a833f9901794750c1149ca9687bbfb9082221" alt="how to fix PLEG problem in Kubernetes 3 image 2"
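For readers who cannot see the screenshot, the change might look roughly like the fragment below. This is a sketch, not a copy of the original file: it assumes the standard RKE layout where kubelet flags go under services.kubelet.extra_args, and the value 30s is only an illustrative choice larger than the 10s default.

```yaml
# cluster.yml (fragment); merge into your existing services section
services:
  kubelet:
    extra_args:
      # Assumed example value; choose an interval that suits your nodes
      housekeeping-interval: "30s"
```

RKE passes each extra_args key to the kubelet as a command-line flag, so the key is written without the leading dashes.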
To apply the change, run the following command:
rke up --config cluster.yml
data:image/s3,"s3://crabby-images/79968/79968de0a20240f9d540ddeb99bebce827851d99" alt="how to fix PLEG problem in Kubernetes 4 image 3"
Verify
Run docker inspect kubelet again. After the update, the housekeeping-interval argument is present.
data:image/s3,"s3://crabby-images/c6353/c6353c2b20967f5a32425e7b45956db4574ba958" alt="how to fix PLEG problem in Kubernetes 5 image 4"
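If you prefer not to scan the full docker inspect output by eye, the check can be narrowed to the one flag. The snippet below is a minimal sketch: find_housekeeping_flag is a hypothetical helper name, and it assumes the container is called kubelet, as on our RKE nodes.

```shell
# Report whether the housekeeping flag appears in text read from stdin.
# find_housekeeping_flag is a hypothetical helper name, not part of any tool.
find_housekeeping_flag() {
  grep -- 'housekeeping-interval' || echo "housekeeping-interval not set"
}

# Assumes the kubelet container is named "kubelet", as on an RKE node.
# Skipped quietly when docker is not installed on this machine.
if command -v docker >/dev/null 2>&1; then
  docker inspect kubelet 2>/dev/null | find_housekeeping_flag
fi
```

When the flag is missing, the kubelet falls back to its default housekeeping interval.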
Conclusion
“PLEG is not healthy” can have various causes, and there are likely many that I have not run into yet. This post covers one possible fix.