Fix issue #899: Pod status out of sync after being marked as not ready by controller manager

As described in the issue, if the following sequence happens, we fail to properly
update the pod status in the API server:

1. Create pod in k8s
2. Provider creates the pod and syncs its status back
3. Pod in k8s ready/running, all fine
4. Virtual kubelet fails to update the node status for some time, for whatever reason (e.g. network connectivity issues)
5. The virtual node is marked NotReady with the message: "Kubelet stopped posting node status"
6. kube-controller-manager then marks all pods on the node as Ready = false
7. Virtual kubelet never syncs the pod status from the provider back to k8s, because the status it last received from the provider has not changed, so the update is skipped
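A minimal sketch of the idea behind the fix (illustrative Go only, not the actual virtual-kubelet code: knownPod here is a simplified stand-in, and plain strings replace the real corev1.PodStatus comparison): instead of silently returning when the provider status already matches what Kubernetes has, the controller records that it skipped the write, so a later change made by kube-controller-manager can still trigger a resync.

package main

import (
	"fmt"
	"sync"
)

// knownPod mirrors the controller's per-pod bookkeeping: the last status
// received from the provider plus a flag recording that the last update
// was skipped because the statuses already matched. (Simplified stand-in.)
type knownPod struct {
	sync.Mutex
	lastStatusFromProvider     string
	lastPodStatusUpdateSkipped bool
}

// updatePodStatus decides whether the provider status needs to be written
// back to the API server. Before the fix, a skipped update left no trace,
// so a later change such as Ready=false set by kube-controller-manager was
// never corrected. Recording the skip lets the controller resync later.
func updatePodStatus(kp *knownPod, statusInKubernetes string) {
	kp.Lock()
	providerStatus := kp.lastStatusFromProvider
	if providerStatus == statusInKubernetes {
		// Nothing to write, but remember that we skipped so a later
		// divergence can be detected and replayed.
		kp.lastPodStatusUpdateSkipped = true
		kp.Unlock()
		return
	}
	kp.lastPodStatusUpdateSkipped = false
	kp.Unlock()
	fmt.Printf("syncing provider status %q to the API server (was %q)\n",
		providerStatus, statusInKubernetes)
}

func main() {
	kp := &knownPod{lastStatusFromProvider: "Running/Ready"}

	// 1. Provider and Kubernetes agree: the update is skipped, and the skip is recorded.
	updatePodStatus(kp, "Running/Ready")
	fmt.Println("update skipped:", kp.lastPodStatusUpdateSkipped)

	// 2. kube-controller-manager marks the pod NotReady while the node is down.
	//    Because the previous skip was recorded, the controller knows it must
	//    push the provider's status again instead of staying silent.
	updatePodStatus(kp, "Running/NotReady")
	fmt.Println("update skipped:", kp.lastPodStatusUpdateSkipped)
}

The diff below shows the corresponding change in updatePodStatus: the skip is now recorded on the knownPod rather than returning without a trace.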
Author: Sargun Dhillon
Date:   2020-12-07 16:50:00 -08:00
parent 0d1f6f1625
commit de7f7dd173
2 changed files with 30 additions and 3 deletions


@@ -214,11 +214,15 @@ func (pc *PodController) updatePodStatus(ctx context.Context, podFromKubernetes
 	}
 	kPod := obj.(*knownPod)
 	kPod.Lock()
 	podFromProvider := kPod.lastPodStatusReceivedFromProvider.DeepCopy()
-	kPod.Unlock()
 	if cmp.Equal(podFromKubernetes.Status, podFromProvider.Status) && podFromProvider.DeletionTimestamp == nil {
+		kPod.lastPodStatusUpdateSkipped = true
+		kPod.Unlock()
 		return nil
 	}
+	kPod.lastPodStatusUpdateSkipped = false
+	kPod.Unlock()
 	// Pod deleted by provider due some reasons. e.g. a K8s provider, pod created by deployment would be evicted when node is not ready.
 	// If we do not delete pod in K8s, deployment would not create a new one.
 	if podFromProvider.DeletionTimestamp != nil && podFromKubernetes.DeletionTimestamp == nil {