This refactors the v1 lease controller. It makes two functional differences
to the lease controller:
* It no longer ties lease updates to node pings or node status updates
* There is no fallback mechanism to status updates
This also moves vk_envtest, allowing for future brown-box testing of the
lease controller with envtest
There were some (additional) bugs that were easy-ish to introduce
by interleaving the provider provided node, and the server provided
updated node. This removes the chance of that confusion.
This moves the job of pinging the node provider into its own
goroutine. If it takes a long time, it shouldn't slow down
leases, and vice-versa.
It also adds timeouts for node pings. One of the problems
is that we don't know how long a node ping will take --
there could be a bunch of network calls underneath us.
The point of the lease is to say whether or not the
Kubelet is unreachable, not whether or not the node
pings are "passing".
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This sets / updates the node lease owner reference to the current
node. Previously, we did not set this, which had the interesting
problem of leaking node leases on clusters with node churn.
As described in the patch itself, there is a case that if a node is updated out of
band (e.g. node-problem-detector (https://github.com/kubernetes/node-problem-detector)),
we will overwrite the patch in our typicaly two-way strategic patch for node status
updates.
The reason why the standard kubelet can do this is because the flow goes:
apiserver->kubelet: Fetch current node
kubelet->kubelet: Update apiserver's snapshot with local state changes
kubelet->apiserver: patch
We don't have this luxury, as we rely on providers making a callback into us
in order to get the most recent pod status. They do not have a way
to do that merge operation themselves, and a two-way merge doesn't
give us enough metadata.
In order to work around this, we perform a three-way merge on behalf of
the user. We do this by stashing the contents of the last update inside
of it. We then fetch that status back, and use that for the future
update itself.
In the upgrade case, or the case where the VK has been created by
"someone else", we do not know which attributes were created by
or written by us, so we cannot generate a three way patch.
In this case, we will do our best to avoid deleting any attributes,
and only overwrite them. We will consider all current api server
values written by "someone else", and not edit them. This is done
by considering the "old node" to be empty.
* Upgrade k8s.io/api
go get k8s.io/api@kubernetes-1.15.2
* Upgrade k8s.io/apimachinery
go get k8s.io/apimachinery@kubernetes-1.15.2
* Upgrade kubernetes-1.15.2
go get k8s.io/client-go@kubernetes-1.15.2
* Upgrade kk8s.io/kubernetes to v1.15.2
go get k8s.io/kubernetes@v1.15.2
This also locks the the dependency for
github.com/prometheus/client_golang/prometheus due to a golang bug, and to
please the validation scripts.
The replaces were generated by:
go get k8s.io/kubernetes@v1.15.2 2> fail
for i in $(cat fail|grep unknown|cut -f1 -d@|cut -f2 -d" ")
do echo "replace ${i} => ${i} kubernetes-1.15.2"
done
* Move tracing exporter registration
This doesn't belong in the library and should be configured by the
consumer of the opencensus package.
* Rename `vkublet` package to `node`
`vkubelet` does not convey any information to the consumers of the
package.
Really it would be nice to move this package to the root of the repo,
but then you wind up with... interesting... import semantics due to the
repo name... and after thinking about it some, a subpackage is really
not so bad as long as it has a name that convey's some information.
`node` was chosen since this package deals with all the semantics of
operating a node in Kubernetes.