This adds a controller that handles startup for the node and pod
controllers.
Later, if we add an "api controller", it can also be added here.
This is just part of reducing some of the boilerplate code so it is
easier to get off of node-cli.
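As a rough illustration of the pattern (none of these names are the
actual node-cli API, and the real controllers take extra arguments):

```go
package node

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// controller is an assumed shape: the node and pod controllers each
// expose a blocking Run method along these lines.
type controller interface {
	Run(ctx context.Context) error
}

// runControllers starts each controller in its own goroutine and
// returns the first error, cancelling the others -- the boilerplate
// this change is meant to absorb.
func runControllers(ctx context.Context, controllers ...controller) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, c := range controllers {
		c := c
		g.Go(func() error { return c.Run(ctx) })
	}
	return g.Wait()
}
```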
This refactors the v1 lease controller. It makes two functional
changes to the lease controller (sketched after the list):
* It no longer ties lease updates to node pings or node status updates
* There is no longer a fallback mechanism to node status updates
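A sketch of the decoupled renew loop, assuming a client-go
coordination/v1 client (the function name and error handling here are
illustrative):

```go
import (
	"context"
	"time"

	coordinationv1 "k8s.io/api/coordination/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	coordclientv1 "k8s.io/client-go/kubernetes/typed/coordination/v1"
)

// renewLoop renews the lease on its own ticker, independent of node
// pings and node status updates.
func renewLoop(ctx context.Context, leases coordclientv1.LeaseInterface, lease *coordinationv1.Lease, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			lease.Spec.RenewTime = &metav1.MicroTime{Time: time.Now()}
			updated, err := leases.Update(ctx, lease, metav1.UpdateOptions{})
			if err != nil {
				continue // conflict handling is addressed separately below
			}
			lease = updated
		}
	}
}
```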
This also moves vk_envtest, allowing for future brown-box testing of
the lease controller with envtest.
There were some (additional) bugs that were easy-ish to introduce by
interleaving the provider-provided node and the server-provided
updated node. This removes the chance of that confusion.
This allows the use of a built-in provider to do things like mark a
node as ready once all the controllers are spun up.
The e2e tests now use this instead of waiting for the pod that the VK
provider is deployed in to be marked ready (which waited on
/stats/summary to be serving, and was racy).
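The e2e wait can then look roughly like this (helper name and
timeouts are illustrative):

```go
import (
	"context"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	corev1client "k8s.io/client-go/kubernetes/typed/core/v1"
)

// waitForNodeReady polls the virtual node's own conditions rather
// than the readiness of the pod the provider runs in.
func waitForNodeReady(ctx context.Context, nodes corev1client.NodeInterface, name string) error {
	return wait.PollUntilContextTimeout(ctx, time.Second, 2*time.Minute, true,
		func(ctx context.Context) (bool, error) {
			node, err := nodes.Get(ctx, name, metav1.GetOptions{})
			if err != nil {
				return false, nil // tolerate transient errors; keep polling
			}
			for _, c := range node.Status.Conditions {
				if c.Type == corev1.NodeReady {
					return c.Status == corev1.ConditionTrue, nil
				}
			}
			return false, nil
		})
}
```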
This fixes a small logic bug in the leases code that checks whether
owner references are set correctly, and makes it so that we properly
log when owner references are set, but not set to the node that
is "us".
Change the place where we set the defaults for the node ping and node
status intervals. The problem manifested itself as the node ping
interval being 0 when it was left at the default.
This makes two changes (see the sketch after the list):
1. Invalid ping values and ping timeouts will prevent VK from
   starting up.
2. We set the default values very early on in the creation of the
   node controller -- where all the other values are set.
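A sketch of what setting the defaults early looks like (the type,
names, and default values here are assumptions):

```go
import (
	"fmt"
	"time"
)

type nodeController struct {
	pingInterval   time.Duration
	statusInterval time.Duration
}

// newNodeController applies defaults at construction time, alongside
// every other field, and refuses to start on invalid values.
func newNodeController(pingInterval, statusInterval time.Duration) (*nodeController, error) {
	if pingInterval == 0 {
		pingInterval = 10 * time.Second // assumed default
	}
	if statusInterval == 0 {
		statusInterval = time.Minute // assumed default
	}
	if pingInterval < 0 || statusInterval < 0 {
		return nil, fmt.Errorf("ping and status intervals must be positive")
	}
	return &nodeController{pingInterval: pingInterval, statusInterval: statusInterval}, nil
}
```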
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This takes a somewhat ham-fisted approach to dealing with lease
conflicts. These can happen if "someone" changes the lease underneath
us. Again, this should happen rarely, but it can happen (and does
happen in production systems).
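One way to picture that handling: treat a 409 as "their copy wins",
refetch, reapply our renew time, and retry. A sketch with client-go's
conflict retry helper (the exact mechanics in the patch may differ):

```go
import (
	"context"
	"time"

	coordinationv1 "k8s.io/api/coordination/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	coordclientv1 "k8s.io/client-go/kubernetes/typed/coordination/v1"
	"k8s.io/client-go/util/retry"
)

func renewLease(ctx context.Context, leases coordclientv1.LeaseInterface, lease *coordinationv1.Lease) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		_, err := leases.Update(ctx, lease, metav1.UpdateOptions{})
		if !apierrors.IsConflict(err) {
			return err // nil on success, fatal on any other error
		}
		// "Someone" changed the lease underneath us: take their copy,
		// reapply our renew time, and let RetryOnConflict go again.
		fresh, getErr := leases.Get(ctx, lease.Name, metav1.GetOptions{})
		if getErr != nil {
			return getErr
		}
		fresh.Spec.RenewTime = &metav1.MicroTime{Time: time.Now()}
		lease = fresh
		return err
	})
}
```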
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This moves the job of pinging the node provider into its own
goroutine. If a ping takes a long time, it shouldn't slow down
leases, and vice versa.
It also adds timeouts for node pings. One of the problems is that we
don't know how long a node ping will take -- there could be a bunch
of network calls underneath us.
The point of the lease is to say whether or not the Kubelet is
reachable, not whether or not the node pings are "passing".
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This sets/updates the node lease owner reference to point at the
current node. Previously, we did not set this, which had the
interesting problem of leaking node leases on clusters with node churn.
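In essence (a sketch; the real code may also handle amending an
existing reference):

```go
import (
	coordinationv1 "k8s.io/api/coordination/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// setLeaseOwner points the lease at the current node, so that
// deleting the node garbage-collects its lease instead of leaking it.
func setLeaseOwner(lease *coordinationv1.Lease, node *corev1.Node) {
	lease.OwnerReferences = []metav1.OwnerReference{{
		APIVersion: corev1.SchemeGroupVersion.String(), // "v1"
		Kind:       "Node",
		Name:       node.Name,
		UID:        node.UID,
	}}
}
```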
As described in the patch itself, there is a case where, if a node is
updated out of band (e.g. by node-problem-detector
(https://github.com/kubernetes/node-problem-detector)), we will
overwrite that update with our typical two-way strategic patch for
node status updates.
The reason the standard kubelet can do this is that the flow goes:
apiserver->kubelet: Fetch current node
kubelet->kubelet: Update apiserver's snapshot with local state changes
kubelet->apiserver: patch
We don't have this luxury, as we rely on providers making a callback
into us in order to get the most recent node status. They do not have
a way to do that merge operation themselves, and a two-way merge
doesn't give us enough metadata.
In order to work around this, we perform a three-way merge on behalf
of the user. We do this by stashing the contents of the last update
inside the node object itself. We then fetch that stashed status
back, and use it as the base for the next update.
In the upgrade case, or the case where the VK node has been created by
"someone else", we do not know which attributes were created or
written by us, so we cannot generate a three-way patch. In this case,
we do our best to avoid deleting any attributes, and only overwrite
them: we consider all current api server values to have been written
by "someone else", and do not remove them. This is done by considering
the "old node" to be empty.
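Both cases reduce to the same call, with the "old node" swapped out.
A sketch using apimachinery's strategic merge patch helper (the
function name and wiring are illustrative):

```go
import (
	"encoding/json"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/strategicpatch"
)

// threeWayNodePatch builds the patch described above. oldNode is the
// stashed copy of our last update; in the upgrade/"someone else" case
// it is an empty &corev1.Node{}, so nothing already on the api server
// gets deleted.
func threeWayNodePatch(oldNode, apiServerNode, providerNode *corev1.Node) ([]byte, error) {
	oldJSON, err := json.Marshal(oldNode)
	if err != nil {
		return nil, err
	}
	newJSON, err := json.Marshal(providerNode)
	if err != nil {
		return nil, err
	}
	currentJSON, err := json.Marshal(apiServerNode)
	if err != nil {
		return nil, err
	}
	// overwrite=true: our values may overwrite fields, but fields that
	// are absent from the (possibly empty) old node are never deleted.
	return strategicpatch.CreateThreeWayMergePatch(oldJSON, newJSON, currentJSON, corev1.Node{}, true)
}
```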
There's a (somewhat) common case where the node status update loop is
busy while a provider is trying to send a node status update. Right
now, we block the provider from creating a notification in this case.
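One pattern that removes the blocking (an illustration of the idea,
not necessarily the exact mechanism in the patch): a size-1 buffered
channel where a new notification replaces any stale pending one.

```go
import corev1 "k8s.io/api/core/v1"

// statusCh holds at most one pending status; the node status update
// loop receives from it.
var statusCh = make(chan *corev1.Node, 1)

// notifyNodeStatus never blocks: if the update loop is busy and a
// status is already pending, the stale one is dropped in favor of the
// newest.
func notifyNodeStatus(n *corev1.Node) {
	for {
		select {
		case statusCh <- n:
			return
		default:
			select {
			case <-statusCh: // drop the stale pending status
			default:
			}
		}
	}
}
```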
If the ping timer is being used, it should be reset with the ping
update interval. If the status update interval is used instead, Ping
stops being called for long enough that Kubernetes marks the node as
NotReady.
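The essence of the fix (names assumed):

```go
import (
	"context"
	"time"
)

func pingOnTimer(ctx context.Context, pingInterval time.Duration, doPing func(context.Context)) {
	timer := time.NewTimer(pingInterval)
	defer timer.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-timer.C:
			doPing(ctx)
			// Reset with the ping interval; resetting with the
			// (longer) status update interval starves pings until
			// Kubernetes marks the node NotReady.
			timer.Reset(pingInterval)
		}
	}
}
```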
* Move tracing exporter registration
This doesn't belong in the library and should be configured by the
consumer of the opencensus package.
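Consumer-side registration then looks something like this (shown with
the OpenCensus Jaeger exporter purely as an example; any
trace.Exporter works):

```go
import (
	"contrib.go.opencensus.io/exporter/jaeger"
	"go.opencensus.io/trace"
)

// initTracing wires up an exporter in the consumer's own main
// package, instead of inside the library.
func initTracing() error {
	exporter, err := jaeger.NewExporter(jaeger.Options{
		CollectorEndpoint: "http://localhost:14268/api/traces",
		Process:           jaeger.Process{ServiceName: "virtual-kubelet"},
	})
	if err != nil {
		return err
	}
	trace.RegisterExporter(exporter)
	return nil
}
```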
* Rename `vkubelet` package to `node`
`vkubelet` does not convey any information to the consumers of the
package.
Really it would be nice to move this package to the root of the repo,
but then you wind up with... interesting... import semantics due to
the repo name... and after thinking about it some, a subpackage is
really not so bad as long as it has a name that conveys some
information.
`node` was chosen since this package deals with all the semantics of
operating a node in Kubernetes.