173 Commits

Author SHA1 Message Date
Sargun Dhillon
ac9a1af564 Replace golang workqueue with our own
This is a fundamentally different API than that of the K8s workqueue
which is better suited for our needs. Specifically, we need a simple
queue which doesn't have complex features like delayed adds that
sit on "external" goroutines.

In addition, we need deep introspection into the operations of the
workqueue. Although you can get this on top of the K8s workqueue
by implementing a custom rate limiter, the problem is that
the underlying rate limiter's behaviour is still somewhat
opaque.

This basically has 100% code coverage.
2021-02-08 11:07:03 -08:00
Sargun Dhillon
82452a73a5 Split out rate limiter per workqueue
If you share a ratelimiter between workqueues, it breaks.

WQ1: Starts processing item (When)
WQ1: Fails to process item (When)
WQ1: Fails to process item (When)
WQ1: Fails to process item (When)
--- At this point we've backed off a bit ---
WQ2: Starts processing item (with same key, When)
WQ2: Succeeds at processing item (Forget)
WQ1: Fails to process item (When) ---> THIS RESULTS IN AN ERROR

This results in an error because it "forgot" the previous
rate limit.
2021-02-02 11:40:58 -08:00
Miek Gieben
c9969ee33d Import kubernetes/remotecommand
Copy/paste some more kubernetes code. This is to remove the dep on
kubernetes/kubernetes from within exec.go

See #940

Signed-off-by: Miek Gieben <miek@miek.nl>
2021-01-12 13:18:30 +01:00
Sargun Dhillon
1b8597647b Refactor queue code
This refactor is a preparation for another commit. I want to add instrumentation
around our queues. The code of how queues were handled was spread throughout
the code base, and that made adding such instrumentation nice and complicated.

This centralizes the queue management logic in queue.go, and only requires
the user to provide a (custom) rate limiter, if they want to, a name,
and a handler.

The lease code is moved into its own package to simplify testing, because
the goroutine leak tester was triggering incorrectly if other tests
were running, and it was measuring leaks from those tests.

This also identified buggy behaviour:

wq := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultItemBasedRateLimiter(), "test")
wq.AddRateLimited("hi")
fmt.Printf("Added hi, len: %d\n", wq.Len())

wq.Forget("hi")
fmt.Printf("Forgot hi, len: %d\n", wq.Len())

wq.Done("hi")
fmt.Printf("Done hi, len: %d\n", wq.Len())

---
Prints all 0s because event non-delayed items are delayed. If you call Add
directly, then the last line prints a len of 2.

// Workqueue docs:
// Forget indicates that an item is finished being retried.  Doesn't matter whether it's for perm failing
// or for success, we'll stop the rate limiter from tracking it.  This only clears the `rateLimiter`, you
// still have to call `Done` on the queue.

^----- Even this seems untrue
2021-01-08 00:56:05 -08:00
Sargun Dhillon
735eb34829 This adds the v1 lease controller
This refactors the v1 lease controller. It makes two functional differences
to the lease controller:
* It no longer ties lease updates to node pings or node status updates
* There is no fallback mechanism to status updates

This also moves vk_envtest, allowing for future brown-box testing of the
lease controller with envtest
2021-01-05 11:40:44 -08:00
Sargun Dhillon
de7f7dd173 Fix issue #899: Pod status out of sync after being marked as not ready by controller manager
As described in the issue, if the following sequence happens, we fail to properly
update the pod status in api server:

1. Create pod in k8s
2. Provider creates the pod and syncs its status back
3. Pod in k8s ready/running, all fine
4. Virtual kubelet fails to update node status for some time for whatever reason (e.g. network connectivity issues)
5. Virtual node marked as NotReady with message: Kubelet stopped posting node status
6. kube-controller-manager of k8s, goes and marks all pods as Ready = false:
7. Virtual kubelet never sync's status of pod in provider back to k8s
2020-12-07 16:50:00 -08:00
Sargun Dhillon
0d1f6f1625 Add Stutter linter
This also adds a bunch of nolints for the node package which
has a ton of stuttering. Perhaps something to mitigate in another
iteration.
2020-12-07 08:51:57 -08:00
Sargun Dhillon
d29adf5ce3 Add Gocritic
This also fixes the issues laid out by gocritic
2020-12-06 13:20:03 -08:00
Sargun Dhillon
c0d5809285 Add nolintlint to warn us of extraneous nolint comments 2020-12-05 10:59:10 -08:00
Sargun Dhillon
bbe4551940 Fix linter exemptions in golint
We were having issues with golint not properly reporting declaration of functions
without proper documentation (comments). This is due to a config with golangci.

See: https://github.com/golangci/golangci-lint/issues/456
2020-12-05 10:59:10 -08:00
Brian Goff
4fd2b754b5 Merge pull request #923 from sargun/fix-linter
Enable all linters by default
2020-12-04 10:50:28 -08:00
Sargun Dhillon
11c63bca6f Refactor the way that the that node_ping_controller works
This moves node ping controller to using the new internal lock
API.

The reason for this is twofold:
* The channel approach that was used to notify other
  controllers of changes could only be used once (at startup),
  and couldn't be used in the future to broadcast node
  ping status. The idea idea is here that we could move
  to a sync.Cond style API and only wakeup other controllers
  on change, as opposed to constantly polling each other
* The problem with sync.Cond is that it's not context friendly.
  If we want to do stuff like wait on a sync.cond and use a context
  or a timer or similar, it doesn't work whereas this API allows
  context cancellations on condition change.

The idea is that as we have more controllers that act as centralized
sources of authority, they can broadcast out their state.
2020-12-03 11:40:01 -08:00
Sargun Dhillon
d64d427ec8 Enable all linters by default
This removes the directive from .golangci.yml to disable all linters,
and fixes the relevant bugs / issues that are exposed.
2020-12-03 11:33:06 -08:00
wadecai
966a960eef Allow to delete pod in K8s if it is deleting in Provider
For example:
Provier is a K8s provider, pod created by deployment would be evicted when node is not ready.
If we do not delete pod in K8s, deployment would not create a new one.

Add some tests for updateStatus
2020-11-24 15:04:25 +08:00
Hasan Turken
42f7c56d32 Don't skip pods status update if podStatusReasonProviderFailed
Closes #399

Signed-off-by: Hasan Turken <turkenh@gmail.com>
2020-11-16 13:57:48 -08:00
Brian Goff
2716c38e1f Merge branch 'master' into upgrade-golint-to-v1.32.2 2020-11-06 16:08:09 -08:00
Sargun Dhillon
c437e05ad0 Move env var code into its own package
This creates a new package -- podutils. The env var related code
doesn't really have any business being part of the node package,
and to create a separation of concerns, faster tests, and just
general code isolation and cleanliness, we can move the env
var related code into this package. This change is purely hygiene,
and not logic related.

For node, the package is under internal, because the constructor
references manager, which is an internal package.
2020-11-06 14:49:53 -08:00
Sargun Dhillon
b9303714de Upgrade to golangci-lint v1.32.2 2020-11-06 14:45:19 -08:00
chao zheng
b793d89c66 when initialzing vk, use empty clientcmd.ConfigOverrides instead of nil 2020-10-28 15:34:16 -07:00
Brian Goff
590d2e7f01 Merge pull request #862 from cpuguy83/node_helpers 2020-10-26 15:00:45 -07:00
Sargun Dhillon
84a169f25d Fix golang ci warner 2020-10-04 19:52:34 -07:00
Sargun Dhillon
946c616c67 Create stronger separation between provider node and server node
There were some (additional) bugs that were easy-ish to introduce
by interleaving the provider provided node, and the server provided
updated node. This removes the chance of that confusion.
2020-10-04 19:52:34 -07:00
Sargun Dhillon
1c32b2c8ee Fix data race in test 2020-09-21 23:38:48 -07:00
Sargun Dhillon
cf2d5264a5 Fix datarace in node ping controller 2020-09-21 23:38:43 -07:00
Brian Goff
0c64171e85 Add v2 node provider for accepting status updates
This allows the use of a built-in provider to do things  like mark a node
as ready once all the controllers are spun up.

The e2e tests now use this instead of waiting on the pod that the vk
provider is deployed in to be marked ready (this was waiting on
/stats/summary to be serving, which is racey).
2020-09-17 13:52:58 -07:00
Sargun Dhillon
3d1226d45d Fix logging when leases are mis-set
This fixes a small logic bug in the leases code for checking is owner
references are not set correctly, and makes it so that we properly
log when owner references are set, but not set to the node that
is "us".
2020-09-08 12:04:16 -07:00
Sargun Dhillon
cd059d9755 Fix node ping interval code / default setting code
Change the place where we set the defaults for node ping
and node status interval. This problem manifested itself
by the node ping interval being 0 when it was set to
the default.

This makes two changes:
1. Invalid ping values, and ping timeouts will not
   allow VK to start up
2. We set the default values very early on in creation
   of the node controller -- where all the other values
   are set.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2020-08-18 00:39:14 -07:00
Sargun Dhillon
6845cf825a Delete and recreate lease on conflict
This takes a somewhat hamfisted approach at dealing with lease
conflicts. This can happen if "someone" changes the lease underneath
us. Again, this should happen rarely, but it can happen (And does
happen in production systems).

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2020-08-17 11:54:43 -07:00
Sargun Dhillon
d390dfce43 Move node pinging to its own goroutine
This moves the job of pinging the node provider into its own
goroutine. If it takes a long time, it shouldn't slow down
leases, and vice-versa.

It also adds timeouts for node pings. One of the problems
is that we don't know how long a node ping will take --
there could be a bunch of network calls underneath us.

The point of the lease is to say whether or not the
Kubelet is unreachable, not whether or not the node
pings are "passing".

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2020-08-03 10:57:37 -07:00
Sargun Dhillon
49c596c5ca Split waitableInt into its own test file
This is merely a rearranging of the deck chairs and moving waitable
int into its own file since we intend to use it across multiple
tests.
2020-08-03 10:57:37 -07:00
Brian Goff
3fc79dc677 Merge pull request #871 from virtual-kubelet/set-node-lease-owner
Set Node Leader Owner Reference
2020-07-31 11:51:14 -07:00
Sargun Dhillon
4bdcba5b85 Set Node Leader Owner Reference
This sets / updates the node lease owner reference to the current
node. Previously, we did not set this, which had the interesting
problem of leaking node leases on clusters with node churn.
2020-07-31 11:23:47 -07:00
Brian Goff
c0296b99fd Support custom filter for pod event handlers
This allows users who have a shared informer that is *not* filtering on
node name to supply a filter for event handlers to ensure events do not
fire for pods not scheduled to the node.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-07-30 17:17:42 -07:00
Brian Goff
83f8cd1a58 Add helpers for common setup code
Create a clientset, setup pod informer filters, and setup node lease client.
2020-07-27 14:51:02 -07:00
Brian Goff
af1df79088 Merge pull request #851 from virtual-kubelet/race-condition-2nd 2020-07-23 13:53:58 -07:00
Vilmos Nebehaj
56b248c854 Add GetStatsSummary to PodHandlerConfig
If both the metrics routes and the pod routes are attached to the same
mux with the pattern "/", it will panic. Instead, add the stats handler
function to PodHandlerConfig and set up the route if it is not nil.
2020-07-23 09:50:19 -07:00
Sargun Dhillon
4258c46746 Enhance / cleanup enqueuePodStatusUpdate polling in retry loop 2020-07-22 18:57:27 -07:00
Sargun Dhillon
1e9e055e89 Address concerns with PR
Also, just use Kubernetes waiter library.
2020-07-22 18:57:27 -07:00
Sargun Dhillon
12625131b5 Solve the notification on startup pod status notification race condition
This solves the race condition as described in
https://github.com/virtual-kubelet/virtual-kubelet/issues/836.

It does this by checking two conditions when the possible race condition
is detected.

If we receive a pod notification from the provider, and it is not
in our known pods list:
1. Is our cache in-sync?
2. Is it known to our pod lister?

The first case can happen because of the order we start the
provider and sync our caches. The second case can happen because
even if the cache returns synced, it does not mean all of the call
backs on the informer have quiesced.

This slightly changes the behaviour of notifyPods to that it
can block (especially at startup). We can solve this later
by using something like a fair (ticket?) lock.
2020-07-22 18:57:27 -07:00
Brian Goff
bcb5dfa11c Fix running pods handler on nil lister
This follows suit with other hanlders and returns a NotImplemented
http.HandlerFunc when the lister is nil.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2020-07-14 15:33:59 -07:00
Adrien Trouillaud
845b4cd409 upgrade k8s libs to 1.18.4 2020-07-07 21:00:56 -07:00
Sargun Dhillon
e805cb744a Introduce three-way patch for proper handling of out-of-band status updates
As described in the patch itself, there is a case that if a node is updated out of
band (e.g. node-problem-detector (https://github.com/kubernetes/node-problem-detector)),
we will overwrite the patch in our typicaly two-way strategic patch for node status
updates.

The reason why the standard kubelet can do this is because the flow goes:
apiserver->kubelet: Fetch current node
kubelet->kubelet: Update apiserver's snapshot with local state changes
kubelet->apiserver: patch

We don't have this luxury, as we rely on providers making a callback into us
in order to get the most recent pod status. They do not have a way
to do that merge operation themselves, and a two-way merge doesn't
give us enough metadata.

In order to work around this, we perform a three-way merge on behalf of
the user. We do this by stashing the contents of the last update inside
of it. We then fetch that status back, and use that for the future
update itself.

In the upgrade case, or the case where the VK has been created by
"someone else", we do not know which attributes were created by
or written by us, so we cannot generate a three way patch.

In this case, we will do our best to avoid deleting any attributes,
and only overwrite them. We will consider all current api server
values written by "someone else", and not edit them. This is done
by considering the "old node" to be empty.
2020-07-06 11:10:32 -07:00
Brian Goff
5306173408 Merge pull request #846 from sargun/add-trace-to-updateStatus
Add instrumentation to node controller (tracing)
2020-07-01 12:53:27 -07:00
Sargun Dhillon
30aabe6fcb Add instrumentation to node controller (tracing)
This adds tracing in node controller in several sections where
it was missing.
2020-07-01 12:40:09 -07:00
Sargun Dhillon
1e8c16877d Make node status updates non-blocking
There's a (somewhat) common case we can get into where the node
status update loop is busy while a provider is trying to send
a node status update. Right now, we block the provider from
creating a notification in this case.
2020-07-01 12:32:54 -07:00
wadecai
ca417d5239 Expose the queue rate limiter 2020-06-26 10:45:41 +08:00
wadecai
fedffd6f2c Add parameters to support change work queue qps 2020-06-26 10:44:09 +08:00
Weidong Cai
2398504d08 dedup in updatePodStatus (#830)
Co-authored-by: Brian Goff <cpuguy83@gmail.com>
2020-06-15 14:35:14 -07:00
wadecai
3db9ab97c6 Avoid enqueue when status of k8s pods change 2020-06-13 13:19:55 +08:00
Brian Goff
51b9a6c40d Fix stream timeout defaults
This was an unintentional breaking change in
0bdf742303

A timeout of 0 doesn't make any sense, so use the old values of 30s as a
default.
2020-06-03 10:01:34 -07:00