Commit Graph

20 Commits

Author SHA1 Message Date
Sargun Dhillon
12625131b5 Solve the notification on startup pod status notification race condition
This solves the race condition as described in
https://github.com/virtual-kubelet/virtual-kubelet/issues/836.

It does this by checking two conditions when the possible race condition
is detected.

If we receive a pod notification from the provider, and it is not
in our known pods list:
1. Is our cache in-sync?
2. Is it known to our pod lister?

The first case can happen because of the order we start the
provider and sync our caches. The second case can happen because
even if the cache returns synced, it does not mean all of the call
backs on the informer have quiesced.

This slightly changes the behaviour of notifyPods to that it
can block (especially at startup). We can solve this later
by using something like a fair (ticket?) lock.
2020-07-22 18:57:27 -07:00
wadecai
ca417d5239 Expose the queue rate limiter 2020-06-26 10:45:41 +08:00
wadecai
3db9ab97c6 Avoid enqueue when status of k8s pods change 2020-06-13 13:19:55 +08:00
Brian Goff
4ee2c4d370 Re-add support for sync providers
This brings back support for sync providers by wrapping them in a
provider that handles async notifications.
2019-10-24 09:23:28 -07:00
Sargun Dhillon
d22265e5f5 Do not delete pods in a non-graceful manner
This moves from forcefully deleting pods to deleting pods in a
graceful manner from the API Server. It waits for the pod to
get to a terminal status prior to deleting the pod from api
server.
2019-10-17 09:58:21 -07:00
Sargun Dhillon
4202b03cda Remove sync provider support
This removes the legacy sync provider interface. All new providers
are expected to implement the async NotifyPods interface.

The legacy sync provider interface creates complexities around
how the deletion flow works, and the mixed sync and async APIs
block us from evolving functionality.

This collapses in the NotifyPods interface into the PodLifecycleHandler
interface.
2019-10-02 09:28:09 -07:00
toshi0607
bcfc2accf8 misspell 2019-09-26 20:52:06 +09:00
toshi0607
b712751c6d gofmt 2019-09-26 20:50:36 +09:00
Brian Goff
334baa73cf Merge pull request #743 from chewong/pod-status-nil-pointer
Add unit tests for #584
2019-09-11 14:49:55 -07:00
Brian Goff
bb9ff1adf3 Adds Done() and Err() to pod controller (#735)
Allows callers to wait for pod controller exit in addition to readiness.
This means the caller does not have to deal handling errors from the pod
controller running in a gorutine since it can wait for exit via `Done()`
and check the error with `Err()`
2019-09-10 17:44:19 +01:00
Ernest Wong
fdb0c805f7 Add more unit test to #584 2019-09-05 10:48:35 -07:00
Ernest Wong
dc7ff44303 Add unit tests for #584 2019-09-05 09:49:41 -07:00
Sargun Dhillon
9cce8640a5 Fix linting errors in node/pod_test.go
This moves away from defining pods independently. It moves pod (spec)
generation to an independent function.
2019-09-03 11:00:33 -07:00
Sargun Dhillon
69f1186713 Do not mutate pods, nor hand off pod references to provider
This moves to a model where any time that pods are given to a
provider, it uses a DeepCopy, as opposed to a reference. If the
provider mutates the pod, it prevents it from causing issues
with the informer cache.

It has to use reflect instead of comparing the hashes because
spew prints DeepCopy'd data structures ever so slightly differently.
2019-08-15 09:59:01 -07:00
Sargun Dhillon
fbed4ca702 Remove usage of atomics
It turns out that running atomic.Read(...) in a tight loop breaks
Golang. The goroutine would never yield control over the scheduler,
so we ended up getting into a situation where the test would get
stuck forever. This moves to a different model, in which
there is a condition var, instead of atomics in loops.
2019-08-13 11:25:21 -07:00
Sargun Dhillon
50bbc3d1d4 Add tests around updates
This makes sure the update function works correctly after the pod
is running if the podspec is changed. Upon writing the test, I realized
we were accessing the variables outside of the goroutine that the
workers with tests were running in, and we had no locks. Therefore,
I converted all of those numbers to use atomics.
2019-07-30 09:13:43 -07:00
Sargun Dhillon
4a270fea08 Add a test which tests the e2e lifecycle of the pod controller
This uses the mock provider, so I moved the mock provider to a
location where the node test can use it.
2019-07-30 06:56:54 -07:00
Brian Goff
b915cde1ae Fix error handling for delete pod (#685)
* Fix error handling for delete pod

- Error handling was looking for a k8s error from the provider, but
  providers should be using errdefs.
- Error handling was returning early if pod was not found and deleting
  from k8s in all other cases.

* Don't run unit tests twice
2019-06-29 08:07:24 +01:00
Brian Goff
bd742d5d99 Add license details on file heads. (#665)
Realized as I was starting to copy some stuff to other repos that we
should go ahead and add this.
2019-06-13 10:13:14 -07:00
Brian Goff
a54753cb82 Move around some packages (#658)
* Move tracing exporter registration

This doesn't belong in the library and should be configured by the
consumer of the opencensus package.

* Rename `vkublet` package to `node`

`vkubelet` does not convey any information to the consumers of the
package.
Really it would be nice to move this package to the root of the repo,
but then you wind up with... interesting... import semantics due to the
repo name... and after thinking about it some, a subpackage is really
not so bad as long as it has a name that convey's some information.

`node` was chosen since this package deals with all the semantics of
operating a node in Kubernetes.
2019-06-12 13:11:49 +01:00