Commit Graph

881 Commits

Author SHA1 Message Date
Sargun Dhillon
616538ef01 Merge pull request #955 from sargun/fix-pod-status-update
Fix pod status update
v1.5.0
2021-02-17 12:02:38 -08:00
Sargun Dhillon
c4582ccfbc Allow providers to update pod statuses
We had added an optimization that made it so we dedupe pod status updates
from the provider. This ignored two subfields that could be updated along
with status.

Because the details of subresource updating are a bit API-server-centric,
I wrote an envtest which checks for this behaviour.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2021-02-16 12:30:53 -08:00
Sargun Dhillon
7feb175720 Split up lifecycle test wireUpSystem function
This splits up the wireUpSystem function into a chunk that makes it
"client agnostic". It also removes the requirement that the client
is faked.
2021-02-16 12:30:51 -08:00
Sargun Dhillon
0e1cc1566e Create envtest wrapper
Lift up a little bit of the common envtest code into a common wrapper function.
2021-02-16 12:30:51 -08:00
Pires
d11968a0fd Merge pull request #908 from cwdsuzhou/race_delete
Fix race between k8s and provider when deleting pod
2021-02-16 15:44:48 +00:00
wadecai
3ff1694252 Fix race between k8s and provider when deleting pod
2021-02-16 17:45:55 +08:00
Sargun Dhillon
53e96e03a9 Merge pull request #952 from sargun/add-tracking-info
Add Alternative Workqueue Implementation
2021-02-12 15:20:06 -08:00
Sargun Dhillon
3a361ebabd queue: Add tracing
This adds tracing throughout the queues, so we can determine what's going on.
2021-02-08 11:07:03 -08:00
Sargun Dhillon
ac9a1af564 Replace golang workqueue with our own
This is a fundamentally different API from that of the K8s workqueue,
and it is better suited to our needs. Specifically, we need a simple
queue which doesn't have complex features like delayed adds that
sit on "external" goroutines.

In addition, we need deep introspection into the operations of the
workqueue. Although you can get this on top of the K8s workqueue
by implementing a custom rate limiter, the problem is that
the underlying rate limiter's behaviour is still somewhat
opaque.

This basically has 100% code coverage.
2021-02-08 11:07:03 -08:00
Sargun Dhillon
fd3da8dcad Merge pull request #954 from feiskyer/clean-charts
clean-up charts
2021-02-07 23:49:54 -08:00
Pengfei Ni
731d0d6f5c clean-up charts
2021-02-08 13:18:00 +08:00
Sargun Dhillon
2ac4ff9b35 Merge pull request #953 from sargun/break-up-ratelimiters
2021-02-03 04:18:07 -08:00
Sargun Dhillon
82452a73a5 Split out rate limiter per workqueue
If you share a ratelimiter between workqueues, it breaks.

WQ1: Starts processing item (When)
WQ1: Fails to process item (When)
WQ1: Fails to process item (When)
WQ1: Fails to process item (When)
--- At this point we've backed off a bit ---
WQ2: Starts processing item (with same key, When)
WQ2: Succeeds at processing item (Forget)
WQ1: Fails to process item (When) ---> THIS RESULTS IN AN ERROR

This results in an error because it "forgot" the previous
rate limit.
2021-02-02 11:40:58 -08:00
Brian Goff
2fa03a15a2 Merge pull request #951 from Jeffwan/update_email_list
Update mailing list link
2021-01-29 09:25:39 -08:00
Jiaxin Shan
eb7553e6c4 Update mailing list link
2021-01-26 16:21:41 -08:00
Brian Goff
3cfd4737dc Merge pull request #949 from pires/bugfix/klogv2_withfields
log: fix klogv2.WithField(s)
v1.4.0
2021-01-20 09:44:10 -08:00
Pires
346c20c005 log: fix klogv2.WithField(s)
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-20 13:53:23 +00:00
Pires
fa139bfe27 Merge pull request #947 from pires/bugfix/klogv2
log: fix klog depth and output format
2021-01-15 19:00:34 +00:00
Pires
cb0e18e6a1 log: refactor klogv2 tests
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-15 18:47:50 +00:00
Brian Goff
379031eb61 Merge pull request #945 from miekg/no-kube2
2021-01-15 10:23:35 -08:00
Pires
25b8c546a0 log: process fields only on first klog call
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-15 18:23:32 +00:00
Miek Gieben
53fcbe7abe Merge branch 'master' into no-kube2
2021-01-15 09:42:44 +01:00
Miek Gieben
7662a48922 Skip internal/kubernetes from linting
Signed-off-by: Miek Gieben <miek@miek.nl>
2021-01-15 09:25:17 +01:00
Pires
46bfb01cd4 log: fix klogv2 With* funcs
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-14 21:07:04 +00:00
Pires
c2b4863f40 log: fix klog depth
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-14 20:23:46 +00:00
Brian Goff
5edfe23bd5 Merge pull request #944 from miekg/no-kubernetes
2021-01-13 15:48:33 -08:00
Miek Gieben
c9969ee33d Import kubernetes/remotecommand
Copy/paste some more kubernetes code. This is to remove the dep on
kubernetes/kubernetes from within exec.go

See #940

Signed-off-by: Miek Gieben <miek@miek.nl>
2021-01-12 13:18:30 +01:00
Miek Gieben
ff61469113 Merge branch 'master' into no-kubernetes
2021-01-12 07:44:42 +01:00
Pires
20c064848a Merge pull request #942 from pires/chore/golang_1.15
build: use Go 1.15
2021-01-11 21:29:29 +00:00
Pires
9e711f3276 Merge pull request #941 from pires/feature/klogv2
log: add klog/v2
2021-01-11 18:24:31 +00:00
Pires
be76c022ae log: validate log.Logger impl at compile time
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-11 18:17:01 +00:00
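The title of commit be76c022ae refers to validating an interface implementation at compile time. A common Go idiom for this is a blank-identifier assignment. The sketch below assumes a hypothetical, trimmed-down `Logger` interface and a hypothetical `fieldLogger` type; neither is taken from the virtual-kubelet log package.

```go
package main

import "fmt"

// Logger is a hypothetical, trimmed-down logging interface.
type Logger interface {
	Info(args ...interface{})
	WithField(key string, value interface{}) Logger
}

// fieldLogger is a hypothetical implementation that carries structured fields.
type fieldLogger struct {
	fields map[string]interface{}
}

// Compile-time check: the build fails on this line if *fieldLogger
// ever stops satisfying Logger. Nothing is allocated at runtime.
var _ Logger = (*fieldLogger)(nil)

// Info prints the accumulated fields followed by the message arguments.
func (l *fieldLogger) Info(args ...interface{}) {
	fmt.Println(append([]interface{}{l.fields}, args...)...)
}

// WithField returns a new logger with one extra field; the receiver is unchanged.
func (l *fieldLogger) WithField(key string, value interface{}) Logger {
	next := make(map[string]interface{}, len(l.fields)+1)
	for k, v := range l.fields {
		next[k] = v
	}
	next[key] = value
	return &fieldLogger{fields: next}
}

func main() {
	var logger Logger = &fieldLogger{}
	logger.WithField("pod", "example").Info("synced")
}
```

Removing or renaming a method on `fieldLogger` would turn the `var _ Logger` line into a build error, so the mismatch is caught before any test runs.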
Miek Gieben
e82e46e5de Copy paste golang/expansion from kubernetes/kubernetes
Try to stop depending on kubernetes/kubernetes. Copy golang/expansion
into the virtual-kubelet repo. The upstream code looks super stable, so
there is little harm in copying it here.

Signed-off-by: Miek Gieben <miek@miek.nl>
2021-01-11 11:56:24 +01:00
Pires
eda7adbdb4 log: add klog/v2
Fixes #924

Signed-off-by: Pires <pjpires@gmail.com>
2021-01-10 23:42:06 +00:00
Pires
9affe97f88 Merge pull request #943 from pires/chore/bump_k8s
e2e: test with Kubernetes 1.20.1
2021-01-10 23:40:10 +00:00
Pires
9e522952c3 e2e: test with Kubernetes 1.20.1
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-10 17:11:53 +00:00
Pires
99ad66814b build: use Go 1.15
Signed-off-by: Pires <pjpires@gmail.com>
2021-01-10 16:41:31 +00:00
Brian Goff
9745a6a9bc Merge pull request #939 from sargun/fix-name
Fix key name in log entry
2021-01-08 10:43:50 -08:00
Sargun Dhillon
3830b0ed79 Fix key name in log entry
2021-01-08 01:00:22 -08:00
Sargun Dhillon
1b8597647b Refactor queue code
This refactor is preparation for another commit. I want to add instrumentation
around our queues. The queue-handling code was spread throughout
the code base, which made adding such instrumentation complicated.

This centralizes the queue management logic in queue.go, and only requires
the user to provide a (custom) rate limiter, if they want to, a name,
and a handler.

The lease code is moved into its own package to simplify testing, because
the goroutine leak tester was triggering incorrectly if other tests
were running, and it was measuring leaks from those tests.

This also identified buggy behaviour:

wq := workqueue.NewNamedRateLimitingQueue(workqueue.DefaultItemBasedRateLimiter(), "test")
wq.AddRateLimited("hi")
fmt.Printf("Added hi, len: %d\n", wq.Len())

wq.Forget("hi")
fmt.Printf("Forgot hi, len: %d\n", wq.Len())

wq.Done("hi")
fmt.Printf("Done hi, len: %d\n", wq.Len())

---
Prints all 0s because even non-delayed items are delayed. If you call Add
directly, then the last line prints a len of 2.

// Workqueue docs:
// Forget indicates that an item is finished being retried.  Doesn't matter whether it's for perm failing
// or for success, we'll stop the rate limiter from tracking it.  This only clears the `rateLimiter`, you
// still have to call `Done` on the queue.

^----- Even this seems untrue
2021-01-08 00:56:05 -08:00
Sargun Dhillon
735eb34829 This adds the v1 lease controller
This refactors the v1 lease controller. It makes two functional changes
to the lease controller:
* It no longer ties lease updates to node pings or node status updates
* There is no fallback mechanism to status updates

This also moves vk_envtest, allowing for future brown-box testing of the
lease controller with envtest
2021-01-05 11:40:44 -08:00
Chris Aniszczyk
8affa1c42a Add CodeQL Security Scanning
Signed-off-by: Chris Aniszczyk <caniszczyk@gmail.com>
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2020-12-14 20:13:40 -08:00
Brian Goff
076b28d2b5 Merge pull request #902 from sargun/fix-899
Fix issue #899: Pod status out of sync
2020-12-07 17:14:52 -08:00
Sargun Dhillon
de7f7dd173 Fix issue #899: Pod status out of sync after being marked as not ready by controller manager
As described in the issue, if the following sequence happens, we fail to properly
update the pod status in the API server:

1. Create pod in k8s
2. Provider creates the pod and syncs its status back
3. Pod in k8s ready/running, all fine
4. Virtual kubelet fails to update node status for some time for whatever reason (e.g. network connectivity issues)
5. Virtual node marked as NotReady with message: Kubelet stopped posting node status
6. kube-controller-manager marks all pods as Ready = false
7. Virtual kubelet never syncs the status of the pod in the provider back to k8s
2020-12-07 16:50:00 -08:00
Sargun Dhillon
0d1f6f1625 Add Stutter linter
This also adds a bunch of nolints for the node package which
has a ton of stuttering. Perhaps something to mitigate in another
iteration.
2020-12-07 08:51:57 -08:00
Sargun Dhillon
d29adf5ce3 Add Gocritic
This also fixes the issues reported by gocritic
2020-12-06 13:20:03 -08:00
Sargun Dhillon
ffbfe19e78 Add tests for opencensus (logger) fields
2020-12-06 13:20:03 -08:00
Sargun Dhillon
9a60ea2494 Switch to using Makefile for lint
This uses gobin in the Makefile for golint. Gobin allows for easier
pinning of project-specific dependencies, because the version can be
specified in the gobin command invocation itself.
2020-12-06 13:20:03 -08:00
Sargun Dhillon
c0d5809285 Add nolintlint to warn us of extraneous nolint comments
2020-12-05 10:59:10 -08:00
Sargun Dhillon
bbe4551940 Fix linter exemptions in golint
We were having issues with golint not reporting functions declared
without documentation comments. This is due to the golangci-lint configuration.

See: https://github.com/golangci/golangci-lint/issues/456
2020-12-05 10:59:10 -08:00
Sargun Dhillon
ca84620958 Fix gosimple check
We were doing a select without needing to.
2020-12-04 13:21:37 -08:00
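Commit ca84620958 does not show the offending code. One pattern that gosimple flags is a `select` with a single case, which can be replaced by a plain channel operation. The sketch below assumes that pattern; the function names are hypothetical and are not taken from the repository.

```go
package main

import "fmt"

// waitVerbose blocks using a single-case select.
// This is the form a linter would typically ask to simplify.
func waitVerbose(done <-chan struct{}) {
	select {
	case <-done:
	}
}

// waitSimple is the equivalent plain receive with identical blocking behaviour.
func waitSimple(done <-chan struct{}) {
	<-done
}

func main() {
	done := make(chan struct{})
	close(done) // a closed channel makes every receive return immediately

	waitVerbose(done)
	waitSimple(done)
	fmt.Println("both returned")
}
```

A `select` only earns its place when there are at least two cases, or one case plus a `default` branch.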