In error cases these goroutines never exit.
When trying to debug such cases we end up with a bunch of these
goroutines stuck, which makes it difficult to troubleshoot.
We could just make the channel buffered, however that would make it
less clear, in the case of an error, what is actually happening.
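A minimal sketch of the shape such a fix can take (all names here are
illustrative, not the actual virtual-kubelet code): the worker goroutine
selects on ctx.Done() so that when the caller hits an error and stops
reading results, the goroutine exits instead of blocking forever on the
send.

```go
package worker

import "context"

type result struct{ err error }

// doWork stands in for the real unit of work.
func doWork(ctx context.Context) result { return result{} }

func run(ctx context.Context, results chan<- result) {
	go func() {
		r := doWork(ctx)
		select {
		case results <- r:
		case <-ctx.Done():
			// Caller gave up (e.g. returned on an error); exit instead
			// of leaking a goroutine stuck on an unbuffered send.
		}
	}()
}
```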
In 405d5d63b1 we changed from an ordered list to a map, however the
test is order dependent and Go map iteration order is randomized.
Change the test to use a slice with an internal type (instead of pulling
back in k8s.io/kubernetes).
Without this change the test will fail occasionally, with no guarantee
of success, because of the random iteration order of maps.
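A hedged sketch of what the slice-based test looks like, using
hypothetical names (envVar, buildEnv) rather than the real test code:
the expectations live in an ordered slice of a small internal type, so
the comparison no longer depends on Go's randomized map iteration order.

```go
package provider_test

import "testing"

// envVar is a small internal type standing in for the env var pairs,
// so the test doesn't need to pull k8s.io/kubernetes back in.
type envVar struct {
	Name, Value string
}

// buildEnv stands in for the code under test.
func buildEnv() []envVar {
	return []envVar{{"POD_NAME", "nginx"}, {"POD_NAMESPACE", "default"}}
}

func TestEnvVarOrder(t *testing.T) {
	// An ordered slice makes the expected output deterministic.
	want := []envVar{{"POD_NAME", "nginx"}, {"POD_NAMESPACE", "default"}}
	got := buildEnv()
	if len(got) != len(want) {
		t.Fatalf("got %d env vars, want %d", len(got), len(want))
	}
	for i := range want {
		if got[i] != want[i] {
			t.Errorf("env[%d] = %+v, want %+v", i, got[i], want[i])
		}
	}
}
```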
This is mostly helper code for setting up env vars. There is a single
file copied verbatim, although much of this downward API code is a
copy. We may want to pull this out and do a direct copy of the code so
it is easier to update, and work with upstream to have a shared
downward API package that lives outside of k8s.io/kubernetes.
This drops another dependency on k8s.io/kubernetes.
This does have the unfortunate side effect that implementers will now
get a compile error until they update their code to use the new type.
Just as a note:
The stats types have moved to k8s.io/kubelet, however they are only
available there as of v1.20.
Currently we support versions older than v1.20, and even our go.mod
imports from v1.19.
For now we copy the types in. Later we can remove the type definitions
and turn them into type aliases for the k8s.io/kubelet types (which
avoids another compile-time break).
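A short sketch of what that later cleanup could look like (the package
name here is illustrative; the import path is the one k8s.io/kubelet
uses for these types from v1.20 on): the copied definitions become
aliases, so code referring to the existing names keeps compiling.

```go
// Package stats is illustrative; today it holds copied struct definitions.
package stats

import statsv1alpha1 "k8s.io/kubelet/pkg/apis/stats/v1alpha1"

// Once go.mod can require k8s.io/kubelet v1.20+, the copied definitions
// can be replaced with aliases like these, so callers keep compiling
// against the same names.
type (
	Summary  = statsv1alpha1.Summary
	PodStats = statsv1alpha1.PodStats
)
```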
Anything relying on type assertions to determine if something implements
this method will, unfortunately, be broken and it will be hard to notice
until runtime. We need to make sure to call this out in the release
notes.
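For illustration only (none of these names are the real interface), this
is the kind of check that breaks silently: the assertion still compiles
after the signature changes to the new type, it just starts returning
false at runtime.

```go
package example

import "context"

// Summary stands in for the stats summary type.
type Summary struct{}

// maybeGetStats probes for an optional stats method via a type assertion.
// After the method's signature changes to the new type, callers like this
// still compile, but ok quietly becomes false at runtime.
func maybeGetStats(ctx context.Context, p interface{}) (*Summary, bool) {
	sp, ok := p.(interface {
		GetStatsSummary(context.Context) (*Summary, error)
	})
	if !ok {
		return nil, false
	}
	s, err := sp.GetStatsSummary(ctx)
	if err != nil {
		return nil, false
	}
	return s, true
}
```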
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Our current retry policy is naive and only does 20 retries. It is also
based on the rate limiter. If the user has configured somewhat
aggressive rate limiting, but the API server has a temporary outage,
they may want to continue delaying.
In fact, K8s has a built-in function to suggest delays:
https://pkg.go.dev/k8s.io/apimachinery/pkg/api/errors#SuggestsClientDelay
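A hedged sketch of how that helper might feed into the retry loop
(retryDelay and the fallback parameter are assumptions, not the actual
change): when the API server suggests a delay, for example via
Retry-After on a 429 or 5xx, prefer it over our own backoff.

```go
package retry

import (
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
)

// retryDelay picks the delay before the next attempt. If the API server
// suggested one (SuggestsClientDelay reports the seconds it asked for),
// use that; otherwise fall back to whatever our rate limiter computed.
func retryDelay(err error, fallback time.Duration) time.Duration {
	if seconds, ok := apierrors.SuggestsClientDelay(err); ok {
		return time.Duration(seconds) * time.Second
	}
	return fallback
}
```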
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This starts watching for events prior to the start of the controller.
This smells like a bug in the fakeclient bits, but it seems to fix
the problem.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
We had added an optimization to dedupe pod status updates from the
provider. It ignored two subfields that could be updated along with the
status.
Because the details of subresource updates are rather API server
centric, I wrote an envtest which checks for this behaviour.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
This is a fundamentally different API from that of the K8s workqueue,
one better suited to our needs. Specifically, we need a simple
queue which doesn't have complex features like delayed adds that
sit on "external" goroutines.
In addition, we need deep introspection into the operations of the
workqueue. Although you can get some of this on top of the K8s
workqueue by implementing a custom rate limiter, the underlying rate
limiter's behaviour is still somewhat opaque.
This basically has 100% code coverage.
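Roughly the shape of the interface described above; these names are a
sketch only, not the package's actual exported API.

```go
package queue

// Queue sketches the simpler API described above: retries and backoff
// are handled inside the queue's own processing loop rather than via
// delayed adds parked on external goroutines, and the queue's state is
// directly inspectable.
type Queue interface {
	// Enqueue adds a key to be processed.
	Enqueue(key string)
	// EnqueueWithBackoff re-adds a key that failed, delayed according
	// to how many times it has already been retried.
	EnqueueWithBackoff(key string)
	// Forget clears the retry state for a key after it succeeds.
	Forget(key string)

	// Introspection hooks that are hard to get out of the K8s workqueue.
	Len() int
	Retries(key string) int
}
```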
If you share a rate limiter between workqueues, it breaks. For example:
WQ1: Starts processing item (When)
WQ1: Fails to process item (When)
WQ1: Fails to process item (When)
WQ1: Fails to process item (When)
--- At this point we've backed off a bit ---
WQ2: Starts processing item (with same key, When)
WQ2: Succeeds at processing item (Forget)
WQ1: Fails to process item (When) ---> THIS RESULTS IN AN ERROR
This results in an error because WQ2's Forget "forgot" the previous
rate limit that WQ1 had built up.
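The same failure mode can be reproduced against the client-go rate
limiter directly (this is a standalone illustration, not the project's
code): WQ2's Forget wipes the failure count, so WQ1's next When starts
the backoff over from the base delay.

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	// One rate limiter shared by two (conceptual) workqueues.
	rl := workqueue.NewItemExponentialFailureRateLimiter(time.Millisecond, time.Second)

	// WQ1 fails the same key three times; the backoff grows: 1ms, 2ms, 4ms.
	for i := 0; i < 3; i++ {
		fmt.Println("wq1 delay:", rl.When("pod-a"))
	}

	// WQ2 processes the same key successfully and calls Forget, which
	// discards the failure count WQ1 built up.
	rl.Forget("pod-a")

	// WQ1 fails again, but the delay is back to the 1ms base instead of
	// continuing the backoff: the previous rate limit was "forgotten".
	fmt.Println("wq1 delay after wq2 Forget:", rl.When("pod-a"))
}
```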
Copy/paste some more Kubernetes code. This is to remove the dependency
on kubernetes/kubernetes from within exec.go.
See #940
Signed-off-by: Miek Gieben <miek@miek.nl>