Commit Graph

17 Commits

Author SHA1 Message Date
Brian Goff
98ca5c8398 Handle not found case for status update
If we don't handle the "pod not found" case then we end up with the pod
getting re-queued over and over until the max retries are hit. It also
blocks the queue for other pod status updates for that pod
namespace/name.
2019-06-04 14:39:58 -07:00
Brian Goff
d6b5ae3710 Remove usage of ocstatus package
This changes the tracing package to accept an error on SetStatus, which
is really what we always want anyway.
This also decouples the trace package from opencensus.
2019-06-04 14:29:25 -07:00
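As a rough sketch of what "accept an error on SetStatus" could look like, the following defines a backend-neutral span interface. The names `Span`, `recordingSpan`, and `doWork` are hypothetical and chosen for this example; the real trace package may be shaped differently.

```go
package main

import "errors"

// Span is a hypothetical tracing interface that takes a plain error,
// so callers do not need to build a backend-specific status value.
type Span interface {
	SetStatus(err error)
	End()
}

// recordingSpan is a toy implementation that remembers what it was told.
type recordingSpan struct {
	status string
	ended  bool
}

func (s *recordingSpan) SetStatus(err error) {
	if err == nil {
		s.status = "ok"
		return
	}
	s.status = "error: " + err.Error()
}

func (s *recordingSpan) End() { s.ended = true }

// doWork shows the call pattern: the named return value is handed to
// SetStatus just before the span ends (deferred calls run last-in, first-out).
func doWork(span Span, fail bool) (err error) {
	defer span.End()
	defer func() { span.SetStatus(err) }()
	if fail {
		return errors.New("work failed")
	}
	return nil
}
```

Because the interface only mentions `error`, an implementation backed by any tracing library could satisfy it.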
Brian Goff
71546a908f Remove Server object (#629)
This had some weird shared responsibility with the PodController.
Instead just move the functionality to the PodController.
2019-06-01 09:36:38 -07:00
Jeremy Rickard
87e72bf4df Light up UpdatePod (#613)
* Light up UpdatePod

This PR updates the vkubelet/pod.go createOrUpdate(..) method to actually handle
updates. It gets the pod from the provider as before, but now if it exists the method
checks the hash of the spec against the spec of the new pod. If they've changed, it
calls UpdatePod(..).

Also makes a small change to the Server struct to swap from kubernetes.Clientset to kubernetes.Interface
to better facilitate testing with fake ClientSet.

Co-Authored-By: Brian Goff <cpuguy83@gmail.com>
2019-05-17 11:14:29 -07:00
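The create-or-update decision described in this commit can be sketched as follows. This is an illustration under stated assumptions: `podSpec` is a simplified stand-in for the real pod spec, and `hashSpec`, `decide`, and the `action` constants are hypothetical names. The hashing scheme (SHA-256 over the JSON encoding) is also an assumption, not necessarily what the repository uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// podSpec is a simplified, hypothetical stand-in for a pod spec.
type podSpec struct {
	Image string            `json:"image"`
	Env   map[string]string `json:"env,omitempty"`
}

// hashSpec hashes the JSON encoding of the spec. encoding/json writes map
// keys in sorted order, so equal specs produce equal hashes.
func hashSpec(spec podSpec) (string, error) {
	b, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

type action int

const (
	actionNone action = iota
	actionCreate
	actionUpdate
)

// decide returns actionCreate when the provider has no pod yet, actionUpdate
// when the spec hash differs, and actionNone when nothing changed.
func decide(existing *podSpec, desired podSpec) (action, error) {
	if existing == nil {
		return actionCreate, nil
	}
	oldHash, err := hashSpec(*existing)
	if err != nil {
		return actionNone, err
	}
	newHash, err := hashSpec(desired)
	if err != nil {
		return actionNone, err
	}
	if oldHash != newHash {
		return actionUpdate, nil
	}
	return actionNone, nil
}
```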
Brian Goff
1942522cf6 Add async provider pod status updates (#493)
This adds a new interface that a provider can implement which enables
async notifications of pod status changes rather than the existing loop
which goes through every pod in k8s and checks the status in the
provider.
In practice this should be significantly more efficient since we are not
constantly listing all pods and then looking up the status in the
provider.

For providers that do not support this interface, the old method is
still used to sync state from the provider.

This commit does not update any of the providers to support this
interface.
2019-04-01 09:07:26 -07:00
Brian Goff
1bfffa975e Make tracing interface to coalesce logging/tracing (#519)
* Define and use an interface for logging.

This allows alternative implementations to use whatever logging package
they want.

Currently the interface just mimics what logrus already implements,
with minor modifications to not rely on logrus itself. I think the
interface is pretty solid in terms of logging implementations being able
to do what they need to.

* Make tracing interface to coalesce logging/tracing

Allows us to share data between the tracer and the logger so we can
simplify log/trace handling where we generally want data to go both
places.
2019-02-22 11:36:03 -08:00
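A logging interface in the spirit of the first bullet could be sketched like this. The method set is a small, hypothetical subset modelled on logrus-style calls, and `memLogger` is a toy in-memory implementation used only to show that nothing depends on a particular logging package.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Logger is a hypothetical, logrus-shaped interface with no logrus dependency.
type Logger interface {
	Infof(format string, args ...interface{})
	WithField(key string, value interface{}) Logger
	WithError(err error) Logger
}

// memLogger appends formatted lines to a slice owned by the caller.
type memLogger struct {
	fields map[string]interface{}
	out    *[]string
}

func newMemLogger(out *[]string) Logger {
	return &memLogger{fields: map[string]interface{}{}, out: out}
}

// WithField returns a new logger so the receiver's fields stay untouched.
func (l *memLogger) WithField(key string, value interface{}) Logger {
	next := make(map[string]interface{}, len(l.fields)+1)
	for k, v := range l.fields {
		next[k] = v
	}
	next[key] = value
	return &memLogger{fields: next, out: l.out}
}

func (l *memLogger) WithError(err error) Logger { return l.WithField("error", err) }

// Infof writes the message followed by the fields in sorted key order.
func (l *memLogger) Infof(format string, args ...interface{}) {
	keys := make([]string, 0, len(l.fields))
	for k := range l.fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	fmt.Fprintf(&b, format, args...)
	for _, k := range keys {
		fmt.Fprintf(&b, " %s=%v", k, l.fields[k])
	}
	*l.out = append(*l.out, b.String())
}
```

A tracing wrapper could attach the same key/value pairs to a span, which is one way to read the "data goes both places" goal of the second bullet.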
Paulo Pires
103a19fe9d env: observe envFrom
Also observe initContainers env and envFrom.

Fixes #460
Fixes #461

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-15 11:01:40 +00:00
Paulo Pires
62b46d971c env: emit events for missing envvars
Fixes #465

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-15 11:01:36 +00:00
Brian Goff
ab7c55cb5f Make pod status updates concurrent. (#433)
This uses the same number of workers as the pod sync workers.

We may want to start a worker queue here instead, but I think for now
this is ok, particularly because we are limiting the number of
goroutines being spun up at once.
2018-12-04 14:03:45 -08:00
Paulo Pires
28a757f4da use shared informers and workqueue (#425)
* vendor: add vendored code

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* controller: use shared informers and a work queue

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* errors: use cpuguy83/strongerrors

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* aci: fix test that uses resource manager

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* readme: clarify skaffold run before e2e

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* cmd: use root context everywhere

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: refactor pod lifecycle management

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* e2e: fix race in test when observing deletions

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* e2e: test pod forced deletion

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* cmd: fix root context potential leak

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: rename metaKey

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: remove calls to HandleError

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* Revert "errors: use cpuguy83/strongerrors"

This reverts commit f031fc6d.

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* manager: remove redundant lister constraint

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: rename the pod event recorder

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: amend misleading comment

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* mock: add tracing

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: add tracing

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* test: observe timeouts

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* trace: remove unnecessary comments

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: limit concurrency in deleteDanglingPods

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: never store context, always pass in calls

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: remove HandleCrash and just panic

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: don't sync succeeded pods

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: ensure pod deletion from kubernetes

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-11-30 15:53:58 -08:00
Paulo Pires
0f8ef994a3 sync: don't swallow delete errors
Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-11-28 20:31:55 +00:00
Brian Goff
aee1fde504 Fix a case where provider pod status is not found
Updates the pod status in Kubernetes to "Failed" when the pod status is
not found from the provider.

Note that currently most providers return `nil, nil` when a pod is
not found. This works but should possibly return a typed error so we can
determine if the error means not found or something else... but this
works as is so I haven't changed it.
2018-11-06 16:11:42 -08:00
Brian Goff
bec818bf3c Do not close pod sync, use context cancel instead. (#402)
Closing the channel is racy and can lead to a panic on exit.
Instead rely on context cancellation to know if workers should exit.
2018-11-05 11:37:00 -08:00
robbiezhang
966c76368f use %T instead of reflect.TypeOf
2018-10-18 20:06:03 +00:00
robbiezhang
a6bab6e3bb Fix the potential runtime type casting error
2018-10-18 19:15:05 +00:00
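The two robbiezhang commits above touch on a common Go idiom, sketched here under assumptions: the `pod` type and `podFromEvent` function are hypothetical. A single-value type assertion panics at run time when the type does not match, whereas the two-value form returns a boolean, and the `%T` verb prints the dynamic type without a call to `reflect.TypeOf`.

```go
package main

import "fmt"

// pod is a hypothetical stand-in for the object a cache handler receives.
type pod struct{ Name string }

// podFromEvent converts an untyped event object without risking a panic.
func podFromEvent(obj interface{}) (*pod, error) {
	p, ok := obj.(*pod) // two-value form: ok is false instead of panicking
	if !ok {
		// %T formats the dynamic type of obj directly.
		return nil, fmt.Errorf("unexpected object type %T, want *pod", obj)
	}
	return p, nil
}
```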
Robbie Zhang
4a7b74ed42 [VK] Use Cache controller and Make create/delete pod Concurrently (#373)
* Add k8s.io/client-go/tools/cache package

* Add cache controller

* Add pod creator and terminator

* Pod Synchronizer

* Clean up

* Add back reconcile

* Remove unnecessary space in log

* Incorporate feedback

* dep ensure

* Fix the syntax error

* Fix the merge errors

* Minor Refactor

* Set status

* Pass context together with the pod to the pod channel

* Change to use flag to specify the number of pod sync workers

* Remove the unused const

* Use Stable PROD Region WestUS in Test

EastUS2EUAP is not reliable
2018-10-16 17:20:02 -07:00
Brian Goff
c1fe923131 Minor refactorings (#368)
* Split vkubelet functions into separate files.

* Minor re-org for cmd/census*

* refactor run loop
2018-10-12 17:36:37 -07:00