33 Commits

Author SHA1 Message Date
Yash Desai
750de3195d Resource manager: add service lister and remove unused lock. (#559)
* Remove unused lock from the resource manager.

* Add service lister to the resource manager.

This change adds a service lister in the
resource manager.
This will be used to set the service env vars.
Also added a List method to the resource manager
and a simple test to confirm it's a pass through.
2019-04-03 11:19:30 -07:00
Brian Goff
10430f0b7f Add node provider interface (#526)
This starts the work of having a `NodeProvider` which is responsible for
providing node details.
It splits the responsibilities of node management off to a new
controller.

The primary change here is to add the framework pieces for node
management and move the VK CLI to use this new controller.

It also adds support for node leases where available. This can be
enabled via the command line (disabled by default), but may fall back if
we find that leases aren't supported on the cluster.
2019-03-25 15:02:40 -07:00
Brian Goff
f8c51004d4 Support building an allow-list of providers (#527)
* Add providers subcommand to verify providers

Allows users to check what providers are available

* Fix version output to add new line

This command was totally broken until we moved around the call to
`initConfig()`; this just fixes the output now that it works.

* Flip boolean of provider include tags

All providers are still included by default. Also fix tags using the old
format.
2019-03-02 11:25:47 -08:00
Brian Goff
1bfffa975e Make tracing interface to coalesce logging/tracing (#519)
* Define and use an interface for logging.

This allows alternative implementations to use whatever logging package
they want.

Currently the interface just mimics what logrus already implements,
with minor modifications to not rely on logrus itself. I think the
interface is pretty solid in terms of logging implementations being able
to do what they need to.

* Make tracing interface to coalesce logging/tracing

Allows us to share data between the tracer and the logger so we can
simplify log/trace handling where we generally want data to go both
places.
2019-02-22 11:36:03 -08:00
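The logging interface described in this commit could look roughly like the sketch below. The method set (`Infof`, `Errorf`, `WithField`) is an assumption modelled on a small subset of what logrus offers, and `memLogger` and `newMemLogger` are hypothetical names used only for illustration; the project's real interface may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Logger is a hypothetical, trimmed-down logging interface. Callers depend
// on this interface only, so any logging package can sit behind it.
type Logger interface {
	Infof(format string, args ...interface{})
	Errorf(format string, args ...interface{})
	WithField(key string, value interface{}) Logger
}

// memLogger is an illustrative implementation that records lines in memory.
type memLogger struct {
	fields map[string]interface{}
	lines  *[]string
}

// newMemLogger returns a Logger plus a pointer to the slice it writes to.
func newMemLogger() (Logger, *[]string) {
	lines := &[]string{}
	return &memLogger{fields: map[string]interface{}{}, lines: lines}, lines
}

// write formats the message and appends the fields in sorted key order.
func (l *memLogger) write(level, format string, args ...interface{}) {
	keys := make([]string, 0, len(l.fields))
	for k := range l.fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(level + " " + fmt.Sprintf(format, args...))
	for _, k := range keys {
		fmt.Fprintf(&b, " %s=%v", k, l.fields[k])
	}
	*l.lines = append(*l.lines, b.String())
}

func (l *memLogger) Infof(format string, args ...interface{})  { l.write("INFO", format, args...) }
func (l *memLogger) Errorf(format string, args ...interface{}) { l.write("ERROR", format, args...) }

// WithField returns a child logger; the parent's fields are copied, not shared.
func (l *memLogger) WithField(key string, value interface{}) Logger {
	fields := make(map[string]interface{}, len(l.fields)+1)
	for k, v := range l.fields {
		fields[k] = v
	}
	fields[key] = value
	return &memLogger{fields: fields, lines: l.lines}
}

func main() {
	log, lines := newMemLogger()
	log.WithField("pod", "nginx").Infof("created in %dms", 12)
	fmt.Println((*lines)[0]) // prints: INFO created in 12ms pod=nginx
}
```

Because `WithField` returns the interface type rather than a concrete logger, an adapter around another logging package can satisfy the same contract without the caller importing that package.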
Brian Goff
20911aa3b5 fix potential panic on http server close (#496) 2019-01-15 10:37:06 -08:00
Brian Goff
0d14914e85 Refactor http server stuff (#466)
* Don't start things in New

* Move http server handling up to daemon.

This removes the burden of dealing with listeners, http servers, etc in
the core framework.

Instead provide helpers to attach the appropriate routes to the
caller's serve mux.

With this change, the vkubelet package only helps callers set up HTTP
rather than forcing a specific HTTP config on them.
2018-12-21 11:45:07 -08:00
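A minimal sketch of the "attach routes to the caller's serve mux" idea from this commit, using only the Go standard library. The `ServeMux` interface, `attachRoutes`, the `/pods` route, and the helper names are all hypothetical; they illustrate the shape of the pattern, not the package's actual API.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ServeMux is the small part of *http.ServeMux that the helper needs.
// Accepting an interface lets callers pass their own router.
type ServeMux interface {
	Handle(pattern string, handler http.Handler)
}

// attachRoutes registers handlers on a mux owned by the caller. It does not
// create listeners or servers; that stays with the caller (the daemon).
// The "/pods" route is a hypothetical example.
func attachRoutes(mux ServeMux, podNames func() []string) {
	mux.Handle("/pods", http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, n := range podNames() {
			fmt.Fprintln(w, n)
		}
	}))
}

// newDemoMux shows the caller's side: build a mux, then attach routes to it.
func newDemoMux() *http.ServeMux {
	mux := http.NewServeMux()
	attachRoutes(mux, func() []string { return []string{"nginx", "redis"} })
	return mux
}

// probe sends an in-memory GET request and returns the status code and body.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := probe(newDemoMux(), "/pods")
	fmt.Printf("%d %q\n", code, body) // prints: 200 "nginx\nredis\n"
}
```

In a real daemon the caller would pass the same mux to its own `http.Server`, choosing the address, TLS settings, and timeouts itself.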
Brian Goff
82ba002a9f Revert "Use 1 worker by default" (#432)
This reverts commit f10596562d.

Sets our default worker count back to 10 now that concurrency is in
good shape.
2018-12-03 12:49:03 -08:00
Paulo Pires
28a757f4da use shared informers and workqueue (#425)
* vendor: add vendored code

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* controller: use shared informers and a work queue

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* errors: use cpuguy83/strongerrors

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* aci: fix test that uses resource manager

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* readme: clarify skaffold run before e2e

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* cmd: use root context everywhere

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: refactor pod lifecycle management

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* e2e: fix race in test when observing deletions

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* e2e: test pod forced deletion

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* cmd: fix root context potential leak

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: rename metaKey

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: remove calls to HandleError

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* Revert "errors: use cpuguy83/strongerrors"

This reverts commit f031fc6d.

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* manager: remove redundant lister constraint

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: rename the pod event recorder

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: amend misleading comment

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* mock: add tracing

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: add tracing

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* test: observe timeouts

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* trace: remove unnecessary comments

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: limit concurrency in deleteDanglingPods

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: never store context, always pass in calls

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: remove HandleCrash and just panic

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: don't sync succeeded pods

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: ensure pod deletion from kubernetes

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-11-30 15:53:58 -08:00
Brian Goff
8cc888176a Merge pull request #387 from cpuguy83/ocagent_exporter
Add ocagent exporter
2018-11-06 16:20:55 -08:00
Brian Goff
bec818bf3c Do not close pod sync, use context cancel instead. (#402)
Closing the channel is racy and can lead to a panic on exit.
Instead rely on context cancellation to know if workers should exit.
2018-11-05 11:37:00 -08:00
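The pattern this commit describes can be sketched as follows. In Go, sending on a closed channel panics, so closing a work channel while producers may still be sending is a race; cancelling a context avoids it because the channel is never closed. `syncPods` and its parameters are hypothetical names for illustration, not the project's code.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// syncPods is a hypothetical stand-in for a pod sync loop. It starts worker
// goroutines, feeds them pod names over a channel, then cancels the context
// to stop them. The channel is never closed.
func syncPods(pods []string, workers int) int {
	if workers < 1 {
		workers = 1 // with no workers the sends below would block forever
	}
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	podCh := make(chan string) // unbuffered: a send returns once a worker has the item
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		synced int
	)

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return // cancellation is the only exit signal
				case <-podCh:
					mu.Lock()
					synced++
					mu.Unlock()
				}
			}
		}()
	}

	for _, p := range pods {
		podCh <- p
	}
	cancel()  // tell the workers to exit
	wg.Wait() // wait for in-flight work to finish
	return synced
}

func main() {
	fmt.Println(syncPods([]string{"a", "b", "c"}, 2)) // prints 3
}
```

A producer that may outlive the workers would also need to select on `ctx.Done()` around its send, otherwise it could block once the workers have exited.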
Brian Goff
f10596562d Use 1 worker by default
This is to work around an issue with concurrent workers and makes the
default config just work.
2018-11-02 12:35:26 -07:00
Brian Goff
143d9f71cc Add ocagent exporter
ocagent allows users to send traces out of VK to a "standard" external forwarder
(standard as in this is a format/protocol defined in the opencensus project).

This allows users to implement metrics for whatever backend they want
without having to add it into VK directly.
2018-10-31 14:31:13 -07:00
Robbie Zhang
4a7b74ed42 [VK] Use Cache controller and Make create/delete pod Concurrently (#373)
* Add k8s.io/client-go/tools/cache package

* Add cache controller

* Add pod creator and terminator

* Pod Synchronizer

* Clean up

* Add back reconcile

* Remove unnecessary space in log

* Incorporate feedback

* dep ensure

* Fix the syntax error

* Fix the merge errors

* Minor Refactor

* Set status

* Pass context together with the pod to the pod channel

* Change to use flag to specify the number of pod sync workers

* Remove the unused const

* Use Stable PROD Region WestUS in Test

EastUS2EUAP is not reliable
2018-10-16 17:20:02 -07:00
robbiezhang
055f5a2e01 Change the default taint effect to NoSchedule 2018-10-15 19:46:42 +00:00
Brian Goff
c1fe923131 Minor refactorings (#368)
* Split vkubelet functions into separate files.

* Minor re-org for cmd/census*

* refactor run loop
2018-10-12 17:36:37 -07:00
robbiezhang
626b346fcb Fix the issue of unable to set log-level 2018-10-09 23:04:42 +00:00
Brian Goff
ae49bbfd11 Fix filename typo s/cencus/census/ 2018-10-04 12:57:04 -07:00
Brian Goff
2fc82818ae Add tests for trace registry 2018-09-26 13:48:40 -07:00
Brian Goff
67c3922863 Add support for zpages 2018-09-26 13:48:40 -07:00
Brian Goff
682b2bccf8 Add support for tracing via OpenCensus
This adds a few flags for configuring the tracer.
Includes support for jaeger tracing (built into OC).
2018-09-26 13:48:40 -07:00
Brian Goff
083f6dee05 Refactor provider init (#360)
* Refactor provider init

This moves provider init out of vkubelet setup, instead preferring to
initialize vkubelet with a provider.

* Split API server configuration from setup.

This makes sure that configuration (which is done primarily through env
vars) is separate from actually standing up the servers.

This also makes sure to abort daemon initialization if the API servers
are not able to start.
2018-09-26 13:18:02 -07:00
Robbie Zhang
24ee86f1bb Merge from master (#328)
* Add default provider taint and taint configuration options

This allows for more specificity when setting taint tolerations for
workloads. Three new env variables are introduced:

VKUBELET_TAINT_KEY (defaults to `virtual-kubelet.io/provider`)
VKUBELET_TAINT_VALUE (defaults to provider name)
VKUBELET_TAINT_EFFECT (defaults to `NoSchedule`)

BREAKING CHANGES:
- The default taint key of `azure.com/aci` is now
  `virtual-kubelet.io/provider`.
- Specifying a custom taint key is now done via an environment variable
  rather than the `--taint` command line flag.

* Add back deprecated taint flag

TODO: Revert this commit

Related to #316

* Add darwin tag to not build for osx

* Darwin specific lookup file without cri and vic

* Fix chart notes template (#317)

Values were moved from env to top level.

* If --taint is specified, set the taint value to empty (#322)

Add the old tolerations to the examples to make it backward compatible during the switch

* Use standard logging package (#323)

* Update kubelet vendor to pull in stats API

* Add errgroup dep which will be used for ACI stats

* Add support for stats in ACI provider

This adds a new, optional, interface for providers that want to provide
stats.

* Don't use globals for API server

Refactors how HTTP servers are started and binds them to objects that
can store the provider rather than relying on a global.

* Fix merge conflict

* Fix couple errors
2018-09-07 18:46:49 -07:00
Brian Goff
e8abca0ac9 Add support for stats in ACI provider
This adds a new, optional, interface for providers that want to provide
stats.
2018-08-17 17:03:25 -07:00
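An optional provider interface like the one this commit mentions is usually consumed with a type assertion. The sketch below shows that pattern only; `Provider`, `StatsProvider`, `StatsSummary`, and the two provider types are hypothetical names, and the real interface and its return types may differ.

```go
package main

import "fmt"

// Provider is a hypothetical minimal provider interface.
type Provider interface {
	Name() string
}

// StatsProvider is a hypothetical optional interface. Providers that can
// report stats implement it; others simply leave it out.
type StatsProvider interface {
	StatsSummary() (string, error)
}

// basicProvider does not support stats.
type basicProvider struct{}

func (basicProvider) Name() string { return "basic" }

// statsCapableProvider opts in by implementing StatsSummary.
type statsCapableProvider struct{}

func (statsCapableProvider) Name() string { return "stats-capable" }

func (statsCapableProvider) StatsSummary() (string, error) {
	return "cpu=1 mem=2", nil
}

// statsFor reports whether p supports stats and, if so, returns its summary.
func statsFor(p Provider) (summary string, supported bool, err error) {
	sp, ok := p.(StatsProvider)
	if !ok {
		return "", false, nil
	}
	summary, err = sp.StatsSummary()
	return summary, true, err
}

func main() {
	for _, p := range []Provider{basicProvider{}, statsCapableProvider{}} {
		summary, supported, err := statsFor(p)
		fmt.Println(p.Name(), supported, summary, err)
	}
}
```

Existing providers keep compiling unchanged with this approach, since nothing is added to the base interface.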
Brian Goff
1e774a32b3 Use standard logging package (#323) 2018-08-17 16:50:24 -07:00
Jacob LeGrone
5115c1e5cd Add back deprecated taint flag
TODO: Revert this commit

Related to #316
2018-08-14 17:09:44 -07:00
Jacob LeGrone
d47a0b2fc0 Add default provider taint and taint configuration options
This allows for more specificity when setting taint tolerations for
workloads. Three new env variables are introduced:

VKUBELET_TAINT_KEY (defaults to `virtual-kubelet.io/provider`)
VKUBELET_TAINT_VALUE (defaults to provider name)
VKUBELET_TAINT_EFFECT (defaults to `NoSchedule`)

BREAKING CHANGES:
- The default taint key of `azure.com/aci` is now
  `virtual-kubelet.io/provider`.
- Specifying a custom taint key is now done via an environment variable
  rather than the `--taint` command line flag.
2018-08-14 17:09:44 -07:00
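The environment variables and defaults listed in this commit could be resolved along the lines of the sketch below. The variable names and default values come from the commit message; the `Taint` struct, `taintFromEnv`, the injected `lookup` function, and the choice to treat an empty value as unset are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// Taint is a hypothetical holder for the three configurable taint parts.
type Taint struct {
	Key, Value, Effect string
}

// taintFromEnv resolves the taint from environment variables, falling back
// to the defaults listed in the commit message. lookup is injected so the
// function can be exercised without touching the real environment.
// Assumption: an empty value is treated the same as an unset variable.
func taintFromEnv(lookup func(string) (string, bool), providerName string) Taint {
	get := func(name, def string) string {
		if v, ok := lookup(name); ok && v != "" {
			return v
		}
		return def
	}
	return Taint{
		Key:    get("VKUBELET_TAINT_KEY", "virtual-kubelet.io/provider"),
		Value:  get("VKUBELET_TAINT_VALUE", providerName),
		Effect: get("VKUBELET_TAINT_EFFECT", "NoSchedule"),
	}
}

func main() {
	// "azure" is an example provider name, not a value taken from the source.
	taint := taintFromEnv(os.LookupEnv, "azure")
	fmt.Printf("%s=%s:%s\n", taint.Key, taint.Value, taint.Effect)
}
```

Workloads that should land on the virtual node then need a toleration matching the resolved key, value, and effect.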
Liang Mingqiang
be77566e4b bugfix (#307) 2018-08-09 22:03:05 -07:00
Liang Mingqiang
f9c7af5ec9 read a section of config (#255) 2018-07-31 13:28:42 -07:00
Fei Xu
8068f3cac8 gofmt the project files (#205) 2018-05-18 16:13:34 -07:00
Ara Pulido
c002241e76 Fixed small typo in arg description 2017-12-29 11:40:36 -08:00
Kris Nova
9e31a727a6 Cleaning up program UX
- Remove toggle flag
- Update version message
- Update README.md with new changes
2017-12-09 06:56:16 -06:00
Brian Ketelsen
bfa5f4e1b9 documentation: cleanup cobra help, add environment and config information to README 2017-12-05 17:54:07 -06:00
Ria Bhatia
0075e5b0f3 Initial commit 2017-12-05 17:53:58 -06:00