Commit Graph

82 Commits

Author SHA1 Message Date
Paulo Pires
323c02d468 env: fix resource reference Optional nil pointer (#491)
Signed-off-by: Paulo Pires <pjpires@gmail.com>
2019-01-08 10:52:56 -08:00
Brian Goff
5796be449b Adds some package docs (#479)
Was just browsing godoc and noticed we are missing some docs that would
be quite useful.
2019-01-07 11:03:35 -08:00
Brian Goff
3ab101da00 Use timer instead of ticker (#477)
Tickers always tick, so if we tick every 5 seconds and the work that we
perform at each tick takes 5 seconds, we end up just looping with no
sleep period.

Instead this is using a timer to ensure we actually get a full 5 second
sleep between loops.

We should consider an async API instead of polling the provider like
this.
2018-12-21 15:48:47 -08:00
Brian Goff
0d14914e85 Refactor http server stuff (#466)
* Don't start things in New

* Move http server handling up to daemon.

This removes the burden of dealing with listeners, HTTP servers, etc. in
the core framework.

Instead provide helpers to attach the appropriate routes to the
caller's serve mux.

With this change, the vkubelet package only helps callers set up HTTP
rather than forcing a specific HTTP config on them.
2018-12-21 11:45:07 -08:00
Paulo Pires
5a0093ce31 vkubelet: set kubelet version to build version (#446)
* deps: bump to Kubernetes 1.13.1

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* version: new VK version

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-18 17:08:23 -08:00
Paulo Pires
4c80760079 tests: add "test/util" subpackage
Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-15 11:01:42 +00:00
Paulo Pires
8bcbbf58cd env: rename methods and improve readability
Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-15 11:01:41 +00:00
Paulo Pires
f839db4692 tests: envvars processing
Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-15 11:01:40 +00:00
Paulo Pires
103a19fe9d env: observe envFrom
Also observe initContainers env and envFrom.

Fixes #460
Fixes #461

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-15 11:01:40 +00:00
Paulo Pires
62b46d971c env: emit events for missing envvars
Fixes #465

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-12-15 11:01:36 +00:00
Tarun Pothulapati
fbae26fc11 env: fix pod envFrom processing 2018-12-12 13:18:39 +00:00
Paulo Pires
d73e563b97 Merge branch 'master' into stop_ticker 2018-12-12 12:36:20 +00:00
Brian Goff
616d12ed76 Remove old pod notification stuff
These are no longer used since we started using the k8s client's queue.
2018-12-10 13:40:21 -08:00
Brian Goff
e6ca19d059 Ensure reconcile ticker stops on shutdown
Otherwise this ticker could run forever (or until the process exits).
2018-12-10 10:33:36 -08:00
Brian Goff
ab7c55cb5f Make pod status updates concurrent. (#433)
This uses the same number of workers as the pod sync workers.

We may want to start a worker queue here instead, but I think for now
this is ok, particularly because we are limiting the number of
goroutines being spun up at once.
2018-12-04 14:03:45 -08:00
Paulo Pires
28a757f4da use shared informers and workqueue (#425)
* vendor: add vendored code

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* controller: use shared informers and a work queue

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* errors: use cpuguy83/strongerrors

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* aci: fix test that uses resource manager

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* readme: clarify skaffold run before e2e

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* cmd: use root context everywhere

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: refactor pod lifecycle management

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* e2e: fix race in test when observing deletions

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* e2e: test pod forced deletion

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* cmd: fix root context potential leak

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: rename metaKey

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: remove calls to HandleError

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* Revert "errors: use cpuguy83/strongerrors"

This reverts commit f031fc6d.

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* manager: remove redundant lister constraint

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: rename the pod event recorder

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: amend misleading comment

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* mock: add tracing

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: add tracing

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* test: observe timeouts

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* trace: remove unnecessary comments

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: limit concurrency in deleteDanglingPods

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: never store context, always pass in calls

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: remove HandleCrash and just panic

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: don't sync succeeded pods

Signed-off-by: Paulo Pires <pjpires@gmail.com>

* sync: ensure pod deletion from kubernetes

Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-11-30 15:53:58 -08:00
Paulo Pires
0f8ef994a3 sync: don't swallow delete errors
Signed-off-by: Paulo Pires <pjpires@gmail.com>
2018-11-28 20:31:55 +00:00
Brian Goff
aee1fde504 Fix a case where provider pod status is not found
Updates the pod status in Kubernetes to "Failed" when the pod status is
not found from the provider.

Note that currently most providers return `nil, nil` when a pod is
not found. This works but should possibly return a typed error so we can
determine if the error means not found or something else... but this
works as is so I haven't changed it.
2018-11-06 16:11:42 -08:00
Brian Goff
bec818bf3c Do not close pod sync, use context cancel instead. (#402)
Closing the channel is racy and can lead to a panic on exit.
Instead rely on context cancellation to know if workers should exit.
2018-11-05 11:37:00 -08:00
robbiezhang
966c76368f use %T instead of reflect.TypeOf 2018-10-18 20:06:03 +00:00
robbiezhang
a6bab6e3bb Fix the potential runtime type casting error 2018-10-18 19:15:05 +00:00
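The two commits above (a6bab6e3bb and 966c76368f) give no detail, so the following is only a guess at the kind of code involved: a checked type assertion that cannot panic, with the `%T` verb reporting the dynamic type in the failure path. The `pod` type and `describe` function are hypothetical.

```go
package main

import "fmt"

// pod is a stand-in type for this example only.
type pod struct{ name string }

// describe returns the pod name, or a message naming the dynamic type
// of an unexpected value.
func describe(obj interface{}) string {
	p, ok := obj.(*pod) // comma-ok form: no panic when obj is not a *pod
	if !ok {
		// %T prints the same type name as reflect.TypeOf(obj).String(),
		// and also handles a nil interface value.
		return fmt.Sprintf("unexpected type %T", obj)
	}
	return "pod " + p.name
}

func main() {
	fmt.Println(describe(&pod{name: "web"}))
	fmt.Println(describe(42))
	fmt.Println(describe(nil))
}
```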
Robbie Zhang
4a7b74ed42 [VK] Use Cache controller and Make create/delete pod Concurrently (#373)
* Add k8s.io/client-go/tools/cache package

* Add cache controller

* Add pod creator and terminator

* Pod Synchronizer

* Clean up

* Add back reconcile

* Remove unnecessary space in log

* Incorporate feedback

* dep ensure

* Fix the syntax error

* Fix the merge errors

* Minor Refactor

* Set status

* Pass context together with the pod to the pod channel

* Change to use flag to specify the number of pod sync workers

* Remove the unused const

* Use Stable PROD Region WestUS in Test

EastUS2EUAP is not reliable
2018-10-16 17:20:02 -07:00
Brian Goff
c1fe923131 Minor refactorings (#368)
* Split vkubelet functions into separate files.

* Minor re-org for cmd/census*

* refactor run loop
2018-10-12 17:36:37 -07:00
Brian Goff
682b2bccf8 Add support for tracing via OpenCensus
This adds a few flags for configuring the tracer.
Includes support for jaeger tracing (built into OC).
2018-09-26 13:48:40 -07:00
Brian Goff
083f6dee05 Refactor provider init (#360)
* Refactor provider init

This moves provider init out of vkubelet setup, instead preferring to
initialize vkubelet with a provider.

* Split API server configuration from setup.

This makes sure that configuration (which is done primarily through env
vars) is separate from actually standing up the servers.

This also makes sure to abort daemon initialization if the API servers
are not able to start.
2018-09-26 13:18:02 -07:00
Robbie Zhang
6b97713af3 Set the pod phase based on pod restart policy when provider failed (#361)
Update the resource manager to include the deleting pods in the GetPods function
2018-09-26 10:29:55 -07:00
Robbie Zhang
87acc00457 Merge branch 'master' into alicloud-eci 2018-09-24 12:33:19 -07:00
shidao-ytt
e9d17c23d3 Add Alibaba Cloud ECI Provider
Alibaba Cloud ECI (Elastic Container Instance) is a service that allows you
to run containers without having to manage servers or clusters.

This commit adds an ECI provider for virtual kubelet, connecting ECI with
a Kubernetes cluster.

Signed-off-by: xianwei.zw <xianwei.zw@alibaba-inc.com>
Signed-off-by: shidao.ytt <shidao.ytt@alibaba-inc.com>
2018-09-23 23:29:06 +08:00
Brian Goff
da5e24ef4d Move API handlers to separate package
This makes the package split a little cleaner and easier to import the
HTTP handlers for other consumers.
2018-09-18 11:08:24 -07:00
Brian Goff
74f76c75d5 Instrument handlers for logging/error handling
This refactors a bit of the http handler code.
Moves error handling for handler functions to a generic handler.
This also has a side-effect of being able to propagate errors from the
provider to send the correct status code, provided the error type
matches a pre-defined interface.
2018-09-17 16:54:24 -07:00
Brian Goff
8eb6ab4bcd Remove intermediate API server objects
Instead just generate HTTP handler functions directly.
2018-09-17 14:47:26 -07:00
Brian Goff
8091b089a2 Plumb context to providers 2018-09-13 13:49:26 -07:00
robbiezhang
4e20fc40ca Override the host in kubeconfig if MASTER_URI EnvVar is set 2018-09-10 12:56:50 -07:00
robbiezhang
0f54e1ed9c Bug fixes 2018-09-07 18:46:49 -07:00
Robbie Zhang
b019ec5549 Bug Fixes (#329) 2018-08-27 11:53:59 -07:00
Brian Goff
8de6693460 Don't use globals for API server
Refactors how HTTP servers are started and binds them to objects that
can store the provider rather than relying on a global.
2018-08-20 11:52:54 -07:00
Brian Goff
e8abca0ac9 Add support for stats in ACI provider
This adds a new, optional, interface for providers that want to provide
stats.
2018-08-17 17:03:25 -07:00
Brian Goff
1e774a32b3 Use standard logging package (#323) 2018-08-17 16:50:24 -07:00
Robbie Zhang
d7f97b9bfc If --taint is specified, set the taint value to empty (#322)
Add the old tolerations to the examples to keep them backward compatible during the switch
2018-08-15 17:44:51 -07:00
ramz
4875364f37 Merge branch 'master' into darwin-support 2018-08-15 15:38:35 +10:00
ageekymonk
662d1f07a8 Darwin specific lookup file without cri and vic 2018-08-15 15:26:10 +10:00
ageekymonk
eb60985bd7 Add darwin tag to not build for osx 2018-08-15 15:25:45 +10:00
Jacob LeGrone
5115c1e5cd Add back deprecated taint flag
TODO: Revert this commit

Related to #316
2018-08-14 17:09:44 -07:00
Jacob LeGrone
d47a0b2fc0 Add default provider taint and taint configuration options
This allows for more specificity when setting taint tolerations for
workloads. Three new env variables are introduced:

VKUBELET_TAINT_KEY (defaults to `virtual-kubelet.io/provider`)
VKUBELET_TAINT_VALUE (defaults to provider name)
VKUBELET_TAINT_EFFECT (defaults to `NoSchedule`)

BREAKING CHANGES:
- The default taint key of `azure.com/aci` is now
  `virtual-kubelet.io/provider`.
- Specifying a custom taint key is now done via an environment variable
  rather than the `--taint` command line flag.
2018-08-14 17:09:44 -07:00
Nick Maliwacki
bf02f887f0 Fix to build virtual-kubelet in windows 2018-08-09 18:31:35 -07:00
yaron2
36db5d9583 added Service Fabric Mesh provider 2018-07-31 16:00:56 -07:00
Robbie Zhang
3f83588e59 Reduce ACI API calls (#282)
* Reduce ACI API calls

Reduce reconcile calls and API calls in reconcile

* Fix the pod status update issue

* Revert a few unnecessary changes
2018-07-31 13:31:00 -07:00
Daniel Mueller
31a415c83a Fix bug in exec command retrieval (#265)
The exec command as extracted from the query comprises only the first
part of the command and does not include potentially supplied
parameters. E.g.,
  $ kubectl exec pod -- ls -t /usr
  > command: ls

This change fixes the problem by moving away from the Query.Get API.
  $ kubectl exec pod -- ls -t /usr
  > command: [ls -t /usr]
2018-07-25 11:54:22 -07:00
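The bug fixed in 31a415c83a follows from how Go's `url.Values.Get` works: it returns only the first value for a key, while indexing the map returns every value. The sketch below assumes the exec request repeats the `command` parameter once per argument; `execCommand` is a hypothetical helper, not the function from the commit.

```go
package main

import (
	"fmt"
	"net/url"
)

// execCommand parses a raw query string and returns both the first
// "command" value (what Get yields) and the full argument list.
func execCommand(rawQuery string) (first string, all []string, err error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", nil, err
	}
	// Get returns only the first value associated with the key.
	first = q.Get("command")
	// Indexing url.Values directly returns all values in order.
	all = q["command"]
	return first, all, nil
}

func main() {
	first, all, err := execCommand("command=ls&command=-t&command=%2Fusr")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("Get:   %q\n", first)
	fmt.Printf("index: %q\n", all)
}
```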
Robbie Zhang
6723b0d719 Register the Node when GetNode Returns NotFound (#254) 2018-07-11 14:57:09 -07:00
Eric Jadi
6543b0d410 Added missing providers to providers.go for build-time validation (#253)
* build all providers at compile time to validate interface implementation

* removed duplicate CRI provider initialization
2018-07-09 11:56:17 -07:00