This uses the same number of workers as the pod sync workers.
We may want to start a worker queue here instead, but I think for now
this is ok, particularly because we are limiting the number of
goroutines being spun up at once.
* vendor: add vendored code
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* controller: use shared informers and a work queue
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* errors: use cpuguy83/strongerrors
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* aci: fix test that uses resource manager
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* readme: clarify skaffold run before e2e
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* cmd: use root context everywhere
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: refactor pod lifecycle management
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* e2e: fix race in test when observing deletions
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* e2e: test pod forced deletion
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* cmd: fix root context potential leak
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: rename metaKey
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: remove calls to HandleError
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* Revert "errors: use cpuguy83/strongerrors"
This reverts commit f031fc6d.
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* manager: remove redundant lister constraint
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: rename the pod event recorder
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: amend misleading comment
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* mock: add tracing
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: add tracing
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* test: observe timeouts
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* trace: remove unnecessary comments
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: limit concurrency in deleteDanglingPods
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: never store context, always pass in calls
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: remove HandleCrash and just panic
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: don't sync succeeded pods
Signed-off-by: Paulo Pires <pjpires@gmail.com>
* sync: ensure pod deletion from kubernetes
Signed-off-by: Paulo Pires <pjpires@gmail.com>
Updates the pod status in Kubernetes to "Failed" when the pod status is
not found from the provider.
Note that currently most providers return `nil, nil` when a pod is
not found. This works but should possibly return a typed error so we can
determine if the error means not found or something else... but this
works as is so I haven't changed it.
* Add k8s.io/client-go/tools/cache package
* Add cache controller
* Add pod creator and terminator
* Pod Synchronizer
* Clean up
* Add back reconcile
* Remove unnecessary space in log
* Incorporate feedback
* dep ensure
* Fix the syntax error
* Fix the merge errors
* Minor Refactor
* Set status
* Pass context together with the pod to the pod channel
* Change to use flag to specify the number of pod sync workers
* Remove the unused const
* Use Stable PROD Region WestUS in Test
EastUS2EUAP is not reliable
* Refactor provider init
This moves provider init out of vkubelet setup, instead preferring to
initialize vkubelet with a provider.
* Split API server configuration from setup.
This makes sure that configuration (which is done primarily through env
vars) is separate from actually standing up the servers.
This also makes sure to abort daemon initialization if the API servers
are not able to start.
Alibaba Cloud ECI (Elastic Container Instance) is a service that allows you
to run containers without having to manage servers or clusters.
This commit adds an ECI provider for virtual kubelet, connecting ECI with a
Kubernetes cluster.
Signed-off-by: xianwei.zw <xianwei.zw@alibaba-inc.com>
Signed-off-by: shidao.ytt <shidao.ytt@alibaba-inc.com>
This refactors a bit of the http handler code.
Moves error handling for handler functions to a generic handler.
This also has a side-effect of being able to propagate errors from the
provider to send the correct status code, provided the error type
matches a pre-defined interface.
This allows for more specificity when setting taint tolerations for
workloads. Three new env variables are introduced:
VKUBELET_TAINT_KEY (defaults to `virtual-kubelet.io/provider`)
VKUBELET_TAINT_VALUE (defaults to provider name)
VKUBELET_TAINT_EFFECT (defaults to `NoSchedule`)
BREAKING CHANGES:
- The default taint key of `azure.com/aci` is now
`virtual-kubelet.io/provider`.
- Specifying a custom taint key is now done via an environment variable
rather than the `--taint` command line flag.
The exec command as extracted from the query comprises only the first
part of the command and does not include potentially supplied
parameters. E.g.,
$ kubectl exec pod -- ls -t /usr
> command: ls
This change fixes the problem by moving away from the Query.Get API.
$ kubectl exec pod -- ls -t /usr
> command: [ls -t /usr]
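
The difference can be reproduced with the standard library alone. This sketch assumes the exec request repeats the `command` query parameter once per argument; `execCommand` is a hypothetical helper, not the handler's real code:

```go
package main

import (
	"fmt"
	"net/url"
)

// execCommand parses a raw query string and returns both what Get reports
// (the first value only) and the full list of values for "command".
func execCommand(rawQuery string) (first string, all []string, err error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", nil, err
	}
	// url.Values.Get returns only the first value for a key; indexing the
	// underlying map returns every value in order.
	return q.Get("command"), q["command"], nil
}

func main() {
	first, all, err := execCommand("command=ls&command=-t&command=%2Fusr")
	if err != nil {
		panic(err)
	}
	fmt.Println("Get:  ", first)
	fmt.Println("index:", all)
}
```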
* Started work on provider
* WIP Adding batch provider
* Working basic call into pool client. Need to parameterize the baseurl
* Fixed job creation by manipulating the content-type
* WIP Kicking off containers. Dirty
* [wip] More meat around scheduling simple containers.
* Working on basic task wrapper to co-schedule pods
* WIP on task wrapper
* WIP
* Working pod minimal wrapper for batch
* Integrate pod template code into provider
* Cleaning up
* Move to docker without gpu
* WIP batch integration
* partially working
* Working logs
* Tidy code
* WIP: Testing and readme
* Added readme and terraform deployment for GPU Azure Batch pool.
* Update to enable low priority nodes for gpu
* Fix log formatting bug. Return node logs when container not yet started
* Moved to golang v1.10
* Fix cri test
* Fix up minor docs issue. Add provider to readme. Add var for vk image.
* Add Virtual Kubelet provider for VIC
Initial virtual kubelet provider for VMware VIC. This provider currently
handles creating and starting a pod VM via the VIC portlayer and persona
server. Image store handling is done via the VIC persona server. This
provider currently requires the feature/wolfpack branch of VIC.
* Added pod stop and delete. Also added node capacity.
Added the ability to stop and delete pod VMs via VIC. Also retrieve
node capacity information from the VCH.
* Cleanup and readme file
Some file clean up and added a Readme.md markdown file for the VIC
provider.
* Cleaned up errors, added function comments, moved operation code
1. Cleaned up error handling. Set standard for creating errors.
2. Added method prototype comments for all interface functions.
3. Moved PodCreator, PodStarter, PodStopper, and PodDeleter to a new folder.
* Add mocking code and unit tests for podcache, podcreator, and podstarter
Used the unit test framework used in VIC to handle assertions in the provider's
unit tests. Mocking code was generated using the OSS project mockery, which is
compatible with the testify assertion framework.
* Vendored packages for the VIC provider
Requires the feature/wolfpack branch of VIC and a few specific commit SHAs of
projects used within VIC.
* Implementation of POD Stopper and Deleter unit tests (#4)
* Updated files for initial PR