Commit Graph

514 Commits

Author SHA1 Message Date
Sargun Dhillon
9bf05b525d Remove setting taint during e2e test (#621)
We're in effect testing the K8s scheduler logic in the test
by setting taints, as opposed to the actual VK itself. If we
want to make sure the taint is set, we can just observe the node
object.

Instead, bind the pod to the VK node explicitly.
2019-05-17 10:49:37 -07:00
Sargun Dhillon
5b3190acb5 Cache go build artifacts (#619) 2019-05-14 16:29:21 -07:00
Brian Goff
e820c905b7 Remove unused Azure scripts (#618)
These were previously used for setting up azure creds for ACI specific
tests, but are no longer needed since that has moved to a separate repo.
2019-05-14 16:02:33 -07:00
Sargun Dhillon
63fa4e124b Add the /runningpods/ api endpoint (#611)
* Add the /runningpods/ api endpoint

This adds an API endpoint from the kubelet (/runningpods/). It is
an endpoint on kubelet which is considered a "debug" endpoint, so
it might be worth exposing through the options, but by default
it is exposed in most k8s configs AFAICT.
2019-05-13 15:10:31 -07:00
Sargun Dhillon
c50f33e701 Add tracing of the kubernetes cluster during testing (#608)
* Add tracing of the kubernetes cluster during testing

This adds tracing to the testing to get the kubelet's logs upon
failure. In addition, it keeps track of the pods, and the node
statuses throughout the test.

* Add arguments to make virtual kubelet's log more useful
2019-05-13 13:23:32 -07:00
Brian Goff
ae5e7953fe Merge pull request #614 from hectorj2f/hectorj2f/fix_gobin_path
Makefile: fix gobin path
2019-05-13 09:36:31 -07:00
Hector Fernandez
49e3cafa76 Makefile: fix gobin path 2019-05-11 13:08:21 +02:00
Brian Goff
024b9e10c6 Merge pull request #607 from sargun/fix-port-setting
Fix being able to set the VK's listening port
2019-05-09 16:22:54 -07:00
Brian Goff
616776f927 Merge pull request #609 from sargun/add-test-timeout
Add timeout to e2e tests
2019-05-09 16:19:59 -07:00
Brian Goff
c88770f2d0 Merge pull request #610 from sargun/fix-mock-logging
Fix formatting logging calls in the mock provider
2019-05-08 19:39:12 -07:00
Sargun Dhillon
0a6fc26064 Fix formatting logging calls in the mock provider
There were a bunch of logging calls in the mock provider which had
formatting in them, but didn't use the log-with-formatting method.
2019-05-08 17:37:20 -07:00
Sargun Dhillon
7d9350e3dd Add timeout to e2e tests
This adds a 5 minute timeout to the end-to-end tests. The end-to-end
tests typically run in under 2 minutes. On Circle-CI the timeout is
10 minutes, at which point, Circle CI just shoots the tests in the
head so we don't get any logs.
2019-05-08 17:24:56 -07:00
Sargun Dhillon
cdfb468f51 Fix being able to set the VK's listening port 2019-05-08 17:18:42 -07:00
Brian Goff
d6e945bb93 Merge pull request #574 from cpuguy83/streaming_logs
Use I/O stream for provider logs interface
2019-05-08 09:25:19 -07:00
Brian Goff
3cc051f7c2 Use I/O stream for provider logs interface
Providers must still update the implementation to actually gain any
benefit here, but this makes the provider interface a bit more sane.
2019-05-08 09:17:29 -07:00
Sargun Dhillon
ce5f049401 Add the ability to configure klog from VK (#595)
All of Kubernetes logging is based on klog. Klog currently does
not output any logging information to logrus, so you're flying
somewhat blind to Kubernetes internals.

This exposes the full set of configurables that klog offers,
but decorates (prefixes) the klog configuration with "klog".
2019-05-08 11:37:04 +01:00
Brian Goff
5b9ddadc69 Merge pull request #596 from sargun/error-on-zpages-failure
Do not swallow errors from the zpages server silently
2019-05-07 09:20:11 -07:00
Brian Goff
fad8f6d1d0 Merge branch 'master' into error-on-zpages-failure 2019-05-07 09:14:08 -07:00
Brian Goff
ac847cdb29 Merge pull request #606 from sargun/move-to-lowercase-logrus
Move to lowercase Sirupsen/logrus
2019-05-06 18:08:58 -07:00
Sargun Dhillon
5ef5910b2f re-vendor sirupsen/logrus 2019-05-06 17:26:08 -07:00
Sargun Dhillon
740bec9ea0 Remove old github.com/Sirupsen/logrus 2019-05-06 17:26:05 -07:00
Ben Corrie
99fddc23fe Merge pull request #605 from sflxn/remove-vic-provider
Remove VIC provider code.
2019-05-06 17:17:18 -07:00
sflxn
4feab78b76 Remove VIC provider code.
The VIC provider is stale and there are no developers working on this
anymore.  Removing the provider from the repo.
2019-05-06 16:52:53 -07:00
Sargun Dhillon
ab1d15a96a Do not swallow errors from the zpages server silently
If the zpages server exits for any reason, or is unable to bind,
rather than exiting silently, throw an error.
2019-05-06 11:10:17 -07:00
Kevin亓
cfa37871ab fix: fix a typo in azure/aci_test (#589)
Signed-off-by: KevinBetterQ <1093850932@qq.com>
2019-05-06 09:45:55 -07:00
Luc Perkins
0583b5c4fd Add local CNCF logo (#591)
* Add local CNCF logo

Signed-off-by: lucperkins <lucperkins@gmail.com>

* Update footer icon

Signed-off-by: lucperkins <lucperkins@gmail.com>
2019-05-06 09:44:57 -07:00
Sargun Dhillon
ef62defcea Run "make format" (#603)
There was some code that wasn't formatted according to gofmt. This
fixes that.
2019-05-06 09:26:10 -07:00
Sargun Dhillon
f1cb6a7bf6 Add the concept of startup timeout (#597)
This adds two concepts, where one encompasses the other.

Startup timeout
Startup timeout is how long to wait for the entire kubelet
to get into a functional state. Right now, this only waits
for the pod informer cache for the pod controller to become
in-sync with API server, but this could be extended to other
informers (like secrets informer).

Wait For Startup
This changes the behaviour of the virtual kubelet to wait
for the pod controller to start before registering the node.

It is to avoid the race condition where the node is registered,
but we cannot actually do any pod operations.
2019-05-06 09:25:00 -07:00
Sargun Dhillon
74a16f7f9a Use gobin to fix version numbers of tools (#598)
I ran into a bunch of problems running goreleaser, and
some differences with goimports. This locks the versions
to versions that appear to work.

The goimports version is newer than the latest version run
on the repo, but it matches the version of Go used on the rest of
the project.
2019-05-06 09:03:44 -07:00
Stuart Leeks
eb87db8731 typo (#588) 2019-05-01 08:06:30 -07:00
Luc Perkins
aebc81ec1c Add local logos (#585)
* Add local logos

Signed-off-by: lucperkins <lucperkins@gmail.com>

* Fix Hugo version name in Netlify config

Signed-off-by: lucperkins <lucperkins@gmail.com>

* Fix URL for CNCF logo

Signed-off-by: lucperkins <lucperkins@gmail.com>
2019-04-26 17:11:31 -07:00
Brian Goff
d809dff289 Refactor exec interface (#578)
This removes the dependence on remotecommand in providers as well as the
need to expose provider IDs for the sake of the ExecInContainer API.
2019-04-26 12:57:56 -07:00
Brian Goff
449eb3bb7d Fix exec parameter parsing (#580)
Exec seems to be broken by ad6fbba806
This change basically copies what's in remotecommand.NewOptions, just
without the logging.
2019-04-25 15:51:53 -07:00
Sargun Dhillon
3da9b0d105 Stop using deprecated method Clientset.Coordination (#581)
Clientset.Coordination is deprecated. We are meant to use the specific
version of the client: CoordinationV1beta1. Clientset.Coordination is
going to be removed in future versions of the client API.
2019-04-23 14:17:15 -07:00
Jeremy Rickard
45d2ef06b2 Update ACI liveness/readiness probe handling to work with named ports (#333)
* Update ACI liveness/readiness probe handling to work with named ports
2019-04-23 11:43:48 -07:00
Brian Goff
ceb9b16c5c Don't set cancel function to nil on error (#579)
When setting up the http server we return a cancel function to close all
the listeners down.
The issue here is we set the cancel function to nil and thereby cause a
panic when there is an error and the `defer` attempts to call cancel.

This fix just doesn't set a named return value for the cancel function to
make sure we don't overwrite it with a `return nil, err`.
This ensures that the `defer` can still call `cancel()`.
2019-04-22 10:31:06 -07:00
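The failure mode described here follows from how Go handles named results: `return nil, err` assigns `nil` to the named result before deferred functions run. A minimal sketch of the before and after, with hypothetical function names (`buggySetup`, `fixedSetup`) and no real HTTP server:

```go
package main

import (
	"errors"
	"fmt"
)

var errListen = errors.New("listen failed")

// buggySetup names the cancel result. `return nil, errListen` first
// assigns nil to cancel, then runs the deferred function, which
// calls the now-nil func and panics.
func buggySetup() (cancel func(), retErr error) {
	cancel = func() {}
	defer func() {
		if retErr != nil {
			cancel() // panics: cancel was overwritten with nil
		}
	}()
	return nil, errListen
}

// fixedSetup keeps cancel in a local variable and leaves the first
// result unnamed, so `return nil, errListen` cannot overwrite it.
func fixedSetup(closed *bool) (_ func(), retErr error) {
	cancel := func() { *closed = true }
	defer func() {
		if retErr != nil {
			cancel()
		}
	}()
	return nil, errListen
}

// panics reports whether calling fn panics.
func panics(fn func()) (p bool) {
	defer func() { p = recover() != nil }()
	fn()
	return false
}

func main() {
	fmt.Println(panics(func() { buggySetup() })) // true

	var closed bool
	_, err := fixedSetup(&closed)
	fmt.Println(closed, err) // true listen failed
}
```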
Brian Goff
8d0b843ae4 Refactor CLI initialization (#562)
This cleans up the CLI code significantly.
Also makes some of this re-usable for providers who want to do so.

This also removes the main.go from the top of the tree of the repo,
instead moving it into cmd/virtual-kubelet.
This allows us to better utilize the package namespace (and e.g. mv the
`vkubelet` package to the top of the tree).
2019-04-19 17:02:39 -07:00
Hector Fernandez
d3f13cc6ff providers: fix string format (#575) 2019-04-19 08:10:03 -07:00
Yash Desai
de32752395 Set container env var using services. (#573)
* Introduce service env vars.
2019-04-17 11:30:39 -07:00
Brian Goff
6cb323eec2 More Makefile enhancements (#569)
Allows us to make use of make's target deps instead of re-execing
make in our build target just for custom, one-shot environment changes.

Keeps e2e bin in bin/e2e/virtual-kubelet.
2019-04-15 16:03:45 -07:00
Hongbin Lu
2521ec1cce Add documentation for OpenStack provider (#570)
* Add documentation for OpenStack provider

Signed-off-by: Hongbin Lu <hongbin034@gmail.com>

* Add maintainer for OpenStack provider

Signed-off-by: Hongbin Lu <hongbin034@gmail.com>
2019-04-08 14:31:39 -07:00
Brian Goff
686cdb8b36 uncomment skaffold/delete on e2e.clean
This was commented while I was testing and I forgot to uncomment.
v0.9.0
2019-04-04 10:13:07 -07:00
Anubhav Mishra
455b0cc4a6 Adding HashiCorp Nomad homepage links (#567)
* Add nomad link to readme.
2019-04-03 23:08:26 -07:00
Brian Goff
261359d20e Merge pull request #564 from cpuguy83/fix_version_on_node_create
Fix node create after delete
2019-04-03 23:06:02 -07:00
Brian Goff
99c07d487e Fix node create after delete
node.ResourceVersion must not be set when creating a node.
This issue prevents vk from resolving issues after the vk node instance
has been deleted (for whatever reason).
2019-04-03 22:57:11 -07:00
Brian Goff
af06b005b2 Fix some cases with e2e targets
The e2e targets were not setup correctly preventing some variables from
being set.
2019-04-03 22:57:11 -07:00
Ria Bhatia
9dc78bd4d3 adding virtual kubelet 2019 roadmap (#473)
* adding doc folder

* initial draft

* initial draft-2

* initial draft-3

* initial draft-4

* adding testing and use cases

* format
2019-04-03 14:09:40 -07:00
Yash Desai
750de3195d Resource manager: add service lister and remove unused lock. (#559)
* Remove unused lock from the resource manager.

* Add service lister to the resource manager.

This change adds a service lister in the
resource manager.
This will be used to set the service env vars.
Also added a List method to the resource manager
and a simple test to confirm it's a pass through.
2019-04-03 11:19:30 -07:00
Yash Desai
85292ef4ef Patch the node status instead of updating it. (#557)
* Patch the node status instead of updating it.

Virtual-kubelet updates the node status periodically.
This change proposes we use the `Patch` API instead of `Update`,
to update the node status.
This avoids overwriting any node updates made by other controllers
in the system, for example an attach-detach controller.
Patch API does a strategic merge instead of overwriting
the entire object, which ensures parallel updates don't overwrite
each other.

Note: `PatchNodeStatus` reduces the time precision to the seconds-level
and therefore I corrected the test for this.

Consider two controllers:
CONTROLLER 1 (virtual kubelet)                       | CONTROLLER 2
oldNode := Nodes.Get(nodename)                       |
                                                     | node := Nodes.Get(nodename)
                                                     | // update status with attached volumes info
                                                     | updateNode := Nodes.UpdateStatus(node)
// update vkubelet info on node status               |
latestNode := Nodes.UpdateStatus(oldNode)            |
<-- latestNode does not contain the volume info added by second controller.

With my patch change:

CONTROLLER 1 (virtual kubelet)                       | CONTROLLER 2
oldNode := Nodes.Get(nodename)                       |
                                                     | node := Nodes.Get(nodename)
                                                     | // update status with attached volumes info
                                                     | updateNode := Nodes.UpdateStatus(node)
node := oldNode.DeepCopy()                           |
// update vkubelet info on node status               |
latestNode := util.PatchNodeStatus(oldNode, node)    |
<-- latestNode contains the volume info added by second controller.

Testing Done: make test

* Introduce PatchNodeStatus into vkubelet.

* Pass only the node interface.
2019-04-03 10:40:57 -07:00
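The interleaving in this commit message can be replayed against a toy in-memory model. This is a sketch only: `Server`, `Status`, and `race` are hypothetical names, the status is flattened to a string map, and the field-by-field `Patch` below only approximates the idea of a merge patch. It does not use or reproduce the Kubernetes client API or `PatchNodeStatus`.

```go
package main

import "fmt"

// Status is a toy stand-in for a node's status: a flat set of fields.
type Status map[string]string

func clone(s Status) Status {
	c := make(Status, len(s))
	for k, v := range s {
		c[k] = v
	}
	return c
}

// Server is a toy in-memory stand-in for the API server, holding one node.
type Server struct{ status Status }

func (s *Server) Get() Status { return clone(s.status) }

// Update replaces the stored status wholesale with the caller's copy.
func (s *Server) Update(n Status) Status {
	s.status = clone(n)
	return s.Get()
}

// Patch applies only the fields that differ between old and updated,
// leaving fields written by other controllers untouched.
func (s *Server) Patch(old, updated Status) Status {
	for k, v := range updated {
		if ov, ok := old[k]; !ok || ov != v {
			s.status[k] = v
		}
	}
	for k := range old {
		if _, ok := updated[k]; !ok {
			delete(s.status, k)
		}
	}
	return s.Get()
}

// race replays the interleaving: controller 1 reads, controller 2
// reads and writes volume info, then controller 1 writes its update.
func race(usePatch bool) Status {
	srv := &Server{status: Status{"heartbeat": "t0"}}
	oldNode := srv.Get() // controller 1 reads

	other := srv.Get() // controller 2 reads
	other["volumesAttached"] = "vol-1"
	srv.Update(other)

	node := clone(oldNode)
	node["heartbeat"] = "t1"
	if usePatch {
		return srv.Patch(oldNode, node)
	}
	return srv.Update(node)
}

func main() {
	fmt.Println(race(false)) // map[heartbeat:t1]
	fmt.Println(race(true))  // map[heartbeat:t1 volumesAttached:vol-1]
}
```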
Vipin Duleb
bab9c59ac8 GPU support in ACI provider (#563)
* GPU support in ACI provider
2019-04-02 18:11:35 -07:00