virtual-kubelet

Author	SHA1	Message	Date
Brian Goff	bb9ff1adf3	Adds Done() and Err() to pod controller (#735) Allows callers to wait for pod controller exit in addition to readiness. This means the caller does not have to deal with handling errors from the pod controller running in a goroutine since it can wait for exit via `Done()` and check the error with `Err()`	2019-09-10 17:44:19 +01:00
Brian Goff	db146a0e01	Merge pull request #761 from sargun/cache-deps Cache Downloaded Go Modules	2019-09-06 15:20:37 -07:00
Ernest Wong	fdb0c805f7	Add more unit test to #584	2019-09-05 10:48:35 -07:00
Ernest Wong	dc7ff44303	Add unit tests for #584	2019-09-05 09:49:41 -07:00
Sargun Dhillon	e7a36c3505	Cache Downloaded Go Modules This caches the downloaded go modules. It invalidates them based on a hash of the go.mod, and go.sum. The test step showed a reduction from 1:30 -> 1:00, and the e2e tests from 8:30 to 5 minutes.	2019-09-05 09:23:13 -07:00
Ernest Wong	f10a16aed7	Importable End-To-End Test Suite (#758) * Rename VK to chewong for development purpose * Rename basic_test.go to basic.go * Add e2e.go and suite.go * Disable tests in node.go * End to end tests are now importable as a testing suite * Remove 'test' from test files * Add documentation * Rename chewong back to virtual-kubelet * Change 'Testing Suite' to 'Test Suite' * Add the ability to skip certain tests * Add unit tests for suite.go * Add README.md for importable e2e test suite * VK implementation has to be based on VK v1.0.0 * Stricter checks on validating test functions * Move certain files back to internal folder * Add WatchTimeout as a config field * Add slight modifications	2019-09-04 22:25:43 +01:00
Sargun Dhillon	da57373abb	Test pods going missing while they're running in legacy providers (#759) We poll legacy providers for their pod(s) status periodically. This is because we have no way of knowing when the pod is updated. If the pod somehow goes missing in the provider, that state must be handled. Currently, we update API server, and mark the pod as failed, or ignore it.	2019-09-04 22:16:14 +01:00
Sargun Dhillon	33df981904	Have NotifyPods store the pod status in a map (#751) We introduce a map that can be used to store the pod status. With this, we do not need to call GetPodStatus immediately after NotifyPods is called. Instead, we stash the pod passed via NotifyPods in a map we can access later. In addition to this, for legacy providers, the logic to merge the pod and the pod status is hoisted up to the loop. It prevents leaks by deleting the entry in the map as soon as the pod is deleted from k8s.	2019-09-04 20:14:34 +01:00
Brian Goff	ecf6e45bfc	Merge pull request #755 from sargun/fix-golang-lint Fix golang lint	2019-09-03 11:25:21 -07:00
Sargun Dhillon	3f85705461	Upgrade linter, and move away from incremental linting Incremental linting doesn't seem to catch issues correctly. This runs the linters in a more standard way.	2019-09-03 11:00:33 -07:00
Sargun Dhillon	7133a372d6	Mark current linting errors as non-errors This is basically claiming linting bankruptcy. It marks all of the issues we had up until this point as nolint.	2019-09-03 11:00:33 -07:00
Sargun Dhillon	5949e6279d	Miscellaneous cleanup for linting	2019-09-03 11:00:33 -07:00
Sargun Dhillon	9cce8640a5	Fix linting errors in node/pod_test.go This moves away from defining pods independently. It moves pod (spec) generation to an independent function.	2019-09-03 11:00:33 -07:00
Sargun Dhillon	7accddcaf4	Fix linting errors in node/podcontroller.go	2019-09-03 11:00:33 -07:00
Ernest Wong	ee31118596	Update docs on virtual-kubelet.io (#754) * Update website content * Add PodLifecycleHandler	2019-09-03 10:52:23 -07:00
Brian Goff	2507f57f97	Merge pull request #732 from sargun/move-around-reactor Move location of eventhandler registration	2019-09-03 10:44:52 -07:00
Sargun Dhillon	9a461a61ad	Bump the Circle CI build job to a resource_class of xlarge (#722)	2019-09-02 07:11:11 +01:00
Sargun Dhillon	9443e32ae7	Merge pull request #742 from sargun/fix-mock-provider Fix mock_test DeletePod to store updated pod status	2019-08-25 10:52:56 -07:00
Sargun Dhillon	43ee086360	Fix mock_test DeletePod to store updated pod status	2019-08-25 10:42:35 -07:00
Sargun Dhillon	0c6de30684	Merge pull request #746 from 928234269/patch2 fix typo in doc.go	2019-08-21 08:29:46 -07:00
928234269	7305c08d7e	fix typo in doc.go Signed-off-by: 928234269 <longfei.shang@daocloud.io>	2019-08-20 18:44:11 +08:00
Sargun Dhillon	ccb6713b86	Move location of eventhandler registration This moves the event handler registration until after the cache is in-sync. It makes it so we can use the log object from the context, rather than having to use the global logger. The race condition of the cache starting while the reactor is being added won't exist because we wait for the cache to start up / go in sync prior to adding it.	2019-08-18 08:20:49 -07:00
Brian Goff	2f2625c8e2	Merge pull request #734 from sargun/do-not-change-pods Do not mutate pods, nor hand off pod references to provider	2019-08-15 10:58:39 -07:00
Sargun Dhillon	69f1186713	Do not mutate pods, nor hand off pod references to provider This moves to a model where any time that pods are given to a provider, it uses a DeepCopy, as opposed to a reference. If the provider mutates the pod, it prevents it from causing issues with the informer cache. It has to use reflect instead of comparing the hashes because spew prints DeepCopy'd data structures ever so slightly differently.	2019-08-15 09:59:01 -07:00
Sargun Dhillon	89d88a17ed	Add a generic reactor to lifecycle_test to bump resource version (#733) All updates in our tests should have the behaviour that best reflects what API server does.	2019-08-15 08:46:38 +01:00
Brian Goff	cad19238fd	Merge pull request #736 from sargun/fix-race Wait for the informer to become in sync before starting tests	2019-08-14 11:44:21 -07:00
Sargun Dhillon	bc2f6e0dc4	Wait for the informer to become in sync before starting tests If the informers are starting at the same time as createPods, then we can get into a situation where the pod seems to get "lost". Instead, we wait for the informer to get into sync prior to the createpod event. This also moves to one informer as a microoptimization in the tests.	2019-08-14 07:03:53 -07:00
Brian Goff	47f5aa45df	Merge pull request #727 from ethan-daocloud/patch-2 cleanup: fix some typos in node.go	2019-08-13 12:00:43 -07:00
Sargun Dhillon	de238ee280	Merge pull request #731 from sargun/document-api Add documentation to the provider API about concurrency / mutability	2019-08-13 11:58:00 -07:00
Brian Goff	569706f371	Merge branch 'master' into document-api	2019-08-13 11:47:04 -07:00
Guangming Wang	cb307df71e	cleanup: fix some typos in node.go Signed-off-by: Guangming Wang <guangming.wang@daocloud.io>	2019-08-13 11:39:00 -07:00
Sargun Dhillon	40a4b54ca7	Merge pull request #728 from sargun/im-an-idiot Remove usage of atomics in tests	2019-08-13 11:34:55 -07:00
Sargun Dhillon	edc0991c0c	Fix hotloop around scheduling in lifecycle_test Lifecycle test had a hotloop, where it would run a never-yielding function while processing was going on elsewhere. This inserts a sleep. A sleep is used rather than a yield to be kind to people's battery life.	2019-08-13 11:25:21 -07:00
Sargun Dhillon	fbed4ca702	Remove usage of atomics It turns out that running atomic.Read(...) in a tight loop breaks Golang. The goroutine would never yield control over the scheduler, so we ended up getting into a situation where the test would get stuck forever. This moves to a different model, in which there is a condition var, instead of atomics in loops.	2019-08-13 11:25:21 -07:00
Sargun Dhillon	9b27eb83fe	Make mock_test follow the aforementioned documentation	2019-08-13 10:30:02 -07:00
Sargun Dhillon	3b3bf3ff20	Add documentation to the provider API about concurrency / mutability This adds documentation around what is allowed to be mutated and what may be accessed concurrently from the provider API. Previously, the API was ambiguous, and that meant providers could return pods and change them. This resulted in data races occurring.	2019-08-13 10:29:12 -07:00
Sargun Dhillon	75a399f6f4	Merge pull request #724 from sargun/upgrade-k8s-v2 Upgrade k8s	2019-08-13 03:08:37 -07:00
Pires	f0a0e8cbfe	Merge branch 'master' into upgrade-k8s-v2	2019-08-13 10:43:00 +01:00
Sargun Dhillon	32ff40eb56	Merge pull request #720 from sargun/set-test-timeout Set timeout for tests on CI to 9 minutes	2019-08-12 14:53:09 -07:00
Sargun Dhillon	65c5446c94	Set timeout for tests on CI to 9 minutes Right now, if the tests get stuck (on CI), they are terminated after 10 minutes. This means as well that we get 0 output about what went wrong. Instead, this triggers a panic after 9 minutes on CI.	2019-08-12 13:45:30 -07:00
Brian Goff	cafcdeeefa	Merge pull request #723 from sargun/lifecycle-test-fixes Array of minor fixups to lifecycle tests	2019-08-12 13:22:51 -07:00
Sargun Dhillon	5c2b682cdc	Array of minor fixups to lifecycle tests * Fix the deletion test to actually test the pod is deleted * Fix the update pods test to update a value which is allowed to be updated * Shut down watches after tests * Do not delete pod statuses on DeletePod in mock_test This intentionally leaks pod statuses, but it makes the situation a lot less complicated around handling race conditions with the GetPodStatus callback	2019-08-12 12:10:29 -07:00
Sargun Dhillon	e1c3bc3151	Merge pull request #725 from sargun/fix-race-conditions-in-node-test Fix race conditions in node_test	2019-08-12 11:43:06 -07:00
Sargun Dhillon	5ac33e4b0a	Fix race conditions in node_test	2019-08-12 11:33:48 -07:00
Sargun Dhillon	42656aae2f	Merge pull request #719 from ethan-daocloud/patch-1 cleanup: fix misspelled words in error message	2019-08-12 11:09:35 -07:00
Brian Goff	10b291dba1	Merge branch 'master' into patch-1	2019-08-12 10:48:15 -07:00
Brian Goff	9d90c599e7	Merge pull request #721 from sargun/fix-race-condition Fix race condition around worker ID generation in podcontroller.go	2019-08-12 10:43:32 -07:00
Sargun Dhillon	82de7f02c4	Upgrade Kubernetes e2e test cluster to 1.15.2	2019-08-12 10:30:04 -07:00
Sargun Dhillon	ad6cd7d552	Upgrade K8s * Upgrade k8s.io/api go get k8s.io/api@kubernetes-1.15.2 * Upgrade k8s.io/apimachinery go get k8s.io/apimachinery@kubernetes-1.15.2 * Upgrade k8s.io/client-go go get k8s.io/client-go@kubernetes-1.15.2 * Upgrade k8s.io/kubernetes to v1.15.2 go get k8s.io/kubernetes@v1.15.2 This also locks the dependency for github.com/prometheus/client_golang/prometheus due to a golang bug, and to please the validation scripts. The replaces were generated by: go get k8s.io/kubernetes@v1.15.2 2> fail for i in $(cat fail|grep unknown|cut -f1 -d@|cut -f2 -d" ") do echo "replace ${i} => ${i} kubernetes-1.15.2" done	2019-08-12 10:29:19 -07:00
Sargun Dhillon	a28969355e	Fix race condition around worker ID generation in podcontroller.go	2019-08-12 10:27:21 -07:00

814 Commits