Commit Graph

5 Commits

Author SHA1 Message Date
Sargun Dhillon
735eb34829 This adds the v1 lease controller
This refactors the v1 lease controller. It makes two functional differences
to the lease controller:
* It no longer ties lease updates to node pings or node status updates
* There is no fallback mechanism to status updates

This also moves vk_envtest, allowing for future brown-box testing of the
lease controller with envtest
2021-01-05 11:40:44 -08:00
Sargun Dhillon
11c63bca6f Refactor the way that the that node_ping_controller works
This moves node ping controller to using the new internal lock
API.

The reason for this is twofold:
* The channel approach that was used to notify other
  controllers of changes could only be used once (at startup),
  and couldn't be used in the future to broadcast node
  ping status. The idea idea is here that we could move
  to a sync.Cond style API and only wakeup other controllers
  on change, as opposed to constantly polling each other
* The problem with sync.Cond is that it's not context friendly.
  If we want to do stuff like wait on a sync.cond and use a context
  or a timer or similar, it doesn't work whereas this API allows
  context cancellations on condition change.

The idea is that as we have more controllers that act as centralized
sources of authority, they can broadcast out their state.
2020-12-03 11:40:01 -08:00
Sargun Dhillon
cf2d5264a5 Fix datarace in node ping controller 2020-09-21 23:38:43 -07:00
Sargun Dhillon
cd059d9755 Fix node ping interval code / default setting code
Change the place where we set the defaults for node ping
and node status interval. This problem manifested itself
by the node ping interval being 0 when it was set to
the default.

This makes two changes:
1. Invalid ping values, and ping timeouts will not
   allow VK to start up
2. We set the default values very early on in creation
   of the node controller -- where all the other values
   are set.

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2020-08-18 00:39:14 -07:00
Sargun Dhillon
d390dfce43 Move node pinging to its own goroutine
This moves the job of pinging the node provider into its own
goroutine. If it takes a long time, it shouldn't slow down
leases, and vice-versa.

It also adds timeouts for node pings. One of the problems
is that we don't know how long a node ping will take --
there could be a bunch of network calls underneath us.

The point of the lease is to say whether or not the
Kubelet is unreachable, not whether or not the node
pings are "passing".

Signed-off-by: Sargun Dhillon <sargun@sargun.me>
2020-08-03 10:57:37 -07:00