VMware vSphere Integrated Containers provider (#206)

* Add Virtual Kubelet provider for VIC

Initial Virtual Kubelet provider for VMware VIC.  This provider currently
handles creating and starting a pod VM via the VIC portlayer and persona
server.  Image store handling is done via the VIC persona server.  This
provider currently requires the feature/wolfpack branch of VIC.

* Added pod stop and delete.  Also added node capacity.

Added the ability to stop and delete pod VMs via VIC.  Also retrieve
node capacity information from the VCH.

* Cleanup and readme file

Some file clean up and added a Readme.md markdown file for the VIC
provider.

* Cleaned up errors, added function comments, moved operation code

1. Cleaned up error handling.  Set standard for creating errors.
2. Added method prototype comments for all interface functions.
3. Moved PodCreator, PodStarter, PodStopper, and PodDeleter to a new folder.

* Add mocking code and unit tests for podcache, podcreator, and podstarter

Used the unit test framework used in VIC to handle assertions in the provider's
unit tests.  Mocking code was generated using the OSS project mockery, which is
compatible with the testify assertion framework.

* Vendored packages for the VIC provider

Requires the feature/wolfpack branch of VIC and a few specific commit SHAs of
projects used within VIC.

* Implementation of POD Stopper and Deleter unit tests (#4)

* Updated files for initial PR
Loc Nguyen
2018-06-04 15:41:32 -07:00
committed by Ria Bhatia
parent 98a111e8b7
commit 513cebe7b7
6296 changed files with 1123685 additions and 8 deletions

vendor/github.com/vmware/vic/doc/design/arch/arch.md

@@ -0,0 +1,21 @@
### vSphere Integrated Containers Architecture
#### Overview
VIC is a product designed to tightly integrate container workflow, lifecycle and provisioning with the vSphere SDDC. In VIC, a container is a hardware-virtualized first-class citizen on the hypervisor provisioned into a _Virtual Container Host_ (VCH) and able to directly integrate with vSphere infrastructure capabilities, such as networking and storage features.
[Learn more about the differences between the VIC model and a traditional software-virtualized container](vic-container-abstraction.md)
The architecture of VIC is designed to allow for significant modularity and flexibility and includes the following key components:
##### Port Layer Abstractions
vSphere currently lacks the notion of container primitives and abstractions through which they can be manipulated. It has a rich API with bindings for various languages (e.g. [govmomi](https://github.com/vmware/govmomi)) but these are all necessarily oriented around the notion of a VM.
While it would be possible to write a rudimentary VIC-like container engine by driving the vSphere APIs directly from within a daemon of some kind, the tight coupling between the low-level vSphere calls and the high-level daemon API would result in very little re-usable code and a monolith that is potentially difficult to maintain. An API layer that encapsulates low-level container primitives and is both container engine and operating system agnostic would be preferable.
A secondary benefit of such an API is that it could easily be extended for compatibility with emerging standards which operate at a similar layer, such as [runc](https://github.com/opencontainers/runc).
The Port Layer is designed in such a way that the libraries can be built into static binaries or remotable services. They can be combined together into a single service endpoint or distributed for greater flexibility.
[Learn more about the Port Layer](vic-port-layer-overview.md)


@@ -0,0 +1,4 @@
# self-provisioning proxy
# vmomi proxy agent


@@ -0,0 +1,58 @@
### The VIC Container Abstraction
VIC provisions containers _as_ VMs, rather than _in_ VMs. In understanding the VIC container abstraction, it is helpful to compare and contrast against the virtualization of a traditional container host.
#### Traditional Container Host
Let's take a Linux VM running Docker as an example. The container host is a VM running a Linux OS with the necessary libraries, kernel version and daemon installed. The container host will have a fixed amount of memory and vCPU resource that can be used by the containers provisioned into it.
The container host operating system along with the Docker daemon have to provide the following:
* **The control plane** - an endpoint through which control operations are performed, executing in the same OS as it is controlling
* **A container abstraction** - library extensions to the guest OS need to provide a private namespace and resource constraints
* **Network virtualization** - simple bridge networking or overlay networking
* **Layered filesystem** - not an absolute requirement for a container, but typically conflated in most implementations
* **OS Kernel** - a dependency for the container executable to execute on, typically shared between containers
The hypervisor in this mode provides hardware virtualization of the entire container host VM, one or more VMDKs providing local disk for the OS, one or more vNICs to provide network connectivity for the OS and possibly paravirtualization capabilities allowing the containers to directly access hypervisor infrastructure.
#### The VIC model
VIC containers operate quite differently. In the above model, it would be reasonable to describe a container as being run _in_ a VM. In the VIC model, a container is run _as_ a VM. For the purposes of this project, we will refer to this as a _containerVM_.
So what does this mean in practice? Well, firstly a container host isn't a VM, it's a resource pool - this is why we call it a _Virtual_ Container Host. It's an abstract dynamically-configurable resource boundary into which containers can be provisioned. As for the other functions highlighted above:
* **The control plane** - functionally the same endpoint as above, but controlling vSphere and running in its own OS
* **A container abstraction** - is a VM. A VM provides resource constraints and a private namespace, like a container
* **Network virtualization** - provided entirely by vSphere. NSX, distributed port-groups. Each container gets a vNIC
* **Layered filesystem** - provided entirely by vSphere. VMDK snapshots in the initial release
* **OS Kernel** - provided as a minimal ISO from which the containerVM is either booted or forked
In this mode, there is necessarily a 1:1 coupling between a container and a VM. A container image is attached to the VM as a disk, the VM is either booted or forked from the kernel ISO, then the containerVM chroots into the container filesystem effectively becoming the container.
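
The start sequence described above can be sketched as an ordered plan. This is only an illustration of the ordering: the function name `startPlan` and the step labels are hypothetical and are not VIC API calls.

```go
package main

import "fmt"

// startPlan lists, in order, the steps described above for bringing up one
// containerVM. The step strings are illustrative labels, not VIC API calls.
func startPlan(imageDisk string, fork bool) []string {
	// The VM is either booted or forked from the kernel ISO.
	launch := "boot VM from kernel ISO"
	if fork {
		launch = "fork VM from kernel ISO"
	}
	return []string{
		"attach image disk " + imageDisk,
		launch,
		"chroot into container filesystem",
	}
}

func main() {
	// "busybox.vmdk" is a made-up disk name used only for this example.
	for i, step := range startPlan("busybox.vmdk", false) {
		fmt.Printf("%d. %s\n", i+1, step)
	}
}
```

Because there is one VM per container, the plan is always built per container and never shared between containers.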
#### Differences
This model leads to some very distinct differences between a VIC container and a traditional container, none of which impact the portability of the container abstraction between these systems, but which are important to understand.
##### Container
1. There is no default shared filesystem between the container and its host
   * Volumes are attached to the container as disks and are completely isolated from each other
   * A shared filesystem could be provided by something like an NFS volume driver
2. The way that you do low-level management and monitoring of a container is different. There is no VCH shell.
   * Any API-level control plane query, such as `docker ps`, works as expected
   * Low-level management and monitoring uses exactly the same tools and processes as for a VM
3. The kernel running in the container is not shared with any other container
   * This means that there is no such thing as an optional _privileged_ mode. Every container is privileged and fully isolated.
   * When a containerVM kernel is forked rather than booted, much of its immutable memory is shared with a parent _template_
4. There is no such thing as unspecified memory or CPU limits
   * A Linux container will have access to all of the CPU and memory resource available in its host if no limits are specified
   * A containerVM must have memory and CPU limits defined, either derived from a default or specified explicitly
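
Point 4 can be illustrated with a minimal sketch that fills in any unspecified limit from a default. The `Limits` type, the `resolveLimits` function and the default values are assumptions made for this example; they are not taken from the VIC codebase.

```go
package main

import "fmt"

// Limits holds the CPU and memory ceiling for one containerVM.
// Field names and units are illustrative, not taken from VIC.
type Limits struct {
	VCPUs    int // number of virtual CPUs
	MemoryMB int // memory in MiB
}

// defaultLimits is a hypothetical fallback; a real VCH may use other values.
var defaultLimits = Limits{VCPUs: 2, MemoryMB: 2048}

// resolveLimits fills any unspecified (zero or negative) field from the
// defaults, so the result always has both limits defined.
func resolveLimits(requested Limits) Limits {
	resolved := requested
	if resolved.VCPUs <= 0 {
		resolved.VCPUs = defaultLimits.VCPUs
	}
	if resolved.MemoryMB <= 0 {
		resolved.MemoryMB = defaultLimits.MemoryMB
	}
	return resolved
}

func main() {
	fmt.Printf("%+v\n", resolveLimits(Limits{}))              // nothing specified
	fmt.Printf("%+v\n", resolveLimits(Limits{MemoryMB: 512})) // memory only
}
```

The point of the sketch is that "unspecified" never reaches the VM layer: every request is resolved to concrete numbers before a containerVM is created.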
##### Virtual Container Host
A container host in VIC is a _Virtual_ Container Host (VCH). A VCH is not in itself a VM - it is an abstract dynamic resource boundary that is defined and controlled by vSphere into which containerVMs can be provisioned. As such, a VCH can be a subset of a physical host or a subset of a cluster of hosts.
However, a container host also represents an API endpoint with an isolated namespace for accessing the control plane, so a functionally equivalent service must be provisioned to the vSphere infrastructure that provides the same endpoint for each VCH. There are various ways in which such a service could be deployed, but the simplest representation is to run it in a VM.
Given that a VCH in many cases will represent a subset of resource from a cluster of physical hosts, it is actually closer in concept to something like Docker Swarm than a traditional container host.
There are also necessarily implementation differences, transparent to the user, which are required to support this abstraction. For example, given that a container is entirely isolated from other containers and its host is just an abstract resource boundary, any control operations performed within the container - launching processes, streaming stdout/stderr, setting environment variables, network specialization - must be done either by modifying the container image disk before it is attached, or through a special control channel embedded in the container (see [Tether](vic-port-layer-overview.md#the-tether-process)).


@@ -0,0 +1,43 @@
#### Port Layer Abstractions
The Port Layer abstractions in VIC are designed to augment the vSphere APIs with low-level container primitives from which a simple container engine could be implemented. The design criteria of the Port Layer are as follows:
* The Port Layer should be primarily oriented around the notion of _isolation domains_. It should provide the means to easily express rich and flexible criteria for isolating containers and their resources, without being explicit about the mechanism through which this should be achieved.
* The Port Layer is designed to be invoked by higher-level software abstraction. It is not designed to be exposed directly to users.
* The Port Layer should be developed as Open Source Software to allow for 3rd party integration
* The Port Layer should be container engine and operating system agnostic
* The Port Layer should be designed in such a way as to optimize control plane performance
* The Port Layer should ensure a single source of truth for all state, e.g. VM power-off == container stop
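
The last criterion can be sketched as deriving container state from VM power state instead of storing it separately. The power state strings follow the vSphere `VirtualMachinePowerState` values; the container state names and the mapping of a suspended VM to "paused" are assumptions made for this example.

```go
package main

import "fmt"

// PowerState mirrors the vSphere VirtualMachinePowerState values.
type PowerState string

const (
	PoweredOn  PowerState = "poweredOn"
	PoweredOff PowerState = "poweredOff"
	Suspended  PowerState = "suspended"
)

// containerState derives the container state from the VM power state, so
// the VM remains the only place where state is recorded.
func containerState(p PowerState) string {
	switch p {
	case PoweredOn:
		return "running"
	case PoweredOff:
		return "stopped"
	case Suspended:
		return "paused" // assumption: the source only states power-off == stop
	default:
		return "unknown"
	}
}

func main() {
	for _, p := range []PowerState{PoweredOn, PoweredOff, Suspended} {
		fmt.Printf("%s -> %s\n", p, containerState(p))
	}
}
```

With this shape there is nothing to reconcile: if an administrator powers off the VM through vSphere directly, the next query reports the container as stopped.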
From an architectural perspective, the Port Layer should be considered functionally equivalent to a project like https://github.com/docker/libcontainer inasmuch as it provides low-level platform-specific primitives. It is easy to see how such an abstraction could be container engine agnostic since it provides capabilities at a much lower layer. Our goal however is that it should also be operating system agnostic, which is a more challenging goal at such a low layer.
##### Operating System Independence
VMs are already completely operating system agnostic, since they virtualize at the hardware layer and all control plane operations through the vSphere APIs are therefore also necessarily OS agnostic. Guest differences are encapsulated in different builds of "VMware Tools" which is an optional in-guest agent that mediates between the guest and the hypervisor.
The Port Layer in VIC will function in exactly the same way. Control plane operations will be expressed through an OS agnostic API and distinct differences between operating system implementations will be encapsulated in the _Tether_ process that runs in each containerVM.
##### The Tether Process
A traditional container runtime, such as Linux/LXC, allows the control plane and the containers to share a kernel within a common address space. Each container gets its own private namespace, but the shared kernel allows the control plane to have visibility into the containers and also allows for processes to be started and stopped inside them.
A containerVM by contrast uses completely separate isolated kernels for the control plane and containers. The control plane can either run in the hypervisor kernel or in a distinct guest OS kernel in a separate VM, possibly even on a separate physical host. This isolation is by design: the job of a containerVM is to run only the container process in its own kernel with as minimal a guest OS stack as is feasible while ensuring the same strong degree of isolation as any other VM. Even the hypervisor doesn't have visibility inside the guest without an in-guest agent installed.
As such, in order for the container control plane to provide a shell into a container, to start and stop processes or to provide monitoring statistics, there must be some kind of guest agent in the containerVM. We call this guest agent a _Tether_ process. This is not the same agent as VMware Tools, but a minimal agent designed specifically for VIC.
The Tether API and Tether codebase are where all OS differences will be encapsulated. As such, the Tether API should be considered private to the Port Layer - it exists exclusively for the benefit of the internal control plane operations, not to be invoked directly by anything that implements the Port Layer.
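
One way to picture this split is an unexported interface with one implementation per guest OS, hidden behind an exported, OS-agnostic function. Everything in this sketch is hypothetical: the `tether` interface, its single method and the shell paths are illustrative and do not reflect the real Tether API.

```go
package main

import "fmt"

// tether is unexported on purpose: callers of the Port Layer never see it.
type tether interface {
	// shellCommand returns the argv used to open a shell in the guest.
	shellCommand() []string
}

type linuxTether struct{}

func (linuxTether) shellCommand() []string { return []string{"/bin/sh"} }

type windowsTether struct{}

func (windowsTether) shellCommand() []string { return []string{"cmd.exe"} }

// tetherFor selects the OS-specific implementation for a guest OS family.
func tetherFor(guestOS string) (tether, error) {
	switch guestOS {
	case "linux":
		return linuxTether{}, nil
	case "windows":
		return windowsTether{}, nil
	default:
		return nil, fmt.Errorf("no tether for guest OS %q", guestOS)
	}
}

// ShellCommand is the OS-agnostic entry point; the OS differences stay
// inside the tether implementations.
func ShellCommand(guestOS string) ([]string, error) {
	t, err := tetherFor(guestOS)
	if err != nil {
		return nil, err
	}
	return t.shellCommand(), nil
}

func main() {
	for _, guest := range []string{"linux", "windows", "plan9"} {
		cmd, err := ShellCommand(guest)
		fmt.Println(guest, cmd, err)
	}
}
```

Adding support for another guest OS in this sketch means adding one more `tether` implementation; the exported function and its callers do not change.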
##### Interoperability
So what kind of container primitives should the Port Layer provide and how are those intended to interoperate with established container standards?
It stands to reason that Networking, Storage and Execution are obvious areas for low-layer primitives. These primitives already exist in the vSphere APIs and the VIC Port Layer is designed to provide a framework which builds on those APIs by providing both plumbing code and opinionated mappings between container concepts and vSphere concepts.
For example, what is a container storage Volume and how should one be configured? The Port Layer API should be responsible for deciding what vSphere construct most appropriately represents a Volume and also for ensuring that it is configured appropriately. It should do this based on the parameters passed in, the vSphere features currently installed in the system and the resources that the tenant has the authorization to access. It can pass back a handle to that Volume that can then be used in the creation of a container. By doing this, the Port Layer makes a choice about the most appropriate underlying representation and also makes sure it is appropriately configured and indexed.
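
The Volume example above can be sketched as a function that takes the request parameters plus a description of the environment and returns a handle. All names here (`VolumeRequest`, `Environment`, `VolumeHandle`, `CreateVolume`) and the selection rules are assumptions for illustration only; the actual Port Layer API may look quite different.

```go
package main

import (
	"errors"
	"fmt"
)

// VolumeRequest carries the parameters passed in by the caller.
type VolumeRequest struct {
	Name   string
	SizeMB int
	Shared bool // whether several containers need to mount the volume
}

// Environment describes installed features and what the tenant may access.
type Environment struct {
	NFSAvailable         bool
	AuthorizedDatastores []string
}

// VolumeHandle is what the caller later passes to container creation.
type VolumeHandle struct {
	ID      string
	Backing string // the underlying construct that was chosen
}

// CreateVolume picks a backing for the request and returns a handle to it.
func CreateVolume(req VolumeRequest, env Environment) (VolumeHandle, error) {
	if req.Name == "" || req.SizeMB <= 0 {
		return VolumeHandle{}, errors.New("volume needs a name and a positive size")
	}
	if req.Shared {
		if !env.NFSAvailable {
			return VolumeHandle{}, errors.New("shared volume requested but no shared backing is available")
		}
		return VolumeHandle{ID: "nfs/" + req.Name, Backing: "nfs"}, nil
	}
	if len(env.AuthorizedDatastores) == 0 {
		return VolumeHandle{}, errors.New("tenant is not authorized for any datastore")
	}
	// Hypothetical policy: use the first datastore the tenant may access.
	ds := env.AuthorizedDatastores[0]
	return VolumeHandle{ID: ds + "/" + req.Name, Backing: "vmdk"}, nil
}

func main() {
	env := Environment{AuthorizedDatastores: []string{"datastore1"}}
	h, err := CreateVolume(VolumeRequest{Name: "data", SizeMB: 1024}, env)
	fmt.Println(h, err)
}
```

The caller never names a vSphere construct. It describes what it needs and receives an opaque handle, which is the "opinionated mapping" the paragraph describes.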
To some extent there is an inevitable overlap with the goals of other projects in this sphere, such as https://github.com/opencontainers/runc. While this is hardly surprising given that the Port Layer is attempting to make opinionated choices in exactly the same problem domain, it would be wrong to infer that this makes it an intentional fragmentation or competing API. It is our explicit intention that the two should be entirely complementary and that the Port Layer should be the lowest level of abstraction that a VIC implementation of runc would end up calling. If the abstractions are correct, it should be just as possible to build an implementation of https://github.com/coreos/rkt/blob/master/Documentation/app-container.md using the same APIs.