Create a provider to use Azure Batch (#133)

* Started work on provider

* WIP Adding batch provider

* Working basic call into pool client. Need to parameterize the baseurl

* Fixed job creation by manipulating the content-type

* WIP Kicking off containers. Dirty

* [wip] More meat around scheduling simple containers.

* Working on basic task wrapper to co-schedule pods

* WIP on task wrapper

* WIP

* Working pod minimal wrapper for batch

* Integrate pod template code into provider

* Cleaning up

* Move to docker without gpu

* WIP batch integration

* partially working

* Working logs

* Tidy code

* WIP: Testing and readme

* Added readme and terraform deployment for GPU Azure Batch pool.

* Update to enable low priority nodes for gpu

* Fix log formatting bug. Return node logs when container not yet started

* Moved to golang v1.10

* Fix cri test

* Fix up minor docs issue. Add provider to readme. Add var for vk image.
Lawrence Gripper
2018-06-23 00:33:49 +01:00
committed by Robbie Zhang
parent 1ad6fb434e
commit d6e8b3daf7
75 changed files with 20040 additions and 6 deletions

@@ -0,0 +1,19 @@
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vector-add
  labels:
    app: examplegpupod
spec:
  restartPolicy: OnFailure
  containers:
    - name: cuda-vector-add
      # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
      image: "k8s.gcr.io/cuda-vector-add:v0.1"
      resources:
        limits:
          nvidia.com/gpu: 1 # requesting 1 GPU
  nodeName: virtual-kubelet
  tolerations:
    - key: azure.com/batch
      effect: NoSchedule