Create a provider to use Azure Batch (#133)
* Started work on provider * WIP Adding batch provider * Working basic call into pool client. Need to parameterize the baseurl * Fixed job creation by manipulating the content-type * WIP Kicking off containers. Dirty * [wip] More meat around scheduling simple containers. * Working on basic task wrapper to co-schedule pods * WIP on task wrapper * WIP * Working pod minimal wrapper for batch * Integrate pod template code into provider * Cleaning up * Move to docker without gpu * WIP batch integration * partially working * Working logs * Tidy code * WIP: Testing and readme * Added readme and terraform deployment for GPU Azure Batch pool. * Update to enable low priority nodes for gpu * Fix log formatting bug. Return node logs when container not yet started * Moved to golang v1.10 * Fix cri test * Fix up minor docs Issue. Add provider to readme. Add var for vk image.
This commit is contained in:
committed by
Robbie Zhang
parent
1ad6fb434e
commit
d6e8b3daf7
19
providers/azurebatch/deployment/examplegpupod.yaml
Normal file
19
providers/azurebatch/deployment/examplegpupod.yaml
Normal file
@@ -0,0 +1,19 @@
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: cuda-vector-add
|
||||
labels:
|
||||
app: examplegpupod
|
||||
spec:
|
||||
restartPolicy: OnFailure
|
||||
containers:
|
||||
- name: cuda-vector-add
|
||||
# https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
|
||||
image: "k8s.gcr.io/cuda-vector-add:v0.1"
|
||||
resources:
|
||||
limits:
|
||||
nvidia.com/gpu: 1 # requesting 1 GPU
|
||||
nodeName: virtual-kubelet
|
||||
tolerations:
|
||||
- key: azure.com/batch
|
||||
effect: NoSchedule
|
||||
Reference in New Issue
Block a user