[castai-hosted-model] add node placement job
Zilv1nas committed Mar 3, 2025
1 parent f4112bb commit b724769
Showing 4 changed files with 83 additions and 3 deletions.
2 changes: 1 addition & 1 deletion charts/castai-hosted-model/Chart.yaml
@@ -2,7 +2,7 @@ apiVersion: v2
name: castai-hosted-model
description: CAST AI hosted model deployment chart.
type: application
-version: 0.0.7
+version: 0.0.8
appVersion: "v0.0.1"
dependencies:
- name: ollama
10 changes: 8 additions & 2 deletions charts/castai-hosted-model/README.md
@@ -13,5 +13,11 @@ CAST AI hosted model deployment chart.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
-| ollama.enabled | bool | `true` | |
-| vllm.enabled | bool | `false` | |
+| ollama.enabled | bool | `true` | Specifies if Ollama model should be deployed |
+| placementJob.enabled | bool | `false` | Specifies if a node placement job should be deployed |
+| placementJob.image.pullPolicy | string | `"IfNotPresent"` | Image pull policy |
+| placementJob.image.repository | string | `"busybox"` | The image to use for the job |
+| placementJob.image.tag | string | `"1.37.0"` | The image tag |
+| placementJob.requiredGPUTotalMemoryMiB | string | `nil` | Total GPU memory MiB (GPU count * GPU memory MiB) of the node that should be provisioned for this job |
+| placementJob.resources | object | `{}` | Resources for the job |
+| vllm.enabled | bool | `false` | Specifies if vLLM model should be deployed |
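
For orientation, a values override that enables the job could look like the sketch below. It assumes a hypothetical node template named `gpu-models` and a node with 2 GPUs of 16384 MiB each; neither figure comes from this commit. Note that `placementJob.nodeTemplateName` is read by `templates/placement-job.yaml` through `required`, even though it is not listed in the table above.

```yaml
# my-values.yaml (hypothetical file name)
placementJob:
  enabled: true
  # Required by templates/placement-job.yaml; "gpu-models" is an illustrative name.
  nodeTemplateName: gpu-models
  # Illustrative arithmetic: 2 GPUs x 16384 MiB = 32768 MiB.
  # Quoted so the value is passed through as a string, matching the documented type.
  requiredGPUTotalMemoryMiB: "32768"
```

Because the template matches with the `Gt` operator, a node whose label value equals this number exactly would not qualify; only strictly larger values match.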
55 changes: 55 additions & 0 deletions charts/castai-hosted-model/templates/placement-job.yaml
@@ -0,0 +1,55 @@
{{- if .Values.placementJob.enabled }}
{{- $nodeTemplateName := required "placementJob.nodeTemplateName is required" .Values.placementJob.nodeTemplateName }}
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-placement-job
  labels:
    app.kubernetes.io/name: {{ .Chart.Name }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/component: placement-job
spec:
  backoffLimit: 0
  template:
    metadata:
      labels:
        app.kubernetes.io/name: {{ .Chart.Name }}
        app.kubernetes.io/instance: {{ .Release.Name }}
        app.kubernetes.io/component: placement-job
    spec:
      restartPolicy: Never
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: nvidia.com/gpu.total-memory
                    operator: Gt
                    values:
                      - "{{ required "placementJob.requiredGPUTotalMemoryMiB is required" .Values.placementJob.requiredGPUTotalMemoryMiB }}"
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: scheduling.cast.ai/spot
                    operator: Exists
      containers:
        - name: placement-job
          image: "{{ .Values.placementJob.image.repository }}:{{ .Values.placementJob.image.tag }}"
          imagePullPolicy: {{ .Values.placementJob.image.pullPolicy | quote }}
          command: ["/bin/sh", "-c", "echo Node placement job finished."]
          resources:
            {{- toYaml .Values.placementJob.resources | nindent 12 }}
      nodeSelector:
        scheduling.cast.ai/node-template: "{{ $nodeTemplateName }}"
      tolerations:
        - key: scheduling.cast.ai/node-template
          value: "{{ $nodeTemplateName }}"
          operator: Equal
          effect: NoSchedule
        - key: scheduling.cast.ai/spot
          operator: Exists
        - key: nvidia.com/gpu
          effect: NoSchedule
          operator: Exists
{{- end }}
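
To make the scheduling constraints concrete, here is a sketch of how the affinity and node selector portions of the pod spec might render. It assumes `placementJob.nodeTemplateName=gpu-models` and `placementJob.requiredGPUTotalMemoryMiB="32768"`, both illustrative values that do not appear in this commit.

```yaml
# Rendered fragment (sketch), shown relative to the pod spec.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: nvidia.com/gpu.total-memory
              operator: Gt   # strict comparison: the label's integer value must exceed 32768
              values:
                - "32768"
nodeSelector:
  scheduling.cast.ai/node-template: "gpu-models"
```

Kubernetes parses the single `Gt` value as an integer, so the configured value needs to be a plain whole number of MiB.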
19 changes: 19 additions & 0 deletions charts/castai-hosted-model/values.yaml
@@ -1,4 +1,23 @@
ollama:
  # -- Specifies if Ollama model should be deployed
  enabled: true
vllm:
  # -- Specifies if vLLM model should be deployed
  enabled: false
placementJob:
  # -- Specifies if a node placement job should be deployed
  enabled: false

  # -- Total GPU memory MiB (GPU count * GPU memory MiB) of the node that should be provisioned for this job
  requiredGPUTotalMemoryMiB:

  image:
    # -- The image to use for the job
    repository: busybox
    # -- The image tag
    tag: "1.37.0"
    # -- Image pull policy
    pullPolicy: IfNotPresent

  # -- Resources for the job
  resources: {}

Check failure on line 23 in charts/castai-hosted-model/values.yaml

GitHub Actions / lint-test

23:16 [new-line-at-end-of-file] no new line character at the end of file
