kubeai: enableMonitoring toggle for Prometheus monitoring of vLLM engine
AdamNowotny committed Feb 15, 2025
1 parent fc7ee5c commit 55b72bb
Showing 4 changed files with 26 additions and 0 deletions.
1 change: 1 addition & 0 deletions Pulumi.yaml
@@ -105,6 +105,7 @@ config:
kubeai:enabled: false
kubeai:hostname: kubeai
kubeai:preferredNodeLabel: orangelab/kubeai
kubeai:enableMonitoring: false
# Set token to access gated repos
# kubeai:huggingfaceToken: ''
# Comma-separated list of models from https://github.com/substratusai/kubeai/blob/main/charts/models/values.yaml
10 changes: 10 additions & 0 deletions components/ai/AI.md
@@ -243,6 +243,16 @@ pulumi config set --secret kubeai:huggingfaceToken <hf_token>
pulumi up
```

Prometheus monitoring for vLLM can be enabled with:

```sh
pulumi config set kubeai:enableMonitoring true
```

Note: Prometheus has to be installed before enabling this option.

A sample Grafana dashboard to import: https://github.com/substratusai/kubeai/blob/main/examples/observability/vllm-grafana-dashboard.json
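
Once Prometheus is running and the toggle has been applied with `pulumi up`, you can verify that the chart created the vLLM PodMonitor. A minimal check; the namespace and resource name depend on how KubeAI is deployed:

```sh
# Confirm the vLLM PodMonitor exists so the Prometheus operator can scrape it
kubectl get podmonitors --all-namespaces
```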

### Visual Studio Code

Similar to Ollama, you can configure Continue to use KubeAI/vLLM. Depending on the model, max tokens or context length may need to be adjusted.
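
Before pointing Continue at KubeAI, it can help to confirm the OpenAI-compatible endpoint is reachable. A quick sketch, assuming the `kubeai` hostname from `Pulumi.yaml` above and KubeAI's `/openai/v1` API path:

```sh
# List the models KubeAI currently serves via its OpenAI-compatible API
# (adjust the hostname to match your kubeai:hostname / ingress setup)
curl http://kubeai/openai/v1/models
```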
11 changes: 11 additions & 0 deletions components/ai/kubeai.ts
@@ -15,6 +15,7 @@ export class KubeAi extends pulumi.ComponentResource {

const config = new pulumi.Config(name);
const version = config.get('version');
const enableMonitoring = config.getBoolean('enableMonitoring');
const hostname = config.require('hostname');
const huggingfaceToken = config.getSecret('huggingfaceToken');
const models = config.get('models')?.split(',') ?? [];
@@ -43,6 +44,16 @@
],
tls: [{ hosts: [hostname] }],
},
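// When monitoring is enabled, the KubeAI Helm chart creates a PodMonitor so Prometheus can scrape vLLM engine metrics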
metrics: enableMonitoring
? {
prometheusOperator: {
vLLMPodMonitor: {
enabled: true,
labels: {},
},
},
}
: undefined,
modelAutoscaling: { timeWindow: '30m' },
modelServerPods: {
// required for NVidia detection
4 changes: 4 additions & 0 deletions components/monitoring/prometheus.ts
@@ -96,6 +96,10 @@ export class Prometheus extends pulumi.ComponentResource {
hostname: prometheusHostname,
},
prometheusSpec: {
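// Allow Prometheus to pick up PodMonitors, Probes, Rules and ServiceMonitors created outside this Helm release (e.g. KubeAI's vLLM PodMonitor)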
podMonitorSelectorNilUsesHelmValues: false,
probeSelectorNilUsesHelmValues: false,
ruleSelectorNilUsesHelmValues: false,
serviceMonitorSelectorNilUsesHelmValues: false,
storageSpec: {
volumeClaimTemplate: {
spec: {
