[... previous content remains the same ...]
- Kubernetes cluster
- kubectl configured to communicate with your cluster
- Helm (optional, for managing Kubernetes applications)
- Create a Kubernetes secret for your environment variables:
  kubectl create secret generic llm-microservice-secrets --from-env-file=.env
- Apply the Kubernetes manifests:
  kubectl apply -f k8s/
  This assumes you have Kubernetes manifest files (Deployment, Service, etc.) in a k8s/ directory.
- Verify the deployment:
  kubectl get pods
  kubectl get services
- (Optional) If using Helm:
  helm install llm-microservice ./helm-charts/llm-microservice
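The steps above assume that manifests already exist in k8s/. As a rough sketch of what that directory might contain, the snippet below writes a minimal Deployment and Service. The image name, container port, replica count, CPU request, and file name are placeholder assumptions, not values taken from this project; only the names llm-microservice and llm-microservice-secrets come from the commands in this guide.

```shell
# Sketch only: writes an illustrative manifest into k8s/.
mkdir -p k8s
cat > k8s/llm-microservice.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-microservice
spec:
  replicas: 2                      # assumed starting replica count
  selector:
    matchLabels:
      app: llm-microservice
  template:
    metadata:
      labels:
        app: llm-microservice
    spec:
      containers:
        - name: llm-microservice
          image: example.com/llm-microservice:latest   # hypothetical image
          ports:
            - containerPort: 8000                      # hypothetical port
          envFrom:
            - secretRef:
                name: llm-microservice-secrets         # secret created earlier
          resources:
            requests:
              cpu: 500m            # assumed value; CPU-based autoscaling needs a request
---
apiVersion: v1
kind: Service
metadata:
  name: llm-microservice
spec:
  selector:
    app: llm-microservice
  ports:
    - port: 80
      targetPort: 8000             # must match the containerPort above
EOF

# Check the file client-side before applying it (needs kubectl installed):
#   kubectl apply --dry-run=client -f k8s/
```

Loading the secret with envFrom exposes every key from the .env file as an environment variable in the container, so the manifest does not need to list them one by one.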
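To automatically scale your deployment based on CPU usage (this requires the Metrics Server in the cluster and CPU requests set on the Deployment's containers):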
- Create an HPA:
  kubectl autoscale deployment llm-microservice --cpu-percent=70 --min=2 --max=10
- Verify the HPA:
  kubectl get hpa
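The kubectl autoscale command is convenient, but the same policy can also be kept in version control as a manifest. The snippet below is a minimal sketch of an equivalent HorizontalPodAutoscaler, assuming the cluster serves the autoscaling/v2 API (Kubernetes 1.23 or later). The file name is hypothetical.

```shell
# Sketch only: declarative equivalent of
#   kubectl autoscale deployment llm-microservice --cpu-percent=70 --min=2 --max=10
mkdir -p k8s
cat > k8s/hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: llm-microservice
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: llm-microservice
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the containers' CPU request
EOF

# Apply it with (needs kubectl and cluster access):
#   kubectl apply -f k8s/hpa.yaml
```

Use either this manifest or the kubectl autoscale command, not both, since each creates an HPA named llm-microservice.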
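- Deploy Prometheus for monitoring (the old stable chart repository is deprecated, so add the prometheus-community repository first):
  helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
  helm repo update
  helm install prometheus prometheus-community/prometheus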
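- Deploy Grafana for visualization (the old stable chart repository is deprecated, so add the grafana repository first):
  helm repo add grafana https://grafana.github.io/helm-charts
  helm repo update
  helm install grafana grafana/grafana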
- Import the provided Grafana dashboard for LLM-specific metrics.
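Before a dashboard can be imported, you need to log in to Grafana. The helper functions below are a sketch that assumes the release is named grafana as in the install command above, that it lives in the current namespace, and that the chart defaults (a Secret named grafana with an admin-password key, and a Service on port 80) have not been overridden. The function names are hypothetical.

```shell
# Hypothetical helper: print the generated Grafana admin password.
# Assumes the chart's default Secret name and key for a release called "grafana".
grafana_admin_password() {
  kubectl get secret grafana -o jsonpath='{.data.admin-password}' | base64 --decode
}

# Hypothetical helper: expose Grafana on http://localhost:3000 until interrupted.
# Assumes the chart's default Service port of 80.
grafana_port_forward() {
  kubectl port-forward svc/grafana 3000:80
}
```

Run grafana_admin_password to get the password, then grafana_port_forward, and sign in as admin at http://localhost:3000. The dashboard JSON can then be imported through the Grafana UI.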
- Check pod logs:
  kubectl logs <pod-name>
- Exec into a pod (use /bin/sh if the image does not include bash):
  kubectl exec -it <pod-name> -- /bin/bash
- Check events:
  kubectl get events --sort-by=.metadata.creationTimestamp
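When a pod is crash-looping, the commands above are usually run together. As a small sketch, they can be wrapped in a shell function that collects a first pass of diagnostics for one pod. The function name triage_pod and the 100-line log limit are assumptions made for illustration.

```shell
# Hypothetical helper: gather first-pass diagnostics for a single pod.
triage_pod() {
  pod="$1"
  # Scheduling, image pull, and probe failures appear in the Events section.
  kubectl describe pod "$pod"
  # Last 100 log lines from the current container instance.
  kubectl logs "$pod" --tail=100
  # Logs from the previous instance, useful after a restart; these may not exist.
  kubectl logs "$pod" --previous --tail=100 || echo "no previous container logs for $pod"
}
```

Call it as triage_pod <pod-name>, using a name from kubectl get pods.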
Remember to always test your deployment in a staging environment before applying changes to production.