Releases: jjleng/paka
v0.1.11
What's Changed
- feat: change the default pack builder to builder-jammy-base by @jjleng in #127
- chore: version bump up by @jjleng in #128
Full Changelog: v0.1.10...v0.1.11
v0.1.10
v0.1.9
What's Changed
- feat: options to set env, volume mounts and probes for model group containers by @jjleng in #101
- feat: options to pass in env vars for functions by @jjleng in #102
- feat: be able to display public and private endpoints for MG by @jjleng in #103
- chore: upgrade prometheus chart by @jjleng in #104
- feat: default to openai compatible server for vllm by @jjleng in #105
- feat: run vLLM with a served model name by @jjleng in #106
- fix: make prometheus scrape the inference engine's metrics correctly by @jjleng in #107
- feat: set knative resource requests by @jjleng in #108
- feat: dedicated node groups for functions and job workers by @jjleng in #109
- feat: be able to pass resource requests when creating jobs by @jjleng in #110
- feat: detect the right ami types for different node types by @jjleng in #111
- feat: command to push a prebuilt image to the container registry by @jjleng in #112
- refactor: make the config fields naming consistent by @jjleng in #113
- feat: enable host ipc for multiple GPU inference by @jjleng in #114
- docs: rewrite README by @jjleng in #115
- docs: add sample model launching templates by @jjleng in #116
- docs: updates to the invoice_extraction example by @jjleng in #117
- docs: update the website_rag example by @jjleng in #118
- feat: gateway vpc endpoint for s3 by @jjleng in #119
- feat: default to having at least one function instance running by @jjleng in #120
- docs: add faq by @jjleng in #121
- docs: detailed descriptions of the cluster config yaml by @jjleng in #122
- docs: quick start doc by @jjleng in #123
- chore: version bump by @jjleng in #124
Full Changelog: v0.1.8...v0.1.9
v0.1.8
What's Changed
- feat: support inference with spot instances by @jjleng in #86
- feat: support mixed model groups by @jjleng in #87
- feat: flag to make model groups not exposed to the public by @jjleng in #88
- feat: heuristics for calculating the memory, cpu and gpu resource requests by @jjleng in #89
- fix: typing issues in e2e test code by @jjleng in #90
- feat: command to list function revisions by @jjleng in #91
- feat: command to split traffic among revisions by @jjleng in #92
- refactor: make typer command args more consistent by @jjleng in #93
- refactor: save kubeconfig as soon as cluster is provisioned by @jjleng in #94
- feat: options to pass resource requests and limits when creating functions by @jjleng in #95
- feat: utilize multiple GPUs for vllm inference by @jjleng in #96
- feat: utility to parse gguf metadata by @jjleng in #97
- fix: permissions to allow cluster autoscaler scale from 0 by @jjleng in #98
- chore: bump up version by @jjleng in #99
- chore: add homepage info to pyproject by @jjleng in #100
Full Changelog: v0.1.7...v0.1.8
v0.1.7
What's Changed
- refactor: windows compatibility by @jjleng in #68
- feat: install kubectl as a dependency for pulumi by @jjleng in #69
- docs: aws CLI is required by @jjleng in #70
- tests: parse the cluster config yamls of examples by @jjleng in #71
- refactor: harden the config code and files by @jjleng in #72
- feat: clean up stale model group resources by @jjleng in #73
- refactor: change opt-in to opt-out for kubeconfig update by @jjleng in #74
- feat: command to list clusters being managed by @jjleng in #75
- feat: command to switch current cluster by @jjleng in #76
- feat: remove local cluster state when a cluster is taken down by @jjleng in #77
- refactor: get rid of local cluster state by @jjleng in #78
- fix: make removing crd finalizers more stable by @jjleng in #79
- feat: turn on flash attention for llama.cpp by default by @jjleng in #80
- fix: fix misc issues by @jjleng in #81
- fix: reattempt to delete CRD finalizers in a separate thread by @jjleng in #82
- feat: support vLLM as inference runtime by @jjleng in #83
- feat: check for config version forward compatibility by @jjleng in #84
- chore: bump up version by @jjleng in #85
Full Changelog: v0.1.6...v0.1.7
v0.1.6
What's Changed
- refactor: restructured code base by @jjleng in #51
- fix: print messages after closing progress bar by @jjleng in #52
- refactor: get rid of hard coded model mount path in containers by @jjleng in #53
- feat: remove k8s resources related to a stale model group by @jjleng in #54
- fix: make llama.cpp save hf files to addressable local paths by @jjleng in #55
- fix: the hf model file path problem of the llama.cpp runtime by @jjleng in #56
- tests: e2e tests with KinD by @jjleng in #57
- ci: add e2e to ci by @jjleng in #58
- refactor: make pack be installed to a non-global location by @jjleng in #59
- docs: get rid of the pack installation info from readme files by @jjleng in #60
- feat: avoid mutating the model stored in model store by @jjleng in #61
- refactor: remove aws CLI as a dependency when pushing images to ECR by @jjleng in #62
- refactor: make the code removing k8s finalizers more robust when taki… by @jjleng in #63
- refactor: make docker login more secure by @jjleng in #64
- Isolate paka's pulumi from system pulumi by @jjleng in #65
- docs: doc updates by @jjleng in #66
- chore: version bump by @jjleng in #67
Full Changelog: v0.1.5...v0.1.6
v0.1.5
What's Changed
- feat: add a new model registry by @jjleng in #42
- test: fix test_registry by @jjleng in #44
- Add new workflow. by @erika-tsay in #45
- refactor: remove runtime and devices fields from model class by @jjleng in #46
- feat: llama.cpp runtime that supports the new model abstraction by @jjleng in #47
- refactor: make default llama.cpp params more conservative by @jjleng in #48
- fix: correct the wrong dependency in the invoice example by @jjleng in #49
- chore: bump version by @jjleng in #50
New Contributors
- @erika-tsay made their first contribution in #45
Full Changelog: v0.1.4...v0.1.5
v0.1.4
What's Changed
- chore: fix mypy errors by @jjleng in #37
- test: replace patch with patch.object for easier error discovery by @jjleng in #38
- chore: pin pulumi to a specific version that honors depends_on by @jjleng in #39
- feat: command to sync cluster state by @jjleng in #40
- chore: bump version by @jjleng in #41
Full Changelog: v0.1.3...v0.1.4
v0.1.3
What's Changed
- docs: instructions for installing the pack CLI by @jjleng in #22
- docs: add pulumi CLI as a dependency by @jjleng in #23
- Add HuggingFaceModel, HttpSouceModel by @SoftewareArtist in #24
- fix: fix failing hf model tests by @jjleng in #32
- Refactor the model and model store abstraction by @jjleng in #33
- chore: make paka compatible with py3.8 and above by @jjleng in #34
- chore: remove CHANGELOG.md by @jjleng in #35
- chore: bump version by @jjleng in #36
Full Changelog: v0.1.2...v0.1.3
v0.1.2
What's Changed
- docs: example for invoice extraction by @jjleng in #13
- Add GPU (CUDA) support by @jjleng in #15
- docs: update README with the GPU support message by @jjleng in #16
- docs + GPU inference for the invoice extraction example by @jjleng in #17
- feat: remove finalizers before tearing down a cluster by @jjleng in #18
- chore: bump version to 0.1.2 by @jjleng in #19
Full Changelog: v0.1.1...v0.1.2