Release v0.1.8 · jjleng/paka

What's Changed

feat: support inference with spot instances by @jjleng in #86
feat: support mixed model groups by @jjleng in #87
feat: flag to make model groups not exposed to the public by @jjleng in #88
feat: heuristics for calculating the memory, cpu and gpu resource requests by @jjleng in #89
fix: typing issues in e2e test code by @jjleng in #90
feat: command to list function revisions by @jjleng in #91
feat: command to split traffic among revisions by @jjleng in #92
refactor: make typer command args more consistent by @jjleng in #93
refactor: save kubeconfig as soon as cluster is provisioned by @jjleng in #94
feat: options to pass resource requests and limits when creating functions by @jjleng in #95
feat: utilize multiple GPUs for vllm inference by @jjleng in #96
feat: utility to parse gguf metadata by @jjleng in #97
fix: permissions to allow cluster autoscaler to scale from 0 by @jjleng in #98
chore: bump up version by @jjleng in #99
chore: add homepage info to pyproject by @jjleng in #100

Full Changelog: v0.1.7...v0.1.8