Releases: jjleng/paka

v0.1.11

12 Jul 21:29

What's Changed

  • feat: change the default pack builder to builder-jammy-base by @jjleng in #127
  • chore: version bump up by @jjleng in #128

Full Changelog: v0.1.10...v0.1.11

v0.1.10

18 Jun 06:21

What's Changed

  • refactor: disable image pull always for function containers by @jjleng in #125
  • chore: bump up version by @jjleng in #126

Full Changelog: v0.1.9...v0.1.10

v0.1.9

01 Jun 18:06

What's Changed

  • feat: options to set env, volume mounts and probes for model group containers by @jjleng in #101
  • feat: options to pass in env vars for functions by @jjleng in #102
  • feat: be able to display public and private endpoints for MG by @jjleng in #103
  • chore: upgrade prometheus chart by @jjleng in #104
  • feat: default to openai compatible server for vllm by @jjleng in #105
  • feat: run vLLM with a served model name by @jjleng in #106
  • fix: make prometheus scrape the inference engine's metrics correctly by @jjleng in #107
  • feat: set knative resource requests by @jjleng in #108
  • feat: dedicated node groups for functions and job workers by @jjleng in #109
  • feat: be able to pass resource requests when creating jobs by @jjleng in #110
  • feat: detect the right ami types for different node types by @jjleng in #111
  • feat: command to push a prebuilt image to the container registry by @jjleng in #112
  • refactor: make the config fields naming consistent by @jjleng in #113
  • feat: enable host ipc for multiple GPU inference by @jjleng in #114
  • docs: rewrite README by @jjleng in #115
  • docs: add sample model launching templates by @jjleng in #116
  • docs: updates to the invoice_extraction example by @jjleng in #117
  • docs: update the website_rag example by @jjleng in #118
  • feat: gateway vpc endpoint for s3 by @jjleng in #119
  • feat: default to having at least one function instance running by @jjleng in #120
  • docs: add faq by @jjleng in #121
  • docs: detailed descriptions of the cluster config yaml by @jjleng in #122
  • docs: quick start doc by @jjleng in #123
  • chore: version bump by @jjleng in #124

Full Changelog: v0.1.8...v0.1.9

v0.1.8

19 May 08:13

What's Changed

  • feat: support inference with spot instances by @jjleng in #86
  • feat: support mixed model groups by @jjleng in #87
  • feat: flag to make model groups not exposed to the public by @jjleng in #88
  • feat: heuristics for calculating the memory, cpu and gpu resource requests by @jjleng in #89
  • fix: typing issues in e2e test code by @jjleng in #90
  • feat: command to list function revisions by @jjleng in #91
  • feat: command to split traffic among revisions by @jjleng in #92
  • refactor: make typer command args more consistent by @jjleng in #93
  • refactor: save kubeconfig as soon as cluster is provisioned by @jjleng in #94
  • feat: options to pass resource requests and limits when creating functions by @jjleng in #95
  • feat: utilize multiple GPUs for vllm inference by @jjleng in #96
  • feat: utility to parse gguf metadata by @jjleng in #97
  • fix: permissions to allow cluster autoscaler scale from 0 by @jjleng in #98
  • chore: bump up version by @jjleng in #99
  • chore: add homepage info to pyproject by @jjleng in #100

Full Changelog: v0.1.7...v0.1.8

v0.1.7

08 May 07:00

What's Changed

  • refactor: windows compatibility by @jjleng in #68
  • feat: install kubectl as a dependency for pulumi by @jjleng in #69
  • docs: aws CLI is required by @jjleng in #70
  • tests: parse the cluster config yamls of examples by @jjleng in #71
  • refactor: harden the config code and files by @jjleng in #72
  • feat: clean up stale model group resources by @jjleng in #73
  • refactor: change opt-in to opt-out for kubeconfig update by @jjleng in #74
  • feat: command to list clusters being managed by @jjleng in #75
  • feat: command to switch current cluster by @jjleng in #76
  • feat: remove local cluster state when a cluster is taken down by @jjleng in #77
  • refactor: get rid of local cluster state by @jjleng in #78
  • fix: make removing crd finalizers more stable by @jjleng in #79
  • feat: turn on flash attention for llama.cpp by default by @jjleng in #80
  • fix: fix misc issues by @jjleng in #81
  • fix: reattempt to delete CRD finalizers in a separate thread by @jjleng in #82
  • feat: support vLLM as inference runtime by @jjleng in #83
  • feat: check for config version forward compatibility by @jjleng in #84
  • chore: bump up version by @jjleng in #85

Full Changelog: v0.1.6...v0.1.7

v0.1.6

29 Apr 21:49

What's Changed

  • refactor: restructured code base by @jjleng in #51
  • fix: print messages after closing progress bar by @jjleng in #52
  • refactor: get rid of hard coded model mount path in containers by @jjleng in #53
  • feat: remove k8s resources related to a stale model group by @jjleng in #54
  • fix: make llama.cpp save hf files to addressable local paths by @jjleng in #55
  • fix: the hf model file path problem of the llama.cpp runtime by @jjleng in #56
  • tests: e2e tests with KinD by @jjleng in #57
  • ci: add e2e to ci by @jjleng in #58
  • refactor: make pack be installed to a non-global location by @jjleng in #59
  • docs: get rid of the pack installation info from readme files by @jjleng in #60
  • feat: avoid mutating the model stored in model store by @jjleng in #61
  • refactor: remove aws CLI as a dependency when pushing images to ECR by @jjleng in #62
  • refactor: make the code removing k8s finalizers more robust when taki… by @jjleng in #63
  • refactor: make docker login more secure by @jjleng in #64
  • Isolate paka's pulumi from system pulumi by @jjleng in #65
  • docs: doc updates by @jjleng in #66
  • chore: version bump by @jjleng in #67

Full Changelog: v0.1.5...v0.1.6

v0.1.5

25 Apr 22:57

What's Changed

  • feat: add a new model registry by @jjleng in #42
  • test: fix test_registry by @jjleng in #44
  • Add new workflow. by @erika-tsay in #45
  • refactor: remove runtime and devices fields from model class by @jjleng in #46
  • feat: llama.cpp runtime that supports the new model abstraction by @jjleng in #47
  • refactor: make default llama.cpp params more conservative by @jjleng in #48
  • fix: correct the wrong dependency in the invoice example by @jjleng in #49
  • chore: bump version by @jjleng in #50

Full Changelog: v0.1.4...v0.1.5

v0.1.4

19 Apr 04:36

What's Changed

  • chore: fix mypy errors by @jjleng in #37
  • test: replace patch with patch.object for easier error discovery by @jjleng in #38
  • chore: pin pulumi to a specific version that honors depends_on by @jjleng in #39
  • feat: command to sync cluster state by @jjleng in #40
  • chore: bump version by @jjleng in #41

Full Changelog: v0.1.3...v0.1.4

v0.1.3

18 Apr 06:53

What's Changed

  • docs: instructions for installing the pack CLI by @jjleng in #22
  • docs: add pulumi CLI as a dependency by @jjleng in #23
  • Add HuggingFaceModel, HttpSourceModel by @SoftewareArtist in #24
  • fix: fix failing hf model tests by @jjleng in #32
  • Refactor the model and model store abstraction by @jjleng in #33
  • chore: make paka compatible with py3.8 and above by @jjleng in #34
  • chore: remove CHANGELOG.md by @jjleng in #35
  • chore: bump version by @jjleng in #36

Full Changelog: v0.1.2...v0.1.3

v0.1.2

10 Apr 23:11

What's Changed

  • docs: example for invoice extraction by @jjleng in #13
  • Add GPU (CUDA) support by @jjleng in #15
  • docs: update README with the GPU support message by @jjleng in #16
  • docs + GPU inference for the invoice extraction example by @jjleng in #17
  • feat: remove finalizers before tearing down a cluster by @jjleng in #18
  • chore: bump version to 0.1.2 by @jjleng in #19

Full Changelog: v0.1.1...v0.1.2