Releases: llmariner/job-manager

v1.8.0

25 Feb 22:47
a51ebff

What's Changed

Features

  • feat(proto): add namespaced_name field to GpuPod by @Ladicle in #384
  • feat(dispatcher): ingest a namespaced name to GpuPod by @Ladicle in #385
  • feat(server): cache cluster state and reserved scheduled resources by @Ladicle in #386
  • feat: implement a scheduling scoring algorithm by @kkaneda in #388
  • feat(syncer): update local job status when failed to apply a job by @Ladicle in #392
  • feat: validate the cluster registration key by @kkaneda in #393

Bug Fixes

  • fix(syncer): explicitly specify the deletion propagation policy by @Ladicle in #389
  • fix: bump rbac-manager by @kkaneda in #394
  • fix: tweak the requirements.txt for fine-tuning docker image by @kkaneda in #395

Full Changelog: v1.7.0...v1.8.0

v1.7.0

10 Feb 17:50
afdb858

What's Changed

Features

Bug Fixes

Full Changelog: v1.6.0...v1.7.0

v1.6.0

07 Feb 07:48
323323a

What's Changed

Features

  • feat: allow non-gpu workloads to schedule to a no-gpu cluster by @kkaneda in #354
  • feat(syncer): add a syncer to reflect job status in local resources by @Ladicle in #360
  • feat: expose ListClusters and add more fields by @kkaneda in #364
  • feat: populate the cluster name in ListClusters response by @kkaneda in #370
  • feat: populate the gpu capacity in ListClusters response by @kkaneda in #371
  • feat: track pods that use GPUs in cluster status by @kkaneda in #372
  • feat: populate gpu_allocated and gpu_pod_count by @kkaneda in #373
  • feat(server/syncer): add ListClusterIDs service by @Ladicle in #375
  • feat(server/syncer): set up auth interceptor by @Ladicle in #374
  • feat(syncer): support authentication between syncer and control-plane by @Ladicle in #376
  • feat: bump rbac-manager dep by @kkaneda in #377

Bug Fixes

  • fix: register the HTTP handler for /v1/jobs by @kkaneda in #368

Full Changelog: v1.5.0...v1.6.0

v1.5.0

24 Jan 20:55
396253f

What's Changed

Features

  • feat: persist schedulable envs to database by @kkaneda in #311
  • feat(api): add an API for sending the cluster status by @kkaneda in #313
  • feat: implement UpdateClusterStatus by @kkaneda in #314
  • feat(api): add the "gpu_nodes" field in ClusterStatus by @kkaneda in #317
  • feat(api): add JobService and ListClusters by @kkaneda in #318
  • feat(dispatcher): send cluster info to server by @kkaneda in #319
  • feat(server): implement the scheduler by @kkaneda in #320
  • feat(server): Ignore stale clusters from scheduling candidates by @kkaneda in #321
  • feat(engine): ignore cordoned GPU nodes from cluster status by @kkaneda in #322
  • feat(dispatcher): be able to configure cluster status update interval by @kkaneda in #323
  • feat(dispatcher): pull queued workloads only from an assigned cluster by @kkaneda in #325
  • feat(server): add index to the cluster_id column of clusters by @kkaneda in #327
  • feat(api): add SyncerService for put and delete k8s objects by @Ladicle in #332
  • feat(server): add syncer service server by @Ladicle in #333
  • feat(chart): add logLevel fields by @Ladicle in #335
  • feat(server): implement syncer service APIs by @Ladicle in #336
  • feat(syncer): add empty syncer component by @Ladicle in #339
  • feat(chart): add syncer chart by @Ladicle in #340
  • feat(notebooks): set env var for an org ID and a project ID by @kkaneda in #346
  • feat(syncer): add job controller by @Ladicle in #347
  • feat(proto): add state and action for rescheduling jobs by @guangrui-cloudnatix in #348
  • feat(server): add fields in notebook table by @guangrui-cloudnatix in #349
  • feat(dispatcher): set env vars for org and project when creating a no… by @kkaneda in #350
  • feat: add org/project title and cluster name to proto by @kkaneda in #351

Bug Fixes

  • fix(server): add ingress path for /llmariner.jobs.server.v1.JobWorkerService by @kkaneda in #324
  • fix(server): sort the cluster list in ListClusters by @kkaneda in #326
  • fix(chart): fix line handling for server ingress resources by @Ladicle in #334

Other Changes

  • Revert "feat: persist schedulable envs to database (#311)" by @kkaneda in #312

Full Changelog: v1.4.1...v1.5.0

v1.4.1

07 Dec 14:59
d1df66c

What's Changed

Bug Fixes

  • fix(dispatcher): set distinct names to the controllers by @kkaneda in #306
  • fix: bump cluster-manager dep to fix health reporting by @kkaneda in #307
  • fix: use a beacon health check by @kkaneda in #308

Full Changelog: v1.4.0...v1.4.1

v1.4.0

06 Dec 21:26
e71f1d8

What's Changed

Features

Full Changelog: v1.3.0...v1.4.0

v1.3.0

25 Nov 05:58
a22ff3d

What's Changed

Features

  • feat(fine-tuning): bump the transformer to the latest version by @kkaneda in #297
  • feat(fine-tuning): install autoawq by @kkaneda in #298
  • feat(fine-tuning): make BitsAndBytesQuantization quantization optional by @kkaneda in #300

Bug Fixes

  • fix(dispatcher): do not pull models that have the same prefix by @kkaneda in #299
  • fix(chart): unset the default value of enable by @Ladicle in #301

Full Changelog: v1.2.0...v1.3.0

v1.2.0

17 Nov 23:55
bc5264f

What's Changed

Features

  • feat(chart): add an enable option for the dependency condition by @Ladicle in #292
  • feat(server): support an enable flag for the usage sender by @Ladicle in #293
  • feat(mod): upgrade llmariner modules by @Ladicle in #294

Bug Fixes

  • fix(chart): do not set the default value of s3.endpointUrl by @kkaneda in #290

Full Changelog: v1.1.0...v1.2.0

v1.1.0

01 Nov 12:37
e9e05d1

What's Changed

Features

Other Changes

  • chore(Makefile): fix go-fmt target broken by invalid indents by @Ladicle in #282
  • docs: Update Copyright in LICENSE by @kkaneda in #284
  • fix(dispatcher): add config validation for KubernetesManager by @kkaneda in #283

Full Changelog: v1.0.0...v1.1.0

v1.0.0

15 Oct 08:11
35bddfa

What's Changed

  • Configure release workflows by @Ladicle in #280
  • Release v1.0.0 by @github-actions in #281

New Contributors

  • @github-actions made their first contribution in #281

Full Changelog: v0.222.0...v1.0.0