
llamafile v0.2.1

Released by @jart on 01 Dec 18:51 · 633 commits to main since this release · commit 57cc1f4

llamafile lets you distribute and run LLMs with a single file. See our README file for documentation and to learn more.

Changes

  • 95703b6 Fix support for old Intel CPUs
  • 401dd08 Add OpenAI API compatibility to server (see the example after this list)
  • e5c2315 Make server open tab in browser on startup
  • 865462f Cherry pick StableLM support from llama.cpp
  • 8f21460 Introduce pledge() / seccomp security to llama.cpp
  • 711344b Fix server so it doesn't consume 100% cpu when idle
  • 12f4319 Add single-client multi-prompt support to server
  • c64989a Add --log-disable flag to server
  • 90fa20f Fix typical sampling (#4261)
  • e574488 Reserve space in decode_utf8
  • 481b6a5 Look for GGML DSO before looking for NVCC
  • 41f243e Check for i/o errors in httplib read_file()
  • ed87fdb Fix uninitialized variables in server
  • c5d35b0 Avoid CUDA assertion error with some models
  • c373b5d Fix LLaVA regression for square images
  • 176e54f Fix server crash when prompt exceeds context size
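
As a rough sketch of the new OpenAI API compatibility (401dd08): assuming the llamafile server is running locally on its default port 8080 and exposes the standard /v1/chat/completions route, a request could look like the following. The port, route, and model name here are assumptions for illustration, not details taken from this release.

```python
# Minimal sketch: query a locally running llamafile server through its
# OpenAI-compatible endpoint. Assumes the server listens on port 8080
# and serves /v1/chat/completions; adjust both to match your setup.
import json
import urllib.request

payload = {
    "model": "local-model",  # placeholder; the server uses whatever model the llamafile bundles
    "messages": [
        {"role": "user", "content": "Say hello in one sentence."}
    ],
    "temperature": 0.7,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```

Because the route follows the OpenAI wire format, existing OpenAI client libraries should also work against the local server once pointed at its base URL.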

Example Llamafiles

Our .llamafiles on Hugging Face have been updated to incorporate these new release binaries. You can re-download them here:

If you have a slower Internet connection and don't want to re-download, then you don't have to! Instructions are here: