Releases: nomic-ai/gpt4all
v3.0.0
What's New
- Complete UI overhaul (#2396)
- LocalDocs improvements (#2396)
- Use nomic-embed-text-v1.5 as local model instead of SBert
- Ship local model with application instead of downloading afterwards
- Store embeddings flat in SQLite DB instead of in hnswlib index
- Do exact KNN search with usearch instead of approximate KNN search with hnswlib
- Markdown support (#2476)
- Support CUDA/Metal device option for embeddings (#2477)
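The LocalDocs storage and search changes above (flat embeddings in SQLite, exact KNN instead of an approximate hnswlib index) can be sketched roughly as follows. This is an illustrative Python sketch, not GPT4All's actual code (which is C++ and uses usearch); the schema and helper names are assumptions.

```python
import math
import sqlite3
import struct

# Hypothetical flat embedding store: one row per text chunk,
# with the vector packed directly into a BLOB column.
def pack(vec):
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob):
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def exact_knn(db, query, k):
    # Exact KNN: score every stored vector and keep the top k.
    # O(n) per query, but exact -- no recall loss from an approximate index.
    rows = db.execute("SELECT chunk_id, embedding FROM embeddings").fetchall()
    scored = [(cosine(query, unpack(blob)), cid) for cid, blob in rows]
    scored.sort(reverse=True)
    return [cid for _, cid in scored[:k]]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE embeddings (chunk_id INTEGER PRIMARY KEY, embedding BLOB)")
for cid, vec in [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [0.9, 0.1])]:
    db.execute("INSERT INTO embeddings VALUES (?, ?)", (cid, pack(vec)))

print(exact_knn(db, [1.0, 0.0], 2))  # chunk 1 matches exactly, chunk 3 is close
```

For document-collection sizes typical of LocalDocs, a brute-force exact scan like this is cheap, which is one reason an approximate index is unnecessary overhead.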
Fixes
- Fix embedding tokenization after #2310 (#2381)
- Fix a crash when loading certain models with "code" in their name (#2382)
- Fix an embedding crash with large chunk sizes after #2310 (#2383)
- Fix inability to load models with non-ASCII path on Windows (#2388)
- CUDA: Do not show non-fatal DLL errors on Windows (#2389)
- LocalDocs fixes (#2396)
- Always use requested number of snippets even if there are better matches in unselected collections
- Check for deleted files on startup
- CUDA: Fix PTX errors with some GPT4All builds (#2421)
- Fix blank device in UI after model switch and improve usage stats (#2409)
- Use CPU instead of CUDA backend when GPU loading fails the first time (ngl=0 is not enough) (#2477)
- Fix crash when sending a message greater than n_ctx tokens after #1970 (#2498)
New Contributors
- @woheller69 made their first contribution in #2339
- @patcher9 made their first contribution in #2386
- @sunsided made their first contribution in #2414
- @johnwparent made their first contribution in #2319
- @mcembalest made their first contribution in #2488
Full Changelog: v2.8.0...v3.0.0
v2.8.0
What's New
- Context Menu: Replace "Select All" on message with "Copy Message" (#2324)
- Context Menu: Hide Copy/Cut when nothing is selected (#2324)
- Improve speed of context switch after quickly switching between several chats (#2343)
- New Chat: Always switch to the new chat when the button is clicked (#2330)
- New Chat: Always scroll to the top of the list when the button is clicked (#2330)
- Update to latest llama.cpp as of May 9, 2024 (#2310)
- Add support for the llama.cpp CUDA backend (#2310, #2357)
- Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings
- When in use: Greatly improved prompt processing and generation speed on some devices
- When in use: GPU support for Q5_0, Q5_1, Q8_0, K-quants, I-quants, and Mixtral
- Add support for InternLM models (#2310)
Fixes
- Do not allow sending a message while the LLM is responding (#2323)
- Fix poor quality of generated chat titles with many models (#2322)
- Set the window icon correctly on Windows (#2321)
- Fix a few memory leaks (#2328, #2348, #2310)
- Do not crash if a model file has no architecture key (#2346)
- Fix several instances of model loading progress displaying incorrectly (#2337, #2343)
- New Chat: Fix the new chat being scrolled above the top of the list on startup (#2330)
- macOS: Show a "Metal" device option, and actually use the CPU when "CPU" is selected (#2310)
- Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (#2310)
- Fix GPT4All.desktop being created by offline installers on macOS (#2361)
Full Changelog: v2.7.5...v2.8.0
v2.8.0-pre1
What's New
- Context Menu: Replace "Select All" on message with "Copy Message" (#2324)
- Context Menu: Hide Copy/Cut when nothing is selected (#2324)
- Improve speed of context switch after quickly switching between several chats (#2343)
- New Chat: Always switch to the new chat when the button is clicked (#2330)
- New Chat: Always scroll to the top of the list when the button is clicked (#2330)
- Update to latest llama.cpp as of May 9, 2024 (#2310)
- Add support for the llama.cpp CUDA backend (#2310)
- Nomic Vulkan is still used by default, but CUDA devices can now be selected in Settings
- When in use: Greatly improved prompt processing and generation speed on some devices
- When in use: GPU support for Q5_0, Q5_1, Q8_0, K-quants, I-quants, and Mixtral
- Add support for InternLM models (#2310)
Fixes
- Do not allow sending a message while the LLM is responding (#2323)
- Fix poor quality of generated chat titles with many models (#2322)
- Set the window icon correctly on Windows (#2321)
- Fix a few memory leaks (#2328, #2348, #2310)
- Do not crash if a model file has no architecture key (#2346)
- Fix several instances of model loading progress displaying incorrectly (#2337, #2343)
- New Chat: Fix the new chat being scrolled above the top of the list on startup (#2330)
- macOS: Show a "Metal" device option, and actually use the CPU when "CPU" is selected (#2310)
- Remove unsupported Mamba, Persimmon, and PLaMo models from the whitelist (#2310)
Full Changelog: v2.7.5...v2.8.0-pre1
v2.7.5
Fixes
- Fix some issues with anonymous usage statistics (#2270, #2296)
- Default to GPU with most VRAM on Windows and Linux, not least (#2297)
- Fix initial failure to generate embeddings with Nomic Embed (#2284)
Full Changelog: v2.7.4...v2.7.5
v2.7.4
What's New
- Add a right-click menu to the chat (by @kryotek777 in #2108)
- Change the left sidebar to stay open (#2117)
- Limit the width of text in the chat (#2118)
- Move to llama.cpp's SBert implementation (#2086)
- Support models provided by the Mistral AI API (by @Olyxz16 in #2053)
- Models List: Add Ghost 7B v0.9.1 (by @lh0x00 in #2127)
- Add Documentation and FAQ links to the New Chat page (by @3Simplex in #2183)
- Models List: Simplify Mistral OpenOrca system prompt (#2220)
- Models List: Add Llama 3 Instruct (#2242)
- Models List: Add Phi-3 Mini Instruct (#2252)
- Improve accuracy of anonymous usage statistics (#2238)
Fixes
- Detect unsupported CPUs correctly on Windows (#2141)
- Fix the colors used by the server chat (#2150)
- Fix startup issues when encountering non-Latin characters in paths (#2162)
- Fix issues causing LocalDocs context links to not work sometimes (#2218)
- Fix incorrect display of certain code block syntax in the chat (#2232)
- Fix an issue causing unnecessary indexing of document collections on startup (#2236)
New Contributors
- @kryotek777 made their first contribution in #2108
- @xuzhen made their first contribution in #1928
- @Olyxz16 made their first contribution in #2053
- @bentleylong made their first contribution in #2138
- @Tim453 made their first contribution in #2136
- @lh0x00 made their first contribution in #2127
- @robinverduijn made their first contribution in #2180
- @3Simplex made their first contribution in #2183
Full Changelog: v2.7.3...v2.7.4
v2.7.3
What's New
- Groundwork for "removedIn" field of models3.json based on unused "deprecated" field (#2063)
- Implement warning dialog for old Mistral OpenOrca (#2034)
- Make deleting a chat significantly faster (#2081)
- New, smaller MPT model without duplicated tensor (#2006)
- Make API server port configurable by @danielmeloalencar in #1640
- Show settings for the currently selected model by default (by @chrisbarrera in #2099, e2f64f8)
- Keep installed models in the list when searching for models (5ed9aea)
Fixes
- Fix "Download models" button not appearing on some Linux systems (#2040)
- Fix undefined behavior in ChatLLM::resetContext (#2041)
- Fix ChatGPT restore from text after v2.7.1 (#2051 part 1)
- Fix ChatGPT using context from messages that are no longer in history (#2051 part 2)
- Fix TypeError warnings on exit (#2043)
- Fix startup speed regression from v2.7.2 (#2089, #2094)
- Do not show SBert and non-GUI models in choices for "Default model" (#2095)
- Do not list cloned models on downloads page (#2090)
- Do not attempt to show old, deleted models in downloads list (#2098)
- Fix inability to cancel model download (#2107)
New Contributors
- @danielmeloalencar made their first contribution in #1640
- @johannesploetner made their first contribution in #1979
Full Changelog: v2.7.2...v2.7.3
v2.7.2
What's New
- Model Discovery: Discover new LLMs from HuggingFace, right from GPT4All! (83c76be)
- Support GPU offload of Gemma's output tensor (#1997)
- Enable Kompute support for 10 more model architectures (#2005)
- These are Baichuan, Bert and Nomic Bert, CodeShell, GPT-2, InternLM, MiniCPM, Orion, Qwen, and StarCoder.
- Expose min_p sampling parameter of llama.cpp by @chrisbarrera in #2014
- Default to a blank line between reply and next prompt for templates without `%2` (#1996)
- Add Nous-Hermes-2-Mistral-7B-DPO to official models list by @ThiloteE in #2027
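The min_p parameter exposed above follows llama.cpp's min-p sampling rule: tokens whose probability falls below min_p times the most likely token's probability are discarded before sampling, then the remainder is renormalized. A minimal sketch of that rule (the helper names here are illustrative, not GPT4All's API):

```python
import random

def min_p_filter(probs, min_p):
    # Keep only tokens whose probability is at least min_p * p_max,
    # then renormalize -- the core of llama.cpp-style min-p sampling.
    p_max = max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= min_p * p_max}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

def sample(probs, min_p, rng=random):
    filtered = min_p_filter(probs, min_p)
    toks, weights = zip(*filtered.items())
    return rng.choices(toks, weights=weights, k=1)[0]

probs = {"the": 0.5, "a": 0.3, "zebra": 0.01}
print(min_p_filter(probs, 0.1))  # "zebra" (0.01 < 0.1 * 0.5) is dropped
```

Unlike a fixed top_p cutoff, the threshold scales with the model's confidence: when the top token is very likely, more of the low-probability tail is pruned.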
Fixes
- Fix compilation warnings on macOS (e7f2ff1)
- Fix crash when ChatGPT API key is set, and hide non-ChatGPT settings properly (#2003)
- Fix crash when adding/removing a clone - a regression in v2.7.1 (#2031)
- Fix layer norm epsilon value in BERT model (#1946)
- Fix clones being created with the wrong number of GPU layers (#2011)
New Contributors
- @TareHimself made their first contribution in #1897
Full Changelog: v2.7.1...v2.7.2
v2.7.1
What's Changed
- Completely revamp model loading to support explicit unload/reload (#1969)
- We no longer load a model by default on application start
- We no longer load a model by default on chat context switch
- Save and restore of window geometry across application starts (#1989)
- Update to latest llama.cpp as of 2/21/2024 and add CPU/GPU support for Gemma (#1992)
- Also enable Vulkan GPU support for Phi and Phi-2, Qwen2, and StableLM
Fixes
- Fix visual artifact in update reminder dialog (16927d9)
- Blacklist Intel GPUs as they are still not supported (a1471be, nomic-ai/llama.cpp#14)
- Improve chat save/load speed (excluding startup/shutdown with defaults) (6fdec80, nomic-ai/llama.cpp#15)
- Significantly improve handling of chat-style prompt templates, and reupload Mistral OpenOrca (#1970, #1993)
Full Changelog: v2.7.0...v2.7.1
v2.7.0
What's Changed
- Add 12 new model architectures for CPU and Metal inference (#1914)
- These are Baichuan, BLOOM, CodeShell, GPT-2, Orion, Persimmon, Phi and Phi-2, Plamo, Qwen, Qwen2, Refact, and StableLM.
- We don't have official downloads for these yet, but TheBloke offers plenty of compatible GGUF quantizations.
- Restore minimum window size of 720x480 (1b524c4)
- Use ChatML for Mistral OpenOrca to make its output formatting more reliable (#1935)
Bug Fixes
- Fix VRAM not being freed when CPU fallback occurs - this makes switching models more reliable (#1901)
- Disable offloading of Mixtral to GPU because we crash otherwise (#1931)
- Limit extensions scanned by LocalDocs to txt, pdf, md, rst - other formats were inserting useless binary data (#1912)
- Fix missing scrollbar for chat history (490404d)
- Accessibility improvements (4258bb1)
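The LocalDocs extension restriction above amounts to a simple suffix whitelist; other formats were being indexed as useless binary data. A minimal sketch of the idea (names here are hypothetical, not GPT4All's code):

```python
from pathlib import Path

# Hypothetical whitelist mirroring the LocalDocs restriction described above.
ALLOWED_SUFFIXES = {".txt", ".pdf", ".md", ".rst"}

def is_indexable(path):
    # Case-insensitive suffix check so README.TXT is treated like readme.txt.
    return Path(path).suffix.lower() in ALLOWED_SUFFIXES

print([p for p in ["notes.md", "doc.PDF", "image.png"] if is_indexable(p)])
```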
Full Changelog: v2.6.2...v2.7.0
v2.5.1
(The previous release of this version was mis-tagged - the release has been re-created and installers have been reuploaded.)
What's Changed
- Removed extraneous and shortened text in accessibility name and description fields by @vick08 in #1532
- Fix an issue on Windows where the UI wouldn't start, or stayed open in the background after closing it (#1556)
Full Changelog: v2.5.0...v2.5.1