Releases: NVIDIA/NeMo-Guardrails

Release v0.6.0

13 Dec 21:59
cc598c3

This release builds on the feedback received over the last few months and brings many improvements and new features. It is also the first beta release for NeMo Guardrails. Equally important, this release is the first to include LLM vulnerability scan results for one of the sample bots.

Release highlights include:

  • Better configuration and support for input, output, dialog, retrieval, and execution rails.
  • Ability to reduce the overall latency using single_call mode or embeddings_only mode for dialog rails.
  • Support for streaming.
  • First version of the Guardrails Library.
  • Fast fact-checking using AlignScore.
  • Updated Getting Started guide.
  • Docker image for easy deployment.
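The embeddings_only mode for dialog rails skips the LLM call when mapping a user message to a canonical form and relies on embedding similarity alone. A minimal, self-contained sketch of the idea, using a toy bag-of-words embedding and hypothetical intent examples rather than a real embedding model:

```python
import math

def embed(text):
    """Toy bag-of-words embedding (a stand-in for a real embedding model)."""
    vec = {}
    for word in text.lower().split():
        word = word.strip(".,?!")
        if word:
            vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Canonical forms and their user-message examples (all hypothetical).
examples = {
    "ask about pricing": ["how much does it cost", "what is the price"],
    "greet": ["hello there", "hi how are you"],
}

def nearest_intent(user_message):
    """Map a user message to the canonical form of its most similar example."""
    user_vec = embed(user_message)
    scored = [
        (cosine(user_vec, embed(sentence)), intent)
        for intent, sentences in examples.items()
        for sentence in sentences
    ]
    return max(scored)[1]

print(nearest_intent("hi, how much does this cost?"))  # → ask about pricing
```

In practice the examples would be indexed once up front, so each turn costs only a nearest-neighbor lookup instead of an LLM call.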

Detailed changes are included below.



Changed

  • Allow context data to be passed directly in /v1/chat/completions requests using messages with the "context" role.
  • Allow calling a subflow whose name is stored in a variable, e.g., do $some_name.
  • Allow using actions that are not async functions.
  • Disabled pretty exceptions in CLI.
  • Upgraded dependencies.
  • Updated the Getting Started Guide.
  • Main README now provides more details.
  • Consolidated the original examples into a single ABC Bot and removed the individual ones.
  • Documentation improvements.
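The support for non-async actions can be pictured as wrapping a plain function so the runtime can await it without blocking the event loop. The helper below is a sketch, not the library's internal mechanism, and the action name is hypothetical:

```python
import asyncio
import inspect

def ensure_async(action):
    """Wrap a plain function so it can be awaited like an async action.

    Sync functions are pushed to a worker thread so the event loop stays free.
    """
    if inspect.iscoroutinefunction(action):
        return action

    async def wrapper(*args, **kwargs):
        return await asyncio.to_thread(action, *args, **kwargs)

    return wrapper

# A non-async action, as a user might register one (name is hypothetical).
def check_blocked_terms(text: str) -> bool:
    return "forbidden" in text.lower()

async def main():
    action = ensure_async(check_blocked_terms)
    return await action("This is a Forbidden topic")

print(asyncio.run(main()))  # → True
```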

Fixed

  • Fixed going over the maximum prompt length by using the max_length attribute in prompt templates.
  • Fixed a problem with nest_asyncio initialization.
  • #144 Fixed TypeError in logging call.
  • #121 Fixed chat model detection when using the openai engine.
  • #109 Fixed minor logging issue.
  • Fixed parallel flow support.
  • Fixed a HuggingFacePipeline bug related to the LangChain version upgrade.

Release v0.5.0

04 Sep 20:36
cb07be6
Pre-release

This release adds support for custom embedding search providers (beyond the default Annoy/SentenceTransformers) and for OpenAI embeddings in the default embedding search provider. It also includes an advanced example of using multiple knowledge bases (i.e., a tabular one and a regular one), fixes a long-standing issue with using the generate method inside an async environment (e.g., a notebook), and includes multiple small fixes. Detailed change log below.
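A custom embedding search provider essentially supplies an index that can be populated with text items and queried by similarity. The class below sketches that shape with an in-memory list and a toy character-frequency embedding; the method names and item format are assumptions for illustration, not the exact provider API:

```python
import math

class SimpleEmbeddingSearchProvider:
    """In-memory sketch of an embedding search provider."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.items = []  # list of (text, vector) pairs

    def add_items(self, texts):
        for text in texts:
            self.items.append((text, self.embed_fn(text)))

    def search(self, query, top_k=1):
        query_vec = self.embed_fn(query)
        ranked = sorted(self.items, key=lambda it: -self._cosine(query_vec, it[1]))
        return [text for text, _ in ranked[:top_k]]

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def toy_embed(text):
    """Character-frequency vector, a stand-in for a real embedding model."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

index = SimpleEmbeddingSearchProvider(toy_embed)
index.add_items(["refund policy details", "shipping times overview"])
print(index.search("how do refunds work?"))  # → ['refund policy details']
```

A real provider would swap the list for an approximate-nearest-neighbor index and the toy embedding for a model such as SentenceTransformers or OpenAI embeddings.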



Changed

  • Moved to using nest_asyncio for implementing the blocking API. Fixes #3 and #32.
  • Improved event property validation in new_event_dict.
  • Refactored imports so the package can be installed from source without Annoy/SentenceTransformers (a custom embedding search provider is then required).
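The blocking API pattern that nest_asyncio enables can be sketched as a sync generate method delegating to an async generate_async. The minimal version below handles the straightforward cases; inside an already-running event loop (e.g., a notebook), nest_asyncio.apply() is what makes re-entering the loop possible. The class and response text are illustrative, not the library's implementation:

```python
import asyncio

class Rails:
    """Sketch of a blocking API delegating to an async implementation."""

    async def generate_async(self, prompt: str) -> str:
        await asyncio.sleep(0)  # stand-in for real async LLM calls
        return f"response to: {prompt}"

    def generate(self, prompt: str) -> str:
        try:
            loop = asyncio.get_running_loop()
        except RuntimeError:
            # No loop running: safe to start one.
            return asyncio.run(self.generate_async(prompt))
        # A loop is already running (e.g., a notebook); re-entering it
        # requires nest_asyncio.apply() to patch the loop first.
        return loop.run_until_complete(self.generate_async(prompt))

print(Rails().generate("hello"))  # → response to: hello
```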

Fixed

  • Fixed the point at which the init function from config.py is called, so that custom LLM providers can be registered inside it.
  • #93: Removed redundant hasattr check in nemoguardrails/llm/params.py.
  • #91: Fixed how default context variables are initialized.

Release v0.4.0

03 Aug 14:44
12c49bb
Pre-release

This release focused on multiple areas:

  1. Extending the guardrails interface to support generic events.
  2. Adding experimental support for running a red teaming process.
  3. Adding experimental support for vicuna-7b-v1.3 and mpt-7b-instruct.
  4. Extending Colang 1.0 with support for bot message instructions and using variables inside bot message definitions.
  5. Fixing several bugs reported by the community.
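Generic events can be pictured as plain dictionaries carrying a type, a unique id, a timestamp, and an arbitrary payload. The helper below is written in the spirit of the internal new_event_dict helper and the UMIM-style event names this release moves toward; the exact field names are assumptions for illustration:

```python
import uuid
from datetime import datetime, timezone

def new_event_dict(event_type: str, **payload) -> dict:
    """Build a generic event dict: type, unique id, timestamp, plus payload."""
    event = {
        "type": event_type,
        "uid": str(uuid.uuid4()),
        "event_created_at": datetime.now(timezone.utc).isoformat(),
    }
    event.update(payload)
    return event

# UMIM-style event name; the payload field name is a hypothetical example.
event = new_event_dict("UtteranceUserActionFinished", final_transcript="hello")
print(event["type"])  # → UtteranceUserActionFinished
```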

Detailed change log below.



Changed

  • Changed the naming of the internal events to align to the upcoming UMIM spec (Unified Multimodal Interaction Management).
  • If there are no user message examples, the bot message example lookup is disabled as well.

Fixed

  • #58: Fixed install on macOS 13.
  • #55: Fix bug in example causing config.py to crash on computers with no CUDA-enabled GPUs.
  • Fixed the model name initialization for LLMs that use the model kwarg.
  • Fixed the Cohere prompt templates.
  • #55: Fixed a bug related to LangChain callbacks initialization.
  • Fixed generation of "..." on value generation.
  • Fixed the parameters type conversion when invoking actions from colang (previously everything was string).
  • Fixed model_kwargs property for the WrapperLLM.
  • Fixed bug when stop was used inside flows.
  • Fixed Chat UI bug when an invalid guardrails configuration was used.

Release v0.3.0

30 Jun 17:30
40889af
Pre-release

This release focuses on enhancing the support to integrate additional LLMs with NeMo Guardrails. It adds the ability to customize the prompt for various LLMs, including support for completion and chat models. This release adds examples for using the HuggingFace pipeline and inference endpoints. Last but not least, this release provides an initial evaluation of the core prompting technique and some of the rails.

Added

  • Support for defining subflows.
  • Improved support for customizing LLM prompts
    • Support for using filters to change how variables are included in a prompt template.
    • Output parsers for prompt templates.
    • The verbose_v1 formatter and output parser, intended for smaller models that don't understand Colang well in a few-shot setting.
    • Support for including context variables in prompt templates.
    • Support for chat models, i.e., prompting with a sequence of messages.
  • Experimental support for allowing the LLM to generate multi-step flows.
  • Example of using Llama Index from a guardrails configuration (#40).
  • Example for using HuggingFace Endpoint LLMs with a guardrails configuration.
  • Example for using HuggingFace Pipeline LLMs with a guardrails configuration.
  • Support to alter LLM parameters passed as model_kwargs in LangChain.
  • CLI tool for running evaluations on the different steps (e.g., canonical form generation, next steps, bot message) and on existing rails implementation (e.g., moderation, jailbreak, fact-checking, and hallucination).
  • Initial evaluation results for text-davinci-003 and gpt-3.5-turbo.
  • The lowest_temperature can be set through the guardrails config (to be used for deterministic tasks).
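The prompt-customization features above (per-task templates, filters, and context variables) can be sketched with a tiny renderer. The shipped templates are rendered with Jinja2; the task name, filter, and template below are illustrative stand-ins in plain Python:

```python
# A filter changes how a variable is included in the prompt template.
# "last_turns" is a hypothetical filter, not one shipped with the library.
FILTERS = {
    "last_turns": lambda history, n=2: history[-n:],
}

# One template per task; the task name mirrors the canonical-form
# generation step but the template text itself is made up.
PROMPTS = {
    "generate_user_intent": (
        "Conversation so far:\n{history}\n"
        "User message: {user_message}\n"
        "Canonical form:"
    ),
}

def render(task, context):
    """Render the template for a task, applying a filter to one variable."""
    template = PROMPTS[task]
    history = "\n".join(FILTERS["last_turns"](context["history"]))
    return template.format(history=history, user_message=context["user_message"])

prompt = render(
    "generate_user_intent",
    {
        "history": ["bot: hi", "user: hello", "bot: how can I help?"],
        "user_message": "what can you do?",
    },
)
print(prompt.splitlines()[0])  # → Conversation so far:
```

Output parsers would then run over the LLM's completion for the task, turning raw text back into a structured result.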

Changed

  • The core templates now use Jinja2 as the rendering engine.
  • Improved the internal prompting architecture, now using an LLM Task Manager.

Fixed

  • Fixed bug related to invoking a chain with multiple output keys.
  • Fixed bug related to tracking the output stats.
  • #51: Bug fix - avoid str concat with None when logging user_intent.
  • #54: Fix UTF-8 encoding issue and add embedding model configuration.

Release v0.2.0

01 Jun 18:47
95032c1
Pre-release
Update CHANGELOG and setup.py.