Codestral Mamba for VSCode

An API which mocks Llama.cpp to enable support for Codestral Mamba with the Continue Visual Studio Code extension.

As of the time of writing, and to my knowledge, this is the only way to use Codestral Mamba with VSCode locally. To make it work, we implement the /completion REST API from Llama.cpp's HTTP server and configure Continue for VSCode to use our server instead of Llama.cpp's, so every inference request from Continue reaches us. When a request arrives, we pass it to mistral-inference, which runs it with Codestral Mamba. The project works on any platform where mistral-inference can be run.
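To make the idea concrete, here is a minimal sketch of what mocking /completion can look like. It is not the code in llamacpp_mock_api.py: the helper names (fake_generate, complete, sse_events, create_app) are hypothetical, and fake_generate is only a placeholder for the real call into mistral-inference. The sketch assumes the request carries prompt, n_predict and stream fields and that the reply carries a content field, following Llama.cpp's server API.

```python
import json
from typing import Callable, Dict, Iterator

DEFAULT_MAX_TOKENS = 128  # assumption: cap used when the client sends no usable limit


def fake_generate(prompt: str, max_tokens: int) -> str:
    """Hypothetical placeholder for the call into mistral-inference."""
    return "<completion of %d prompt characters>" % len(prompt)


def complete(body: Dict, generate: Callable[[str, int], str] = fake_generate) -> Dict:
    """Translate a /completion request body into a /completion-style reply."""
    prompt = body.get("prompt", "")
    n_predict = int(body.get("n_predict", DEFAULT_MAX_TOKENS))
    if n_predict < 0:  # Llama.cpp clients may send -1 to mean "no limit"
        n_predict = DEFAULT_MAX_TOKENS
    return {"content": generate(prompt, n_predict), "stop": True}


def sse_events(payload: Dict) -> Iterator[str]:
    """Frame a reply as a single server-sent event for streaming clients."""
    yield "data: " + json.dumps(payload) + "\n\n"


def create_app(generate: Callable[[str, int], str] = fake_generate):
    """Wire the helpers above into a Flask app (Flask is imported lazily)."""
    from flask import Flask, Response, jsonify, request

    app = Flask(__name__)

    @app.route("/completion", methods=["POST"])
    def completion():
        body = request.get_json(force=True) or {}
        payload = complete(body, generate)
        if body.get("stream"):
            return Response(sse_events(payload), mimetype="text/event-stream")
        return jsonify(payload)

    return app
```

Keeping the translation in plain functions means it can be checked without starting a server. The sketch sends the whole completion as one event; a real server could instead emit one event per generated chunk.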

Now let's get started!

Setup

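Prerequisites:

  - mistral-inference, set up so that it can run Codestral Mamba on its own.
  - The Continue extension, installed in VSCode.

After you are able to use both independently, we will glue them together with Codestral Mamba for VSCode.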

Steps:

  1. Install Flask in your mistral-inference environment with pip install flask.
  2. Run llamacpp_mock_api.py with python llamacpp_mock_api.py <path_to_codestral_folder_here> under your mistral-inference environment.
  3. Click the settings button at the bottom right of Continue's UI in VSCode and make changes to config.json so it looks like this[archive]. Replace MODEL_NAME with mistral-8x7b.

Restart VSCode or reload the Continue extension and you should now be able to use Codestral Mamba for VSCode!
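
If completions do not show up, one way to narrow the problem down is to call the mock server directly, without Continue in the loop. The snippet below is a rough sketch using only the Python standard library. BASE_URL is an assumption (Flask's default development port), so replace it with the address that llamacpp_mock_api.py prints on startup. The request fields and the content reply field follow Llama.cpp's /completion API and are assumed to be what the mock server accepts and returns.

```python
import json
import urllib.request

# Assumption: Flask's default development address; adjust to your setup.
BASE_URL = "http://127.0.0.1:5000"


def build_completion_request(prompt, n_predict=32, base_url=BASE_URL):
    """Build a POST request shaped like a Llama.cpp /completion call."""
    body = json.dumps(
        {"prompt": prompt, "n_predict": n_predict, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/completion",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the request and return the generated text, if any."""
    req = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(req, timeout=60) as resp:
        reply = json.loads(resp.read().decode("utf-8"))
    return reply.get("content", "")

# Example, with the server from step 2 running:
#   print(ask("def fibonacci(n):"))
```

If ask returns text, the server side is working and the remaining suspects are the Continue configuration or a missing reload.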