CodeLLM is an extensible LLM agent pipeline framework that can be used to build a system of AI agents, tools, and datastores that interact to solve complex problems. It was inspired by the groundcrew project. It currently focuses on code and codebase analysis, but could be extended to other domains.
It is still very much a POC and is not ready for use beyond experimentation.
The system is designed to be extensible and pluggable, and is composed of a few main components:
- CLI: A simple command line interface for interacting with the system.
- Core: The core of the system, responsible for orchestrating the agents and tools.
- Providers: A standard interface to various LLM providers.
- Tools: The tools the agent uses to gather data. Each tool takes a query and returns data; it does not call the provider itself.
- Remix: A web interface that allows you to ask questions about the codebase and see the results.
- VectorDbs: Vector databases that store embeddings of code files and other data.
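The pluggable provider design can be sketched roughly as follows. The names and signatures here are illustrative assumptions for the sketch, not the actual CodeLLM API:

```typescript
// Hypothetical provider interface: each provider adapts one LLM
// backend (ollama, openai, ...) to a common chat method.
interface Provider {
  name: string;
  chat(prompt: string): Promise<string>;
}

// A stub provider used only to demonstrate the pluggable design.
const echoProvider: Provider = {
  name: 'echo',
  chat: async (prompt) => `echo: ${prompt}`,
};

// The core only depends on the Provider interface, so backends can
// be swapped without changing the orchestration logic.
async function ask(provider: Provider, question: string): Promise<string> {
  return provider.chat(question);
}
```

Because tools and the core talk to providers only through an interface like this, adding a new LLM backend means implementing one adapter rather than touching agent logic.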
To run the system you will need the following installed:

- nvm (Node Version Manager)
- docker
The system currently supports the ollama, openai, anthropic, and mistral providers. You will need ollama running locally, or an API key configured for the hosted providers.
```sh
nvm use
npm ci
```
The first step is to start the vector db. This currently starts a local instance of chromadb that listens on http://localhost:8000 and persists data to the `.chromadb` directory in the root of the project.

```sh
npm run start:datastore
```
To stop the datastore:
```sh
npm run stop:datastore
```
To view logs from the datastore:
```sh
npm run logs:datastore
```
If you have git-lfs installed, you can use the archived chromadb to import the embeddings. This is the fastest way to get started.
```sh
npm run extract:chromadb
```
Next, import the embeddings for your codebase. We use a locally running chromadb instance as a vector db. The initial import will take a while; it can be run again to update the embeddings and will only import new and updated files.

By default it imports the TypeScript code from this repository. You can set the CODELLM_IMPORT_PATH environment variable to point to a different codebase, modify the `cli/config.yml` file, or create a new yaml config file and set the CODELLM_CONFIG environment variable to point to it.
```sh
npm run start:import
```
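For example, the import can be pointed at another codebase or config file via the environment variables described above (the paths below are placeholders):

```sh
# Import a different codebase (placeholder path)
CODELLM_IMPORT_PATH=/path/to/your/repo npm run start:import

# Use a custom config file instead (placeholder path)
CODELLM_CONFIG=/path/to/custom-config.yml npm run start:import
```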
Once the import is complete, you can start the CLI. This assumes you have ollama running locally on the default port.

```sh
npm start
```
This assumes you have an API key for anthropic set as an environment variable: `ANTHROPIC_API_KEY`.

```sh
CODELLM_PROVIDER=anthropic npm start
```
This assumes you have an API key for mistral set as an environment variable: `MISTRAL_API_KEY`.

```sh
CODELLM_PROVIDER=mistral npm start
```
This assumes you have an API key for openai set as an environment variable: `OPENAI_API_KEY`.

```sh
CODELLM_PROVIDER=openai npm start
```
The remix app is a simple web interface that allows you to ask questions about the codebase and see the results. It is a work in progress.
```sh
npm run dev
```