An exploration of JavaScript text splitters.
When building a Retrieval-Augmented Generation (RAG) app, one of the most important steps is getting your data AI-ready. Part of that process is known as "chunking": breaking large blocks of text or unstructured data into smaller chunks. Read more about why chunking is important and what to consider here.
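To make the idea concrete, here is a minimal sketch of the simplest form of chunking: fixed-size character windows with an optional overlap. The function name `chunkText` and its parameters `chunkSize` and `overlap` are hypothetical and are not taken from this project or from any of the libraries it explores, which offer more sophisticated strategies.

```javascript
// Naive fixed-size chunker, for illustration only.
// chunkText, chunkSize, and overlap are hypothetical names, not part of this project.
function chunkText(text, chunkSize = 200, overlap = 20) {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be a non-negative integer smaller than chunkSize");
  }

  const chunks = [];
  // Each new chunk starts (chunkSize - overlap) characters after the previous one.
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

// Usage: 4-character chunks that share 1 character with their neighbour.
console.log(chunkText("abcdefghij", 4, 1)); // [ 'abcd', 'defg', 'ghij' ]
```

Splitting on raw character counts can cut words and sentences in half, which is one reason to reach for a dedicated splitter rather than a loop like this one.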
In the JavaScript world, there are a few libraries that can help you chunk your data. This project is an exploration of those tools, and you can read the write-up in the blog post on how to chunk text in JavaScript for your RAG application.
This is a Next.js application that allows you to experiment with four JavaScript tools that provide different text chunking capabilities. The tools are:
First, clone this repo:
git clone https://github.com/philnash/chunkers.git
cd chunkers
Install the dependencies:
npm install
Then, run the development server:
npm run dev
Open http://localhost:3000 in your browser to see the result.