Chunkers

An exploration of JavaScript text splitters.

What is chunking?

When building a Retrieval-Augmented Generation (RAG) app, one of the most important tasks is getting your data AI-ready. One step in that process is "chunking": breaking large blocks of text or unstructured data into smaller chunks. Read more about why chunking is important and what to consider here.
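To make the idea concrete, here is a minimal sketch of the simplest form of chunking: fixed-size character windows with a small overlap between neighbours. The function name `chunkText` and its `chunkSize` and `overlap` parameters are hypothetical, chosen for illustration; this is not how any of the libraries explored in this project implement chunking.

```javascript
// Hypothetical helper for illustration only; not taken from this project
// or from any of the libraries it explores.
function chunkText(text, chunkSize = 200, overlap = 20) {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be at least 0 and less than chunkSize");
  }

  const chunks = [];
  // Each new chunk starts `step` characters after the previous one,
  // so consecutive chunks share `overlap` characters.
  const step = chunkSize - overlap;

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

console.log(chunkText("abcdefghij", 4, 1));
// [ 'abcd', 'defg', 'ghij' ]
```

Splitting on raw character counts can cut words and sentences in half, which is why more capable splitters typically try to break on natural boundaries such as paragraphs or sentences.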

In the JavaScript world, there are a few libraries that can help you chunk your data. This project is an exploration of those tools, and you can read the write-up in the blog post on how to chunk text in JavaScript for your RAG application.

The project

This is a Next.js application that allows you to experiment with four JavaScript tools that provide different text chunking capabilities. The tools are:

Running the project

First, clone this repo:

git clone https://github.com/philnash/chunkers.git
cd chunkers

Install the dependencies:

npm install

Then, run the development server:

npm run dev

Open http://localhost:3000 in your browser to see the result.