Computing sentence categories using Large Language Models and Formulaic

A summary of interesting statistics is available on the Statistics page.

Quick start:

Looking to run the full benchmarking suite? Here's how:

git clone project locally
ensure node is installed/accessible
create a Formulaic account
  a. create a Formulaic API key
copy env-template to a new file named .env
add env values (e.g. API key)
run npm install to install dependencies
run npm start to fetch all the data and store it in a SQLite database
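
Before running npm start, it can help to confirm that every key listed in env-template has a value in .env. The sketch below is a hypothetical helper, not part of the project, and it assumes both files use plain KEY=VALUE lines (the usual dotenv layout). It does not know or assume the actual key names.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY="abc" and KEY=abc compare equal.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_env_keys(template_text: str, env_text: str) -> list[str]:
    """Return template keys that are absent or empty in the .env text."""
    env = parse_env(env_text)
    return [key for key in parse_env(template_text) if not env.get(key)]


if __name__ == "__main__":
    template, env_file = Path("env-template"), Path(".env")
    # Only run the check when both files are present in the working directory.
    if template.exists() and env_file.exists():
        missing = missing_env_keys(template.read_text(), env_file.read_text())
        print("Missing values:", ", ".join(missing) if missing else "none")
```

Run it from the project root after copying env-template to .env. Any key it reports still needs a value.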

Start using data (no keys required)

Computed results (i.e. the data from the LLMs) are stored in ./cv-sentence.db, a SQLite file that can be accessed using any SQLite-compatible client.

The database has a table called sentence_domains with a schema like:

value TEXT
user_domain TEXT
total_tokens INTEGER
model TEXT
id INTEGER
created_at DATE
computed_domain TEXT
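
Since any SQLite-compatible client works, here is a minimal sketch that queries the table with Python's built-in sqlite3 module. The function names are hypothetical. It also assumes that user_domain holds the user-supplied label and computed_domain holds the LLM's label, which the column names suggest but the schema does not confirm.

```python
import os
import sqlite3

DB_PATH = "./cv-sentence.db"  # path given in this README


def domain_counts(conn: sqlite3.Connection) -> list[tuple]:
    """Count rows per (model, computed_domain), most frequent first."""
    return conn.execute(
        """
        SELECT model, computed_domain, COUNT(*) AS n
        FROM sentence_domains
        GROUP BY model, computed_domain
        ORDER BY n DESC, model, computed_domain
        """
    ).fetchall()


def agreement_rate(conn: sqlite3.Connection) -> float:
    """Fraction of rows where user_domain equals computed_domain.

    Assumption: the two columns are directly comparable labels.
    Rows with a NULL in either column count as non-matching.
    """
    total, matches = conn.execute(
        "SELECT COUNT(*), SUM(user_domain = computed_domain) FROM sentence_domains"
    ).fetchone()
    return (matches or 0) / total if total else 0.0


if __name__ == "__main__" and os.path.exists(DB_PATH):
    with sqlite3.connect(DB_PATH) as conn:
        for model, domain, n in domain_counts(conn):
            print(f"{model}\t{domain}\t{n}")
        print(f"agreement: {agreement_rate(conn):.1%}")
```

Both functions take an open connection, so they can also be pointed at a copy of the database or at an in-memory test table.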

Recreate Statistics.md

New data? Get new insights by running npm run generate-stats, which creates an easy-to-read Markdown file with a series of tables (see queries/).
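
To illustrate the general idea of turning a query result into a Markdown table, here is a small Python sketch. It is not the project's generate-stats script (that one runs through npm), and the function names and the example query are assumptions made for illustration.

```python
import sqlite3


def to_markdown_table(headers, rows):
    """Render column headers and row tuples as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)


def query_to_markdown(conn: sqlite3.Connection, sql: str) -> str:
    """Run a SELECT statement and format its result as a Markdown table."""
    cursor = conn.execute(sql)
    # cursor.description holds one 7-tuple per column; item 0 is the name.
    headers = [col[0] for col in cursor.description]
    return to_markdown_table(headers, cursor.fetchall())
```

For example, calling query_to_markdown with a connection to ./cv-sentence.db and a query such as SELECT model, SUM(total_tokens) AS tokens FROM sentence_domains GROUP BY model would return one table row per model.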


Mozilla-Ocho/multilingual-sentence-evals