Llamaste is a Ruby C extension that wraps llama.cpp, enabling use of the LLaMA text model from inside Ruby.
Add to your Gemfile using Bundler:
gem 'llamaste', git: 'https://github.com/jtp184/llamaste.git'
or clone the repository and install it globally with rake compile install
# Configure model params like filepath, tokens to generate, context size, etc.
# These are the defaults (apart from model)
params = {
  model: './models/30B/ggml-model-q4_0.bin',
  seed: Time.now.to_i,
  threads: Etc.nprocessors, # n_threads
  tokens: 128, # n_predict
  penalize_tokens: 64, # repeat_last_n
  parts: 1, # n_parts
  context_size: 512, # n_ctx
  top_k: 40,
  top_p: 0.95,
  temperature: 0.8, # temp
  repeat_penalty: 1.1,
  batch_size: 8, # n_batch
  memory_lock: false, # use_mlock
  memory_f16: false,
  embedding: false,
  use_mmap: true,
  lora_base: nil,
  lora_adapter: nil
}
@model = Llamaste::Model.new(params)
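Since the values above are the defaults, you will usually only override a handful of keys. A minimal sketch, assuming unspecified keys fall back to the defaults above (an assumption about the constructor; pass the full hash if in doubt):
# Override only the model path and the sampling knobs;
# everything else is assumed to keep its default value
@model = Llamaste::Model.new(
  model: './models/7B/ggml-model-q4_0.bin',
  tokens: 256,
  temperature: 0.7
)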
# Load model into memory
@model.load_model
# Optionally choose a different model
@model.load_model('./models/7B/model.bin')
text_input = 'It was a dark and stormy night'
token = @model.tokenize(text_input)
# =>
# <Llamaste::TokenGroup:0x00007f564a6a9b48
#  @string="It was a dark and stormy night",
#  @tokens=
#   [["It", 3112],
#    [" was", 471],
#    [" a", 263],
#    [" dark", 6501],
#    [" and", 322],
#    [" storm", 14280],
#    ["y", 29891],
#    [" night", 4646]]
# >
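Each entry pairs a piece of surface text with its token id. A quick sketch of working with the result, assuming @tokens is exposed through a tokens reader (an assumption; check Llamaste::TokenGroup for the actual accessor) and using the params reader shown further below:
# Token ids alone
ids = token.tokens.map(&:last) # => [3112, 471, 263, 6501, 322, 14280, 29891, 4646]
# Rough budget check: prompt tokens plus tokens to generate should fit the context
warn 'prompt may not fit' if token.tokens.size + @model.params[:tokens] > @model.params[:context_size]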
# Generate text from tokens or a string; returns the generated string.
# Providing a block yields a string for each generated token
@model.call(token) { |tkn| print tkn }
# => ", and I was on a plane headed for somewhere, but I didn’t know where."
# Set strings to break early on
@model.call('When it is raining I need to bring my', break_on: ["\n"])
# => " umbrella."
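Because the block yields each token as it is produced, streaming to the terminal and collecting the full output can be combined. A minimal sketch using only the block interface shown above:
output = +''
@model.call('When it is raining I need to bring my', break_on: ["\n"]) do |tkn|
  print tkn     # stream tokens as they arrive
  output << tkn # and keep the full completion
end
output # => " umbrella."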
Evaluate a prompt, then save the context out as a binary string
# Save to string
File.open('cache.bin', 'w+b') { |f| f << @model.cache_prompt('Something wicked this way') }
# Resume evaluating prompt from binary string
@model.resume_prompt('Something wicked this way', File.binread('cache.bin'))
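Caching pays off when the same prompt prefix is evaluated repeatedly. A small helper sketch built only on cache_prompt and resume_prompt; the helper itself and its cache path are hypothetical:
# Hypothetical convenience wrapper: reuse an on-disk cache when one exists
def resume_or_cache(model, prompt, path: 'cache.bin')
  if File.exist?(path)
    model.resume_prompt(prompt, File.binread(path))
  else
    File.open(path, 'w+b') { |f| f << model.cache_prompt(prompt) }
  end
end

resume_or_cache(@model, 'Something wicked this way')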
Return an embedding for an input prompt
# Need to set embedding mode and reload context to embed tokens
@model.params[:embedding] = true
@model.load_model
# Create embedding
@model.embed('Pineapple on pizza is') # => TextEmbedding
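A common next step is comparing two embeddings. A cosine-similarity sketch, assuming the underlying float vector is reachable via a hypothetical to_a (an assumption; check TextEmbedding for the real accessor):
# Cosine similarity over two plain float arrays
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

pizza = @model.embed('Pineapple on pizza is').to_a # to_a is an assumption
salad = @model.embed('Pineapple in salad is').to_a
cosine_similarity(pizza, salad) # => nearer 1.0 means more similar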
# Cluster embeddings into k groups: supply an array of embeddings, k,
# and an optional max_iterations for the clustering loop
e = EmbeddingCluster.call(
  ['Oxygen is a', 'Nitrogen is a', 'Iron is a', 'Copper is a'].map { |t| @model.embed(t) },
  2,
  500
)
e.clusters.map { |c| c.map(&:to_str) }
# => [["Oxygen is a", "Iron is a", "Copper is a"], ["Nitrogen is a"]]
Bug reports, feature interest, and pull requests are welcome on GitHub at https://github.com/jtp184/llamaste.
- Remain up to date with llama.cpp
- Increase feature offering and configurability
- Improve the quality of the extension's C++
- Simple ChatGPT clone running locally via Rails
- Georgi Gerganov's amazing llama.cpp code; this gem would not be possible without it
- Meta's LLaMA model
- The Ruby C API