
Bamba

🤗 Bamba on Hugging Face | Bamba Blog

Bamba is a repository for training and using Bamba models, which are based on the Mamba architecture.

Installation

Besides PyTorch, you will need a few extra dependencies for Mamba models.

We found some of these dependencies to be picky about PyTorch versions when installed via pip, so if you hit dependency issues in your environment, the most reliable approach is to build all of the Mamba dependencies from source:

git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d && pip install . && cd ..
git clone https://github.com/state-spaces/mamba.git
cd mamba && pip install . && cd ..
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention && pip install . && cd ..
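
To verify that the builds succeeded, you can try importing each package; a quick sanity check, assuming the standard package names for these three projects:

# Sanity check: these imports should succeed after the source builds above.
import causal_conv1d
import mamba_ssm
import flash_attn

print("causal-conv1d:", causal_conv1d.__version__)
print("mamba-ssm:", mamba_ssm.__version__)
print("flash-attn:", flash_attn.__version__)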

Models

Overview

TODO: add model card here

Checkpoints

We have published our model checkpoints here: TODO: add mamba HF page once public

Inference

You can use our newly contributed Hugging Face integration to run inference on our Bamba models:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")
tokenizer = AutoTokenizer.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")

message = ["TODO: find a prompt here"]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)

# Sample up to 100 new tokens with top-k / top-p (nucleus) sampling.
response = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
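
For a 9B-parameter model, full-precision weights can be heavy on memory. If you have a CUDA GPU, a half-precision variant is a reasonable option; a minimal sketch, assuming a CUDA device (bfloat16 is an illustrative choice, not a requirement):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch: load the weights in bfloat16 and move the model to the GPU.
model = AutoModelForCausalLM.from_pretrained(
    "ibm-fms/Avengers-Mamba2-9B-hf", torch_dtype=torch.bfloat16
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")

# Inputs must live on the same device as the model.
inputs = tokenizer(["TODO: find a prompt here"], return_tensors="pt", return_token_type_ids=False).to("cuda")
response = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])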

Training

We trained our Bamba model with FSDP using our training repo here. Note that this training effort started before FSDP2 and long before we contributed Mamba2-Hybrid to HF, so we were doing FSDP1 training with the official Mamba implementation. Users trying to reproduce the training now have many more options with our newly contributed HF version of Mamba2-Hybrid (TODO: add link once live); for orientation, a sketch of FSDP1 wrapping is shown below.
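
The core of FSDP1 wrapping looks roughly like the following. This is a minimal sketch meant to be run under torchrun, not the actual setup from our training repo:

import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

# Sketch: one process per GPU, launched with torchrun.
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Wrap the model so its parameters are sharded across ranks.
model = AutoModelForCausalLM.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")
model = FSDP(model.cuda(), use_orig_params=True)

# A standard training loop (forward, backward, optimizer step) applies from here.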

Fine-tuning

Evaluation
