🤗 Bamba on Hugging Face
Bamba is a repository for training and using Bamba models, which are built on the Mamba2 architecture.
Besides PyTorch, Mamba models need a few extra dependencies. We found some of these dependencies to be picky about PyTorch versions when installed via pip, so if you hit dependency issues in your environment, the most reliable approach is to build all of the Mamba dependencies from source:
git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d && pip install . && cd ..
git clone https://github.com/state-spaces/mamba.git
cd mamba && pip install . && cd ..
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention && pip install . && cd ..
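After building, a quick way to confirm that the compiled extensions work against your installed PyTorch is simply to import them. This is a minimal sanity check of our own, not part of the repo; the module names below are the import names of the three projects above:

# Sanity check (a minimal sketch, not part of the Bamba repo): confirm the
# source-built Mamba dependencies import cleanly against this PyTorch build.
import torch
import causal_conv1d  # built from Dao-AILab/causal-conv1d
import mamba_ssm      # built from state-spaces/mamba
import flash_attn     # built from Dao-AILab/flash-attention

print("torch", torch.__version__, "with CUDA", torch.version.cuda)
print("causal_conv1d, mamba_ssm, and flash_attn imported successfully")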
TODO: add model card here
We have published our model checkpoints here: TODO: add mamba HF page once public
You can use our newly contributed HF integration to run inference on our Bamba models:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the Bamba checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")
tokenizer = AutoTokenizer.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")

message = ["TODO: find a prompt here"]
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)

# Sample up to 100 new tokens with top-k/top-p sampling
response = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
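If you have a GPU available, a common variant is to load the model in half precision and move the inputs onto the device before generating. This is a minimal sketch under the assumption that a CUDA device is available and bfloat16 is appropriate for your hardware, not an official recipe:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPU variant of the example above (an assumption-laden sketch):
# load the weights in bfloat16 and run generation on a CUDA device.
model = AutoModelForCausalLM.from_pretrained(
    "ibm-fms/Avengers-Mamba2-9B-hf", torch_dtype=torch.bfloat16
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")

inputs = tokenizer(["TODO: find a prompt here"], return_tensors='pt', return_token_type_ids=False).to("cuda")
response = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])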
We trained our Bamba model with FSDP using our training repo here. Note that this training effort started before FSDP2, and also long before we contributed Mamba2-Hybrid to HF, so we did FSDP1 training with the official Mamba implementation. Users trying to reproduce the training now have many more options with our newly contributed HF version of Mamba2-Hybrid (TODO: add link once live).
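To illustrate what FSDP-style training looks like with the HF checkpoint, here is a minimal sketch that wraps the model with PyTorch's FSDP1 API. It assumes the checkpoint name above, a multi-GPU launch via torchrun, and a bfloat16-capable device; it is a hypothetical illustration, not our actual training configuration:

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

# Hypothetical FSDP1 wrapping sketch (not the Bamba training recipe).
# Launch with: torchrun --nproc_per_node=<num_gpus> train_sketch.py
dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = AutoModelForCausalLM.from_pretrained(
    "ibm-fms/Avengers-Mamba2-9B-hf", torch_dtype=torch.bfloat16
)
# Shard parameters, gradients, and optimizer state across ranks
model = FSDP(model, device_id=local_rank)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# ... training loop: forward pass, loss.backward(), optimizer.step() ...

dist.destroy_process_group()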