🤗 Bamba on Hugging Face
Bamba is a repository for training and using Bamba models, which are built on the Mamba2 architecture.
Besides PyTorch, Mamba models need a few extra dependencies. We found some of these dependencies to be picky about PyTorch versions when installed via pip, so if you hit dependency issues in your environment, the most reliable approach is to build all of the Mamba dependencies from source:
git clone https://github.com/Dao-AILab/causal-conv1d.git
cd causal-conv1d && pip install . && cd ..
git clone https://github.com/state-spaces/mamba.git
cd mamba && pip install . && cd ..
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention && pip install . && cd ..
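After building, a quick way to confirm that the compiled extensions work against your installed PyTorch is simply to import them. This is a minimal sanity check of our own, not part of the repo; the module names below are the import names of the three projects above:

# Sanity check (a minimal sketch, not part of the Bamba repo): confirm the
# source-built Mamba dependencies import cleanly against this PyTorch build.
import torch
import causal_conv1d  # built from Dao-AILab/causal-conv1d
import mamba_ssm      # built from state-spaces/mamba
import flash_attn     # built from Dao-AILab/flash-attention

print("torch", torch.__version__, "with CUDA", torch.version.cuda)
print("causal_conv1d, mamba_ssm, and flash_attn imported successfully")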
TODO: add model card here
We have published our model checkpoints here: TODO: add mamba HF page once public
You can use our newly contributed HF integration to run inference on our Bamba models:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the Bamba checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")
tokenizer = AutoTokenizer.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")

message = ["TODO: find a prompt here"]
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)

# Sample up to 100 new tokens with top-k/top-p sampling
response = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
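If you have a GPU available, a common variant is to load the model in half precision and move the inputs onto the device before generating. This is a minimal sketch under the assumption that a CUDA device is available and bfloat16 is appropriate for your hardware, not an official recipe:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# GPU variant of the example above (an assumption-laden sketch):
# load the weights in bfloat16 and run generation on a CUDA device.
model = AutoModelForCausalLM.from_pretrained(
    "ibm-fms/Avengers-Mamba2-9B-hf", torch_dtype=torch.bfloat16
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("ibm-fms/Avengers-Mamba2-9B-hf")

inputs = tokenizer(["TODO: find a prompt here"], return_tensors='pt', return_token_type_ids=False).to("cuda")
response = model.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])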
We trained our Bamba model with FSDP using our training repo here. Note that this training effort started before FSDP2, and also long before we contributed Mamba2-Hybrid to HF, so we did FSDP1 training with the official Mamba implementation. Users trying to reproduce the training now have many more options with our newly contributed HF version of Mamba2-Hybrid (TODO: add link once live).
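To illustrate what FSDP-style training looks like with the HF checkpoint, here is a minimal sketch that wraps the model with PyTorch's FSDP1 API. It assumes the checkpoint name above, a multi-GPU launch via torchrun, and a bfloat16-capable device; it is a hypothetical illustration, not our actual training configuration:

import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from transformers import AutoModelForCausalLM

# Hypothetical FSDP1 wrapping sketch (not the Bamba training recipe).
# Launch with: torchrun --nproc_per_node=<num_gpus> train_sketch.py
dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = AutoModelForCausalLM.from_pretrained(
    "ibm-fms/Avengers-Mamba2-9B-hf", torch_dtype=torch.bfloat16
)
# Shard parameters, gradients, and optimizer state across ranks
model = FSDP(model, device_id=local_rank)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# ... training loop: forward pass, loss.backward(), optimizer.step() ...

dist.destroy_process_group()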