PyTorch implementation of Infini-Transformer from "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" (https://arxiv.org/abs/2404.07143). Python, updated May 4, 2024.
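For orientation, here is a minimal sketch of the mechanism the repository above implements: per-segment causal softmax attention blended, through a learned sigmoid gate, with retrieval from a compressive linear-attention memory that is carried across segments. This follows the paper's simpler linear memory update; the function name, single-head/no-batch shapes, and the tiny demo at the end are illustrative assumptions, not the repository's API.

```python
import torch
import torch.nn.functional as F

def elu_plus_one(x):
    # Kernel feature map sigma(x) = ELU(x) + 1 from the linear-attention literature.
    return F.elu(x) + 1.0

def infini_attention_segment(q, k, v, memory, z, beta):
    """One Infini-attention segment (single head, no batch dim, for clarity).

    q, k: (seg_len, d_k); v: (seg_len, d_v)
    memory: (d_k, d_v) compressive memory carried across segments
    z: (d_k,) normalization accumulator; beta: scalar mixing gate
    """
    sq, sk = elu_plus_one(q), elu_plus_one(k)

    # Retrieval from memory: A_mem = sigma(Q) M / (sigma(Q) z).
    a_mem = (sq @ memory) / (sq @ z).clamp(min=1e-6).unsqueeze(-1)

    # Ordinary causal softmax attention within the current segment.
    scores = (q @ k.t()) / q.shape[-1] ** 0.5
    causal = torch.triu(torch.ones_like(scores, dtype=torch.bool), diagonal=1)
    a_local = scores.masked_fill(causal, float("-inf")).softmax(dim=-1) @ v

    # Learned sigmoid gate blends long-term memory with local context.
    g = torch.sigmoid(beta)
    out = g * a_mem + (1.0 - g) * a_local

    # Linear memory update: M <- M + sigma(K)^T V, z <- z + sum_t sigma(K_t).
    return out, memory + sk.t() @ v, z + sk.sum(dim=0)

# Tiny demo over two segments of a stream; d_k == d_v here, so q, k, v share a shape.
memory, z, beta = torch.zeros(16, 16), torch.zeros(16), torch.tensor(0.0)
for _ in range(2):
    q, k, v = torch.randn(3, 8, 16).unbind(0)
    out, memory, z = infini_attention_segment(q, k, v, memory, z, beta)
print(out.shape)  # torch.Size([8, 16])
```

Because the memory update is linear in the keys and values, its cost per segment is constant, which is what lets the context grow without the quadratic blow-up of full attention.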
One-stop solutions for Mixture of Experts (MoE) and Mixture of Depths (MoD) modules in PyTorch.
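As background on the MoE half of that description, a minimal top-k gated Mixture-of-Experts feed-forward layer: a router scores each token over the experts, the top-k experts process it, and their outputs are combined with renormalized gate weights. The class name, expert shapes, and hyperparameters below are illustrative assumptions, not the repository's code.

```python
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    """Minimal top-k gated Mixture-of-Experts feed-forward layer (sketch)."""

    def __init__(self, d_model, d_ff, num_experts=4, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                            # x: (batch, seq, d_model)
        flat = x.reshape(-1, x.shape[-1])            # route per token
        gates = self.router(flat).softmax(dim=-1)    # (tokens, num_experts)
        weights, idx = gates.topk(self.k, dim=-1)    # each token's k experts
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
        out = torch.zeros_like(flat)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(flat[token_ids])
        return out.reshape_as(x)

layer = TopKMoE(d_model=64, d_ff=256)
print(layer(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```

Production MoE layers add a load-balancing auxiliary loss and capacity limits per expert; those are omitted here to keep the routing idea visible.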
Open-source Mixture of Depths code, and the official implementation of the paper "Router-Tuning: A Simple and Effective Approach for Enabling Dynamic Depth in Transformers."
Unofficial implementation of Google DeepMind's Mixture of Depths.
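The two entries above center on token-level Mixture of Depths routing: a per-block router scores every token, only the top capacity fraction passes through the block, and the remaining tokens skip it on the residual stream. A hedged sketch of that routing follows; the wrapper name, the 12.5% default capacity, and the sigmoid scaling of the block output are illustrative choices (the scaling keeps the routing decision differentiable), not any of these repositories' code. Note the inner block here is a plain MLP, so token order inside the selected subset does not matter.

```python
import torch
import torch.nn as nn

class MoDBlock(nn.Module):
    """Mixture-of-Depths wrapper: only the top-capacity fraction of tokens
    passes through the inner block; the rest take the residual path (sketch)."""

    def __init__(self, block, d_model, capacity=0.125):
        super().__init__()
        self.block = block                     # any (tokens, d_model) -> (tokens, d_model) module
        self.router = nn.Linear(d_model, 1)
        self.capacity = capacity

    def forward(self, x):                      # x: (batch, seq, d_model)
        b, s, d = x.shape
        scores = self.router(x).squeeze(-1)    # (batch, seq) routing logits
        k = max(1, int(s * self.capacity))
        top_scores, top_idx = scores.topk(k, dim=-1)   # per-sequence top-k tokens

        gather_idx = top_idx.unsqueeze(-1).expand(-1, -1, d)
        picked = torch.gather(x, 1, gather_idx)
        # Scale the block output by the router score so routing stays differentiable.
        processed = self.block(picked) * torch.sigmoid(top_scores).unsqueeze(-1)

        out = x.clone()                        # skipped tokens ride the residual stream
        out.scatter_add_(1, gather_idx, processed)     # routed tokens: x + g(r) * f(x)
        return out

# Hypothetical usage with a simple MLP standing in for a transformer block.
mlp = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))
layer = MoDBlock(mlp, d_model=64, capacity=0.25)
print(layer(torch.randn(2, 32, 64)).shape)  # torch.Size([2, 32, 64])
```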