Implementation of Vision Transformers (ViT) with a token merging mechanism

Ctrl408/ViT-implementations

Vision Transformer with Token Merging using Bipartite Soft Merging

This repository contains an implementation of Vision Transformers (ViT) with a token merging mechanism based on Bipartite Soft Merging (called bipartite soft matching in the paper), from https://arxiv.org/abs/2210.09461. The goal is to increase ViT throughput by adaptively merging similar tokens, so that later layers process fewer tokens. Training code is included.

Introduction

Features

  • Vision Transformer (ViT) Implementation: Based on the original ViT architecture.
  • Bipartite Soft Merging: Merges similar tokens to reduce the computational load.
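
To make the merging step concrete, here is a minimal, dependency-free sketch of bipartite matching on a single sequence of tokens. It is an illustration only, not this repository's code: the function names `cosine` and `bipartite_soft_merge` are hypothetical, tokens are plain Python lists rather than tensors, similarity is computed on the raw token vectors as a simplifying assumption, and details such as protecting the class token or tracking token sizes across layers are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bipartite_soft_merge(tokens, r):
    """Merge up to r tokens and return the reduced list of tokens.

    Hypothetical helper for illustration; assumes every token has the
    same dimension.
    """
    # 1. Split the tokens into two sets by alternating positions.
    a, b = tokens[0::2], tokens[1::2]
    r = min(r, len(a))
    if r <= 0 or not b:
        return [list(t) for t in tokens]

    # 2. For each token in A, find its most similar token in B.
    edges = []
    for i, t in enumerate(a):
        sims = [cosine(t, u) for u in b]
        j = max(range(len(b)), key=sims.__getitem__)
        edges.append((sims[j], i, j))

    # 3. Keep only the r most similar edges.
    edges.sort(key=lambda e: e[0], reverse=True)
    merged = {i: j for _, i, j in edges[:r]}

    # 4. Average each merged A token into its matched B token.
    sums = [list(u) for u in b]
    counts = [1] * len(b)
    for i, j in merged.items():
        sums[j] = [s + x for s, x in zip(sums[j], a[i])]
        counts[j] += 1

    # 5. Output the unmerged A tokens followed by the (updated) B tokens.
    kept_a = [list(t) for i, t in enumerate(a) if i not in merged]
    new_b = [[s / c for s in row] for row, c in zip(sums, counts)]
    return kept_a + new_b


tokens = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
reduced = bipartite_soft_merge(tokens, r=1)
print(len(tokens), "->", len(reduced))
```

Each call removes at most `r` tokens, and never more than half of them, because only tokens in set A can be merged away. Note that the output order differs from the input order (unmerged A tokens come first), so a real model would need to handle positions accordingly.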

Installation

To get started, clone the repository and install the necessary dependencies:

git clone https://github.com/Ctrl408/ViT-implementations.git
cd ViT-implementations