Vision Transformer with Token Merging using Bipartite Soft Merging

This repository contains an implementation of the Vision Transformer (ViT) with a token merging mechanism, Bipartite Soft Merging, based on the paper "Token Merging: Your ViT But Faster" (https://arxiv.org/abs/2210.09461), where the technique is called bipartite soft matching. The goal is to increase ViT throughput by merging similar tokens as they pass through the network, so that later layers process fewer tokens. Training code is included.

Introduction

Features

Vision Transformer (ViT) Implementation: Based on the original ViT architecture.
Bipartite Soft Merging: Merges similar tokens to reduce the computational load.
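
The merging step can be illustrated with a minimal sketch. This is not the code from this repository: the function name `bipartite_soft_merge` is hypothetical, NumPy is used only to keep the example framework-agnostic, and the simplifications (cosine similarity on the token features themselves, an alternating split into two sets, plain averaging of merged tokens) are assumptions made for illustration. The idea is to split the tokens into sets A and B, link each token in A to its most similar token in B, and merge the `r` most similar pairs.

```python
import numpy as np


def bipartite_soft_merge(tokens: np.ndarray, r: int) -> np.ndarray:
    """Merge r tokens of an (n, d) array and return an (n - r, d) array.

    Hypothetical helper for illustration only; not this repository's API.
    """
    # Alternate tokens between the two sets of the bipartite graph.
    a, b = tokens[0::2], tokens[1::2]
    r = min(r, a.shape[0])  # cannot merge more tokens than set A holds
    if r <= 0 or b.shape[0] == 0:
        return tokens.copy()

    # Cosine similarity between every token in A and every token in B.
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    sim = a_n @ b_n.T  # shape (|A|, |B|)

    # Each A token proposes one edge to its most similar B token.
    best_b = sim.argmax(axis=1)
    best_sim = sim.max(axis=1)

    # Keep the r strongest edges; the remaining A tokens stay unmerged.
    order = np.argsort(-best_sim)
    merged_idx = order[:r]
    kept_idx = np.sort(order[r:])

    # Average each B token with the A tokens merged into it.
    sums = b.astype(float)  # astype returns a copy, so b is not modified
    counts = np.ones(b.shape[0])
    np.add.at(sums, best_b[merged_idx], a[merged_idx])
    np.add.at(counts, best_b[merged_idx], 1)
    b_out = sums / counts[:, None]

    # Unmerged A tokens first, then the (possibly merged) B tokens.
    return np.concatenate([a[kept_idx], b_out], axis=0)


tokens = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
out = bipartite_soft_merge(tokens, r=1)
print(out.shape)  # (3, 2)
```

In the example, the first two tokens are identical, so they form the strongest edge and are merged into one, leaving three tokens. Applying such a step in every transformer block removes `r` tokens per block, which is where the throughput gain comes from.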

Installation

To get started, clone the repository and install the necessary dependencies:

git clone https://github.com/Ctrl408/ViT-implementations.git
cd ViT-implementations
