hubert

This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and t…

music music-information-retrieval beat-tracking self-supervised singing-voice hubert linear-transformer wavlm

Updated Sep 4, 2022
Python

backspacetg / distilAlhubert

code for our paper DistilALHuBERT: A Distilled Parameter Sharing Audio Representation Model

asr distillation hubert

Updated Mar 15, 2023
Python

yaya-sy / speechscorer

unsupervised spoken utterances scoring

speech speech-recognition whisper self-supervised-learning speech-translation hubert

Updated Nov 21, 2023
Python

sadPororo / UniPool-SV

Universal Pooling Method for Speaker Verification Utilizing Pre-trained Multi-layer Features, 2025 preprint

pretrained-models speaker-recognition speaker-verification hubert wav2vec2 wavlm

Updated Sep 19, 2024
Python

TerboucheHacene / speech-keyword-spotting

Speech Keyword detection using Wav2Vec Model

transformers pytorch audio-classification keyword-spotting audio-processing fine-tuning onnx pytorch-lightning hubert wav2vec2

Updated Nov 23, 2022
Python

aitor-alvarez / acoustic-transformer-models

Acoustic Transformer Models for Audio Classification

classification acoustic transformer-models pytorch-lightning hubert wav2vec2 wavlm

Updated Feb 15, 2025
Python

GiovaneIwamoto / voice-cloning-bark-hubert

🐶 Voice Cloning Bark HuBERT - Enables voice cloning from personalized audio samples by processing model's outputs into semantic tokens compatible with text-to-audio system.

tts bark voice-cloning hubert

Updated Oct 22, 2024
Python

akash13s / audio-to-image

Pipeline for generating images conditioned on input audio

pytorch u-net diffusion-models hubert wav2vec2

Updated Jul 25, 2024
Python

anilkeshwani / speech-text-alignment

Functionality for speech data processing including time alignment, encoding with speech encoders (tokenizers) and data preprocessing of common datasets

speech speech-recognition data-pipeline asr hubert uroman

Updated Jan 28, 2025
Python

omkar-nitsure / Accent-Adaptation-Codebooks

This repository contains different approaches I tried for improving ASR systems for accented English speech. All of them use the HuBERT model as a baseline.

transformer attention asr-model codebook-approach hubert

Updated Dec 6, 2024
Python

Here are 14 public repositories matching this topic...

voicepaw / so-vits-svc-fork

s3prl / s3prl

lstrgar / self-supervised-phone-segmentation

ECNU-Cross-Innovation-Lab / ShiftSER

mjhydri / Singing-Vocal-Beat-Tracking

backspacetg / distilAlhubert

yaya-sy / speechscorer

sadPororo / UniPool-SV

TerboucheHacene / speech-keyword-spotting

aitor-alvarez / acoustic-transformer-models

GiovaneIwamoto / voice-cloning-bark-hubert

akash13s / audio-to-image

anilkeshwani / speech-text-alignment

omkar-nitsure / Accent-Adaptation-Codebooks
