FreeScribe is a modern, open-source transcription and translation web application that runs on-device machine learning models entirely in your browser using Web Workers. Users can record or upload audio, transcribe speech to text, translate between languages, and export the results, all with privacy and speed, and without sending data to any backend server.
- Live Demo: https://free-scribe-arnob.vercel.app/
- Project Summary
- Features
- Technology Stack
- Project Structure
- How It Works
- Getting Started
- Usage Walkthrough
- Teaching Content & Examples
- Keywords
- Conclusion
- License
- Audio Input: Record live or upload MP3/WAV files for transcription.
- Transcription: Converts speech to text using ML models (OpenAI Whisper).
- Translation: Translate transcribed text into multiple languages.
- Runs Locally: All ML inference runs in-browser via Web Workers for privacy and speed.
- Export: Download or copy the resulting text.
- Modern UI: Built with React, Vite, and TailwindCSS.
- No Cost: 100% free and open-source.
- Frontend: React 18, Vite, TailwindCSS
- Web Worker ML: @xenova/transformers
- Transcription Model: OpenAI Whisper (via transformers.js)
- Other: ESLint, PostCSS, modern ES2020+ JavaScript
/
├── public/
│   └── vite.svg                  # App icon
├── src/
│   ├── components/
│   │   ├── Header.jsx            # Top navigation and branding
│   │   ├── Footer.jsx            # Footer
│   │   ├── HomePage.jsx          # Landing/upload UI
│   │   ├── FileDisplay.jsx       # Audio file display and controls
│   │   ├── Information.jsx       # Output display
│   │   └── Transcribing.jsx      # Loading/transcribing UI
│   ├── utils/
│   │   ├── presets.js            # Worker message types, language codes, model names
│   │   └── whisper.worker.js     # Main ML Web Worker logic
│   ├── App.jsx                   # Main application logic
│   ├── main.jsx                  # Entry point
│   └── index.css                 # Tailwind and custom styles
├── index.html                    # HTML template
├── package.json                  # Dependencies & scripts
└── ... (config files)
- The app delegates heavy ML inference to a Web Worker (whisper.worker.js). This keeps the main thread free, so the UI stays responsive.
- The worker receives audio data, loads the ML model (Whisper), and performs transcription/translation asynchronously.
- Communication uses structured messages (see presets.js for message types).
- Transcription uses the OpenAI Whisper model, via @xenova/transformers, running entirely in-browser (no server needed).
- Translation is performed using Whisper's multilingual capabilities and language codes defined in presets.js.
- Model progress and results are streamed back to the main app for display.
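The worker side of this flow can be sketched roughly as follows. This is a minimal illustration, not the project's actual whisper.worker.js: only INFERENCE_REQUEST is named in this README, so the other message names (LOADING, RESULT, ERROR) and the createInferenceHandler helper are hypothetical. The model loader is injected so that the logic can be exercised outside a browser.

```javascript
// Hypothetical message names; the project's real ones live in src/utils/presets.js.
const MessageTypes = {
  INFERENCE_REQUEST: 'INFERENCE_REQUEST', // named in this README
  LOADING: 'LOADING',                     // assumed for this sketch
  RESULT: 'RESULT',                       // assumed for this sketch
  ERROR: 'ERROR',                         // assumed for this sketch
};

// Builds a message handler. `loadTranscriber` and `post` are injected so the
// same logic can run inside a Web Worker or in a plain script.
function createInferenceHandler({ loadTranscriber, post }) {
  let transcriberPromise = null; // cached so the model is loaded only once

  return async function handleMessage({ type, audio, model_name }) {
    if (type !== MessageTypes.INFERENCE_REQUEST) return;

    // Posted synchronously, before any awaiting, so the UI can show a spinner.
    post({ type: MessageTypes.LOADING });
    try {
      if (!transcriberPromise) transcriberPromise = loadTranscriber(model_name);
      const transcriber = await transcriberPromise;
      const output = await transcriber(audio);
      post({ type: MessageTypes.RESULT, text: output.text });
    } catch (err) {
      transcriberPromise = null; // drop the cache so the next request starts fresh
      post({ type: MessageTypes.ERROR, message: String(err) });
    }
  };
}

// Inside a module worker, the wiring might look like this (not executed here):
//   import { pipeline } from '@xenova/transformers';
//   const handle = createInferenceHandler({
//     loadTranscriber: (name) => pipeline('automatic-speech-recognition', name),
//     post: (msg) => self.postMessage(msg),
//   });
//   self.addEventListener('message', (event) => handle(event.data));
```

Caching the loader's promise rather than the loaded model means two requests arriving in quick succession share one download instead of starting two.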
- Clone the repo:
  git clone https://github.com/arnobt78/FreeScribe-Transcription-Translation-ML-App--ReactVite.git
  cd FreeScribe-Transcription-Translation-ML-App--ReactVite
- Install Node.js: download and install it from nodejs.org.
- Install dependencies:
  npm install
- Install Transformers.js:
  npm i @xenova/transformers
- Start the development server:
  npm run dev
- Open http://localhost:5173/ in your browser.
- Home Screen: Choose whether to record audio or upload an MP3/WAV file.
- Audio Processing: Once uploaded or recorded, the file is displayed. Click "Transcribe" to start.
- ML Inference: The app loads the Whisper model in a Web Worker and processes your audio.
- View & Translate: The transcribed text appears. Use the translation options to convert it into another language.
- Export or Copy: Download the text as a file or copy it to your clipboard.
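The final export step could be implemented along these lines. This is a sketch under assumptions, not the project's code: the function names toFileName, downloadText, and copyText are hypothetical, and the two browser-only helpers rely on the standard Blob, URL.createObjectURL, and Clipboard APIs.

```javascript
// Turns a title into a safe file name; falls back to a default when empty.
function toFileName(title) {
  const slug = String(title || '')
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything unsafe into a dash
    .replace(/^-+|-+$/g, '');    // trim leading/trailing dashes
  return (slug || 'freescribe') + '.txt';
}

// Browser-only: saves `text` as a .txt download through a temporary link.
function downloadText(text, title) {
  const blob = new Blob([text], { type: 'text/plain' });
  const url = URL.createObjectURL(blob);
  const link = document.createElement('a');
  link.href = url;
  link.download = toFileName(title);
  document.body.appendChild(link);
  link.click();
  link.remove();
  URL.revokeObjectURL(url); // release the object URL once the click is dispatched
}

// Browser-only: copies `text` with the async Clipboard API.
// Returns a promise; it only works in secure contexts (HTTPS or localhost).
function copyText(text) {
  return navigator.clipboard.writeText(text);
}
```

Both helpers work on text that is already in memory, so exporting stays on the device like the rest of the pipeline.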
To add a new translation language, extend the LANGUAGES object in src/utils/presets.js:
export const LANGUAGES = {
  // ...existing entries stay as they are
  "Spanish": "spa_Latn",
  // Add more as needed
};
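To show how such a table might be consumed, here is a small sketch. The three-entry LANGUAGES object is an illustrative subset rather than the project's real list (only the Spanish entry appears in this README; the English and French codes are assumed to follow the same pattern), and languageCode and languageOptions are hypothetical helpers.

```javascript
// Illustrative subset only; the real table lives in src/utils/presets.js.
const LANGUAGES = {
  English: 'eng_Latn', // assumed entry
  French: 'fra_Latn',  // assumed entry
  Spanish: 'spa_Latn', // shown in this README
};

// Returns the code for a display name, or throws so typos surface early.
function languageCode(name) {
  if (!Object.prototype.hasOwnProperty.call(LANGUAGES, name)) {
    throw new Error('Unknown language: ' + name);
  }
  return LANGUAGES[name];
}

// Builds { label, value } pairs, sorted by label, e.g. for a <select> element.
function languageOptions() {
  return Object.entries(LANGUAGES)
    .map(([label, value]) => ({ label, value }))
    .sort((a, b) => a.label.localeCompare(b.label));
}
```

Looking names up through one helper means a newly added language becomes available to the UI without touching any other file.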
The worker is initialized in App.jsx:
// `worker` is assumed to be a React ref (useRef), so the instance survives re-renders
worker.current = new Worker(new URL('./utils/whisper.worker.js', import.meta.url), { type: 'module' });
// Send the audio and the model name to the worker
worker.current.postMessage({
  type: MessageTypes.INFERENCE_REQUEST,
  audio,
  model_name: 'openai/whisper-tiny.en'
});
The worker receives audio, runs the model, and sends back results via postMessage.
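On the receiving side, the main thread has to turn those messages into UI state. One way to do that is a pure reducer, sketched below. The message names (DOWNLOADING, LOADING, RESULT, ERROR), the state shape, and reduceWorkerMessage itself are assumptions made for illustration; the project's actual message types are defined in presets.js.

```javascript
// Starting state before any worker message has arrived.
const initialState = { status: 'idle', progress: 0, text: '', error: null };

// Pure function: takes the previous state and one worker message,
// returns the next state without mutating the previous one.
function reduceWorkerMessage(state, message) {
  switch (message.type) {
    case 'DOWNLOADING': // model files are being fetched
      return {
        ...state,
        status: 'downloading',
        progress: typeof message.progress === 'number' ? message.progress : state.progress,
      };
    case 'LOADING': // model is being initialized
      return { ...state, status: 'loading' };
    case 'RESULT': // transcription finished
      return { ...state, status: 'done', text: message.text, error: null };
    case 'ERROR':
      return { ...state, status: 'error', error: message.message };
    default: // ignore message types this sketch does not know about
      return state;
  }
}

// In a React component, the wiring might look like this (not executed here):
//   worker.current.addEventListener('message', (event) =>
//     setState((prev) => reduceWorkerMessage(prev, event.data)));
```

Keeping the reducer free of React and Worker references makes it easy to check with plain message arrays.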
- Transcription
- Translation
- Machine Learning
- React
- Vite
- TailwindCSS
- Web Worker
- OpenAI Whisper
- Speech Recognition
- @xenova/transformers
- In-browser ML
- Audio Processing
FreeScribe streamlines advanced speech-to-text and language translation directly in your browser, for free. Powered by modern frontend tools and open-source ML models, it's a practical, privacy-respecting alternative to expensive SaaS solutions.
MIT License. © 2030 arnobt78
Feel free to use this repository and extend the project further!
If you have any questions or want to share your work, reach out via GitHub or my portfolio at https://arnob-mahmud.vercel.app/.
Enjoy building and learning!
Thank you!