A terminal-based voice memo application built with Go and Bubble Tea.
Current Version: v1.0.8 - Latest release with real-time waveform visualization and audio processing features.
The main interface showing the memo list, ASCII art speaker visualization, and help information.
Audio configuration interface displaying hardware/audio settings, available devices, and help.
- Record audio using PortAudio with real-time waveform visualization
- Playback with real-time controls and waveform display
- WAV file format support with automatic post-processing
- Configurable audio devices and settings
- Test tone generation (440Hz sine wave)
- Real-time clipping detection with visual warnings
- Automatic silence trimming and audio normalization
- List view with navigation
- Rename memos
- Add tags for organization
- Delete memos
- Export memos to Downloads folder
- Optional transcription with multiple provider support
- Terminal user interface using Bubble Tea
- Keyboard navigation
- Settings screen for audio configuration and processing options
- Help screen with keybindings
- ASCII art speaker visualization with two-tone coloring
- Professional color scheme with rounded borders
- Adaptive layout with real-time audio visualizer
- Real-time peak level meters and VU meters during recording
Download the latest release from GitHub Releases:
- Windows (amd64): voicelog-v1.0.8-windows-amd64.zip
- Linux (amd64): voicelog-v1.0.8-linux-amd64.tar.gz
Windows:
- Download voicelog-v1.0.8-windows-amd64.zip
- Extract the archive
- Run voicelog-windows-amd64.exe
Linux:
- Download voicelog-v1.0.8-linux-amd64.tar.gz
- Extract: tar -xzf voicelog-v1.0.8-linux-amd64.tar.gz
- Install PortAudio: sudo apt-get install libportaudio2
- Run: ./voicelog-linux-amd64
- Go 1.25 or later
- PortAudio development libraries
# Windows (MSYS2)
pacman -S mingw-w64-x86_64-portaudio
# Linux (Debian/Ubuntu)
sudo apt-get install libportaudio2 portaudio19-dev
# Clone the repository
git clone https://github.com/Cod-e-Codes/voicelog.git
cd voicelog
# Download dependencies
go mod download
# Build the binary
go build -o voicelog main.go
# Run
./voicelog
Key | Action |
---|---|
SPACE | Start/Stop recording |
ENTER | Play/Pause selected memo |
↑/↓ | Navigate memo list |
ctrl+r | Rename memo |
ctrl+g | Add tag |
ctrl+d | Delete memo |
ctrl+e | Export memo |
ctrl+x | Stop playback |
? | Show help |
ctrl+s | Settings |
ctrl+t | Transcribe selected memo |
F5 | Generate test file |
ESC/q | Quit |
- Recording: Press SPACE to start/stop recording
- Playback: Select a memo and press ENTER to play
- Transcription: Press ctrl+t to transcribe the selected memo (optional)
- Settings: Press ctrl+s to configure audio devices and transcription
- Test File: Press F5 to generate a 5-second 440Hz test tone
- Export: Press ctrl+e to export the selected memo to the Downloads folder
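As an illustration of what a 5-second 440Hz test tone amounts to, here is a minimal sketch that generates the samples in Go. It is not VoiceLog's actual implementation: the function name `sineTone`, the 44100 Hz sample rate, the mono 16-bit format, and the half-scale amplitude are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
)

// sineTone returns mono 16-bit PCM samples for a sine wave.
// freq and sampleRate are in Hz, duration is in seconds, and
// amplitude is a fraction of full scale in [0, 1].
// Hypothetical helper, not part of VoiceLog's code.
func sineTone(freq, duration float64, sampleRate int, amplitude float64) []int16 {
	n := int(duration * float64(sampleRate))
	samples := make([]int16, n)
	for i := range samples {
		t := float64(i) / float64(sampleRate) // time of sample i in seconds
		samples[i] = int16(amplitude * math.MaxInt16 * math.Sin(2*math.Pi*freq*t))
	}
	return samples
}

func main() {
	// Assumed parameters: 44100 Hz sample rate, half of full scale.
	tone := sineTone(440, 5, 44100, 0.5)
	fmt.Println("samples generated:", len(tone))
}
```

Keeping the amplitude below full scale leaves headroom, so the generated file should not trip a clipping warning when played back.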
VoiceLog includes advanced audio processing capabilities:
- Waveform Display: Live waveform visualization during recording and playback
- Peak Level Meters: Monitor input levels with color-coded peak indicators (during recording)
- VU Meters: Left/right channel level monitoring (during recording)
- Clipping Detection: Visual warnings when audio levels exceed thresholds (during recording)
- Silence Trimming: Automatically removes silence from beginning and end of recordings
- Audio Normalization: Amplifies recordings to optimal levels (configurable target)
- Configurable Thresholds: Adjust silence detection and clipping thresholds in settings
- Smart Layout: Interface adapts during recording/playback to show visualizer
- Compact Mode: Memo list becomes compact when audio visualizer is active
- Real-Time Updates: Waveform and meters update in real-time during operation
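The post-processing and metering steps above can be sketched in a few lines of Go. This is only an illustration under stated assumptions: samples are floats in [-1, 1], silence and clipping are judged by per-sample magnitude, and normalization is simple peak normalization. VoiceLog's real algorithm, thresholds, and function names may differ; `peak`, `isClipping`, `trimSilence`, and `normalize` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

// peak returns the largest absolute sample value (the reading a peak meter shows).
func peak(samples []float64) float64 {
	p := 0.0
	for _, s := range samples {
		p = math.Max(p, math.Abs(s))
	}
	return p
}

// isClipping reports whether the peak reaches the clipping threshold.
func isClipping(samples []float64, threshold float64) bool {
	return peak(samples) >= threshold
}

// trimSilence drops leading and trailing samples whose magnitude is below threshold.
func trimSilence(samples []float64, threshold float64) []float64 {
	start, end := 0, len(samples)
	for start < end && math.Abs(samples[start]) < threshold {
		start++
	}
	for end > start && math.Abs(samples[end-1]) < threshold {
		end--
	}
	return samples[start:end]
}

// normalize scales samples so that the peak equals target.
// Silent input is returned unchanged to avoid dividing by zero.
func normalize(samples []float64, target float64) []float64 {
	p := peak(samples)
	if p == 0 {
		return samples
	}
	gain := target / p
	out := make([]float64, len(samples))
	for i, s := range samples {
		out[i] = s * gain
	}
	return out
}

func main() {
	// Assumed thresholds: 0.01 for silence, 0.99 for clipping, 0.9 as the target peak.
	raw := []float64{0.0, 0.001, 0.2, -0.4, 0.1, 0.002, 0.0}
	trimmed := trimSilence(raw, 0.01)
	normalized := normalize(trimmed, 0.9)
	fmt.Println("kept samples:", len(trimmed))
	fmt.Println("clipping:", isClipping(normalized, 0.99))
}
```

Trimming before normalizing means the gain is computed only from the audio that is kept, which matters when the discarded edges contain a stray loud click.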
VoiceLog supports voice-to-text transcription through a flexible plugin system. Transcription is optional, and the application works fully without it.
- whisper.cpp (Recommended - Local & Private)
  - High accuracy, supports many languages
  - Runs entirely offline - no internet required
  - Complete privacy - audio never leaves your machine
  - Installation: github.com/ggerganov/whisper.cpp
- OpenAI Whisper API (Cloud-based - Highest Accuracy)
  - Highest accuracy available
  - Requires internet connection and API key
  - Install: pip install openai
  - Set the OPENAI_API_KEY environment variable
- Vosk (Lightweight & Fast)
  - Smaller models, faster processing
  - Good for real-time applications
  - Installation: alphacephei.com/vosk
- Custom Python Script
  - Use any transcription API (AssemblyAI, Rev.ai, etc.)
  - Write your own integration script
  - Full flexibility for custom workflows
whisper.cpp Setup (Linux/macOS):
# Clone and build whisper.cpp
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp && make
# Download a model (base.en for English, base for multilingual)
./models/download-ggml-model.sh base.en
# The whisper binary will be auto-detected by VoiceLog
OpenAI Whisper API Setup:
# Install the OpenAI library
pip install openai
# Set your API key (get one from https://platform.openai.com)
export OPENAI_API_KEY="your-api-key-here"
- Enable in Settings: Press ctrl+s → Navigate to "Transcription:" → Toggle to ON
- Select Provider: Navigate to "Default Provider:" → Choose your installed provider
- Transcribe: Press ctrl+t on any memo to transcribe it
- Auto-Transcribe: Enable "Auto Transcribe:" to automatically transcribe new recordings
- Visual Indicators: Transcribed memos show a 📝 icon in the memo list
- Search Integration: Search through transcribed text using the built-in filter
- Provider Status: Settings show ✓/✗ status for each provider's availability
- Flexible Configuration: Each provider can be configured independently
- Auto-Detection: VoiceLog automatically detects available transcription tools
- Local Options: whisper.cpp and Vosk run entirely on your machine
- Cloud Options: OpenAI Whisper API provides highest accuracy but requires internet
- No Telemetry: VoiceLog sends no data off your machine unless you use a cloud API provider, in which case audio is sent to that provider for transcription
- Storage: Transcriptions are stored locally alongside memo metadata
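To give a sense of how provider auto-detection and the ✓/✗ status could work, here is a rough sketch in Go. It assumes a local provider counts as available when its executable is on PATH, and a cloud provider when its API key is set. The binary names `whisper-cli` and `whisper` are guesses, and `binaryOnPath` and `detectProviders` are hypothetical names; VoiceLog's actual detection logic may check different things.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// providerStatus pairs a provider name with whether it looks usable.
type providerStatus struct {
	Name      string
	Available bool
}

// binaryOnPath reports whether any of the candidate executables is found on PATH.
func binaryOnPath(candidates ...string) bool {
	for _, c := range candidates {
		if _, err := exec.LookPath(c); err == nil {
			return true
		}
	}
	return false
}

// detectProviders checks each provider using the assumptions described above.
func detectProviders() []providerStatus {
	return []providerStatus{
		// Binary names are assumptions; a whisper.cpp build may name its executable differently.
		{Name: "whisper.cpp", Available: binaryOnPath("whisper-cli", "whisper")},
		// The cloud provider is treated as available when the API key is set.
		{Name: "OpenAI Whisper API", Available: os.Getenv("OPENAI_API_KEY") != ""},
	}
}

func main() {
	for _, p := range detectProviders() {
		mark := "✗"
		if p.Available {
			mark = "✓"
		}
		fmt.Printf("%s %s\n", mark, p.Name)
	}
}
```

Checking for a key or a binary only shows that a provider is configured, not that it works, so a real implementation would still need to handle failures at transcription time.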
Configuration is stored in ~/.voicelog/config.json and includes:
- Audio device settings
- Sample rate and format preferences
- Audio processing settings (normalization, silence trimming, clipping detection)
- Transcription settings (optional)
- Memo storage path
- Keybindings
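For readers who want to inspect or script against the configuration, the sketch below shows one way to read such a file in Go. The struct fields, JSON keys, and default values are hypothetical and chosen only to mirror the categories listed above; check your own config.json for the real schema before relying on any of them.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// Config mirrors a few of the settings listed above.
// Field names and JSON keys are hypothetical, not VoiceLog's actual schema.
type Config struct {
	InputDevice string `json:"input_device"`
	SampleRate  int    `json:"sample_rate"`
	Normalize   bool   `json:"normalize"`
	TrimSilence bool   `json:"trim_silence"`
	MemoPath    string `json:"memo_path"`
}

// defaultConfig returns assumed defaults rooted at the given home directory.
func defaultConfig(home string) Config {
	return Config{
		SampleRate:  44100,
		Normalize:   true,
		TrimSilence: true,
		MemoPath:    filepath.Join(home, ".voicelog", "memos"),
	}
}

// parseConfig overlays JSON data on top of the defaults,
// so keys missing from the file keep their default values.
func parseConfig(data []byte, home string) (Config, error) {
	cfg := defaultConfig(home)
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	return cfg, nil
}

// loadConfig reads ~/.voicelog/config.json, falling back to defaults if it is absent.
func loadConfig() (Config, error) {
	home, err := os.UserHomeDir()
	if err != nil {
		return Config{}, err
	}
	data, err := os.ReadFile(filepath.Join(home, ".voicelog", "config.json"))
	if errors.Is(err, os.ErrNotExist) {
		return defaultConfig(home), nil
	}
	if err != nil {
		return Config{}, err
	}
	return parseConfig(data, home)
}

func main() {
	cfg, err := loadConfig()
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not load config:", err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Unmarshalling onto a pre-filled struct is a common way to keep defaults for settings the file does not mention.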
~/.voicelog/
├── config.json # Application configuration
├── transcription.json # Transcription settings (if enabled)
├── memos/ # Voice memo storage
│ ├── metadata.json # Memo metadata (includes transcriptions)
│ └── memo_*.wav # Audio files
└── voicelog.log # Application logs
Built with:
- Bubble Tea - TUI framework
- PortAudio - Audio I/O
- Go - Programming language
- WSL (Windows Subsystem for Linux): ALSA errors occur due to missing audio device access. WSL doesn't provide direct access to Windows audio devices.
- Windows Standalone: Missing libportaudio.dll when running the pre-built binary outside of the MSYS2 environment.
- Recording Issues: Audio recording may not work properly in some environments, though playback and device detection work correctly.
- For WSL: Use the Windows version instead, as WSL doesn't support direct audio device access.
- For Windows: Run from MSYS2 environment or ensure PortAudio libraries are properly installed.
- For Linux: Ensure you have proper audio device permissions and ALSA/PulseAudio configured.
This project is a work in progress and contributions are welcome! If you encounter issues or have improvements to suggest, please:
- Check existing issues on GitHub
- Create a new issue with detailed information about your environment
- Submit pull requests for bug fixes or new features
This project is licensed under the MIT License - see the LICENSE file for details.