updates.json

{
    "1.0.0": "Initial release",
    "1.0.1": "- Fixed leftover '_16khz' in filenames in transcribe tool\n- Fixed global ffmpeg path reference-changed to bundled path\n- Fixed some path issues in the training config\n- Added specific error message for when no audio files are detected for training FastPitch\n- Flushed training graphs to file a bit more often",
    "1.0.2": "- Hotfix for HiFi-GAN training bug; Other HiFiGAN tweaks\n- Better dir watching support, for UI graph refreshing\n- More/Better error logging",
    "1.0.3": "- Added training config toggle for FP16 use\n- Made base/websocket servers' ports configurable\n- Made the model export accept a choice to do without HiFi-GAN\n- Fixed bug in exporting inference",
    "1.0.4": "- Fixes for Export inference ffmpeg path\n- Lowered max cache size, to lower RAM use during training\n- Improved exporting task order priority",
    "1.0.5": "- New tool: Cut padding\n- Added dataset duplicate detection, searching, and management system\n- Added seconds to training log timestamps\n- Fixed server.py app version\n- Made num workers configurable, for stuck issues\n- Added more app.log logging\n- Training params tweaks\n- Added confirm message to app close, if training\n- Fixed dataset viewer rows broken interaction after search\n- Fixed not being able to edit voice Id for export\n- Made export checkpoints directory accept the root ckpt dir, like in training config\n- Pre-filled the export ckpts dir with existing, if the dataset is already in the training queue\n- Added caching to export output dir\n- Fixed bug where dataset rows sometimes didn't update when changing datasets\n- Made Ctrl+S move down to the next line in the dataset rows\n- Added hover tooltips to the dataset records' cells\n- Made graphs robust to disk polling fails\n- Only bring up the config menu when opening training menu from the dataset section if it's not in the queue already",
    "1.0.6": "- Added .srt split tool\n- Stopped spikes prematurely ending training\n- Adjusted the target delta values\n- Added option to only show the latest configurable few graph points\n- Added exception for .ini files in tools\n- Fixed wrong path being used as default in export\n- Better UI scaling for the training menu\n- Fixed occasional endless websocket spinner on app refresh\n- Removed milliseconds from log timings\n- Fixed record delete button referencing original dataset index, rather than search filtered rows index\n- Better reference clearing on training stop for more stable restarting\n- Made the graph update with the new delta value as soon as it's logged in the text\n- More debug logging around betabinomial tensor mismatch errors\n- Misc tweaks, optimizations",
    "1.1.0": "- Added multi-lingual support for the Auto-Transcribe tool \n- Changed [male]/[female] fine-tune checkpoint selection to checkboxes \n- Fixed WER 'Check text quality' colour results persistence in UI \n- Fixed cut padding tool not handling files with spaces in their paths (credit: @Pendrokar) \n- Added filepicker buttons for folder paths \n- Added cross-session caching of user preferences for training configs \n- Added caching for graph window viewing size setting \n- Fixed cluster tool prefix being numbers only \n- Fixed .srt tool not outputting the first split \n- Fixed .srt tool not handling files with spaces in their paths \n- Changed training numbers to not use scientific notation \n- CSS / UI / cmd log visual tweaks \n- Misc small bug fixes",
    "1.1.1": "- Hotfix for new training errors\n- Improved male/female checkpoint config caching",
    "1.1.2": "- Added [Open] button to training panel, to open the checkpoints directory\n- Fixed training config defaults being blank before caching\n- Added option to AI speaker diarization tool to output Audacity time labels\n- Exposed more errors to UI error windows rather than silent errors. More app.log logging details\n- Added error window for if there's no corpus data given to similarityTool\n- Fixed locale encoding issues with transcript file writing\n- Made xVATrainer window remember position in desktop cross-session\n- Added [copy to clipboard] button to error modals",
    "1.1.3": "- New tool: Make .srt \n- Re-worked dataset voice ID input \n- Fixed main app paging, broken after saving a record \n- Fixed hidden windows .ini files from breaking some tools \n- fixed pyannote hubconf error, via the speaker diarization tool \n- Fixed audacity format output for speaker diarization. Adjusted format \n- Added error message for when no files of the correct type were given to a tool \n- Fixed forcing stage 5 not working, when training \n- Misc tweaks/fixes ",
    "1.2.0": "- Added training support for the v3 models\n- Added support for whisper models for automatic speech-to-text transcription\n- Removed Wav2Vec2 ASR models\n- Added automatic audio formatting and audio normalization tools' effects to dataset pre-processing\n- Skip audio pre-processing if the files are all already there\n- Changed audio normalization tool to also convert stereo to mono\n- Removed main screen audio pre-processing button\n- Fixed voice exporting\n- Fixed end-of-training breaking the UI\n- Misc bug fixes",
    "1.2.1": "- Reduced maximum system RAM consumption during training \n- Fixed UI broken after clearing training queue \n- Added languages trained on in priors into the voice json \n- Stronger VRAM manual management for lower consumption during training \n- Added extra error message for missing PRIORS data \n- Added better handling of embedding clustering on lower spec systems for big datasets \n- Fixed missing dependencies not bundled into compilation from new python environment \n- Fixed clustering tool \n- Temporarily removed speaker diarization tool \n- Fixed regex replace errors "
}