A subtitle file converter that ensures proper UTF-8 encoding, with robust support for Arabic and other non-Latin scripts. SubZilla automatically detects the encoding of the input file and converts it to UTF-8, which makes it well suited to fixing subtitle encoding issues. Built with SOLID, YAGNI, KISS, and DRY principles in mind.
- Automatic encoding detection.
- Converts subtitle files to UTF-8.
- Supports multiple subtitle formats (.srt, .sub, .txt).
- Strong support for Arabic and other non-Latin scripts.
- Simple command-line interface.
- Batch processing with glob pattern support.
- Parallel processing for better performance.
- Preserves original file formatting.
- Creates backup of original files.
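To make the detect-then-convert idea concrete, here is a minimal sketch of one common heuristic: check for a byte-order mark, then try strict UTF-8, and fall back to a legacy code page if that fails. The function name `decodeSubtitle` and the Windows-1256 default are assumptions for illustration; this is not necessarily how SubZilla detects encodings internally.

```typescript
import { TextDecoder } from "util";

interface DecodedSubtitle {
  encoding: string; // label of the encoding that was used to decode
  text: string;     // decoded text, ready to be written out as UTF-8
}

// Hypothetical helper (not SubZilla's API): decode raw subtitle bytes,
// guessing the encoding. Order of checks: BOM, strict UTF-8, fallback.
function decodeSubtitle(bytes: Uint8Array, fallback = "windows-1256"): DecodedSubtitle {
  // A byte-order mark identifies the encoding unambiguously.
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return { encoding: "utf-8", text: new TextDecoder("utf-8").decode(bytes) };
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return { encoding: "utf-16le", text: new TextDecoder("utf-16le").decode(bytes) };
  }
  try {
    // fatal: true makes decode() throw on byte sequences that are not valid UTF-8.
    return { encoding: "utf-8", text: new TextDecoder("utf-8", { fatal: true }).decode(bytes) };
  } catch {
    // Not valid UTF-8: assume the legacy single-byte code page.
    return { encoding: fallback, text: new TextDecoder(fallback).decode(bytes) };
  }
}

// The Arabic word "مرحبا" encoded as Windows-1256 bytes.
const legacyBytes = new Uint8Array([0xe3, 0xd1, 0xcd, 0xc8, 0xc7]);
const decoded = decodeSubtitle(legacyBytes);
const utf8Bytes = Buffer.from(decoded.text, "utf8"); // bytes to write to the output file
console.log(decoded.encoding, decoded.text, utf8Bytes.length);
```

A single fallback code page is a simplification: a real detector would need to tell legacy encodings apart, typically with statistical analysis of the byte distribution.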
- Node.js (v14 or higher)
- Yarn package manager
# Install globally using yarn
yarn global add subzilla
# Or using npm
npm install -g subzilla
# Clone the repository
git clone https://github.com/onyxdevs/subzilla.git
cd subzilla
# Install dependencies
yarn install
# Build the project
yarn build
# Link for local development
yarn link
# Convert a single subtitle file
subzilla convert path/to/subtitle.srt
# The converted file will be saved as path/to/subtitle.utf8.srt
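
The default output name shown above follows a simple pattern: insert .utf8 before the original extension. A small sketch of that rule, using a hypothetical helper name (`utf8OutputPath`) and POSIX-style paths for predictable output:

```typescript
import * as path from "path";

// Hypothetical helper: derive the default output path by inserting ".utf8"
// before the original file extension.
function utf8OutputPath(inputPath: string): string {
  const { dir, name, ext } = path.posix.parse(inputPath);
  return path.posix.join(dir, `${name}.utf8${ext}`);
}

console.log(utf8OutputPath("path/to/subtitle.srt")); // path/to/subtitle.utf8.srt
```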
# Strip HTML formatting
subzilla convert input.srt --strip-html
# Strip color codes
subzilla convert input.srt --strip-colors
# Strip style tags
subzilla convert input.srt --strip-styles
# Replace URLs with [URL]
subzilla convert input.srt --strip-urls
# Strip all formatting
subzilla convert input.srt --strip-all
# Create backup and strip formatting
subzilla convert input.srt -b --strip-all
# Combine multiple strip options
subzilla convert input.srt --strip-html --strip-colors
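
As a rough illustration of what the HTML and URL strip options do to subtitle text, the sketch below uses two regular expressions. The option names and patterns are assumptions chosen to mirror the flags above; SubZilla's own rules may differ, and color and style stripping are left out because their exact syntax is not described here.

```typescript
// Hypothetical option names mirroring the --strip-html and --strip-urls flags.
interface StripOptions {
  html?: boolean;
  urls?: boolean;
}

function stripFormatting(text: string, options: StripOptions): string {
  let result = text;
  if (options.html) {
    // Remove opening and closing tags such as <i>, </b>, <font color="...">.
    // Cue timing arrows ("-->") contain no "<" and are left untouched.
    result = result.replace(/<\/?[a-zA-Z][^>]*>/g, "");
  }
  if (options.urls) {
    // Replace http(s) URLs with the [URL] placeholder.
    result = result.replace(/https?:\/\/[^\s<>"]+/g, "[URL]");
  }
  return result;
}

console.log(stripFormatting("<i>Visit</i> https://example.com/subs", { html: true, urls: true }));
```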
Convert multiple subtitle files at once using glob patterns:
# Convert all .srt files in current directory
subzilla batch "*.srt"
# Convert files recursively in all subdirectories
subzilla batch "**/*.srt" -r
# Convert multiple formats
subzilla batch "**/*.{srt,sub,txt}" -r
# Specify output directory
subzilla batch "**/*.srt" -o converted/
# Process files in parallel for better performance
subzilla batch "**/*.srt" -p
# Skip existing UTF-8 files
subzilla batch "**/*.srt" -s
# Combine basic options for maximum efficiency
subzilla batch "**/*.{srt,sub,txt}" -r -p -s -o converted/
# Advanced Directory Processing
# Limit recursive depth to 2 levels
subzilla batch "**/*.srt" -r -d 2
# Only process files in specific directories
subzilla batch "**/*.srt" -r -i "movies" "series"
# Exclude specific directories
subzilla batch "**/*.srt" -r -x "temp" "backup"
# Preserve directory structure in output
subzilla batch "**/*.srt" -r -o converted/ --preserve-structure
# Complex example combining all features
subzilla batch "**/*.{srt,sub,txt}" -r -p -s -o converted/ \
-d 3 -i "movies" "series" -x "temp" "backup" --preserve-structure
# Strip formatting in batch mode
subzilla batch "**/*.srt" -r --strip-all
# Strip specific formatting in batch mode
subzilla batch "**/*.srt" -r --strip-html --strip-colors
# Create backups and strip formatting
subzilla batch "**/*.srt" -r -b --strip-all
# Complex example with formatting options
subzilla batch "**/*.{srt,sub,txt}" -r -p -s -o converted/ \
-d 3 -i "movies" "series" -x "temp" "backup" \
--preserve-structure --strip-all -b
Options:
- -o, --output-dir <dir>: Save converted files to the specified directory.
- -r, --recursive: Search for files in subdirectories.
- -p, --parallel: Process files in parallel (faster for many files).
- -s, --skip-existing: Skip files that already have a UTF-8 version.
- -d, --max-depth <depth>: Maximum directory depth for recursive search.
- -i, --include-dirs <dirs...>: Only process files in these directories.
- -x, --exclude-dirs <dirs...>: Exclude files in these directories.
- --preserve-structure: Preserve directory structure in output.
- -b, --backup: Create backup of original files.
- --strip-html: Strip HTML tags.
- --strip-colors: Strip color codes.
- --strip-styles: Strip style tags.
- --strip-urls: Replace URLs with [URL].
- --strip-all: Strip all formatting (equivalent to all strip options).
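
The depth, include, and exclude options can be read as a filter applied to each matched path. The sketch below shows one plausible interpretation; the exact semantics (for example, whether depth counts from the search root, or whether a directory name may match at any level) are assumptions, and `shouldProcess` is a hypothetical name.

```typescript
interface DirectoryFilter {
  maxDepth?: number;      // assumed: maximum number of directory levels below the search root
  includeDirs?: string[]; // assumed: keep only files that sit under one of these directory names
  excludeDirs?: string[]; // assumed: drop files that sit under any of these directory names
}

// relativePath uses "/" separators and is relative to the search root.
function shouldProcess(relativePath: string, filter: DirectoryFilter): boolean {
  const segments = relativePath.split("/").filter((s) => s.length > 0);
  const dirs = segments.slice(0, -1); // everything except the file name

  if (filter.maxDepth !== undefined && dirs.length > filter.maxDepth) return false;

  const excluded = filter.excludeDirs || [];
  if (dirs.some((d) => excluded.indexOf(d) !== -1)) return false;

  const included = filter.includeDirs || [];
  if (included.length > 0) return dirs.some((d) => included.indexOf(d) !== -1);

  return true;
}

console.log(shouldProcess("movies/action/a.srt", { includeDirs: ["movies", "series"], maxDepth: 3 }));
```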
Features:
- Progress bar showing conversion status.
- Per-directory progress tracking.
- Detailed statistics after completion.
- Error tracking and reporting.
- Parallel processing support.
- Skip existing files option.
- Time tracking and performance metrics.
- Directory structure preservation.
- Directory filtering and depth control.
- HTML tag stripping.
- Color code removal.
- Style tag removal.
- URL replacement.
- Whitespace normalization.
- Original file backup.
Example Output:
Found 25 files in 5 directories...
Converting |==========| 100% | 25/25 | Total Progress
Converting |==========| 100% | 8/8 | Processing movies
Converting |==========| 100% | 7/7 | Processing series/season1
Converting |==========| 100% | 5/5 | Processing series/season2
Converting |==========| 100% | 3/3 | Processing series/specials
Converting |==========| 100% | 2/2 | Processing extras
Batch Processing Summary:
──────────────────────────
Total files processed: 25
Directories processed: 5
✅ Successfully converted: 23
❌ Failed: 1
⏭️ Skipped: 1
⏱️ Total time: 5.32s
⚡ Average time per file: 0.22s
Directory Statistics:
────────────────────
movies:
  Total: 8
  ✅ Success: 8
  ❌ Failed: 0
  ⏭️ Skipped: 0
series/season1:
  Total: 7
  ✅ Success: 6
  ❌ Failed: 1
  ⏭️ Skipped: 0
series/season2:
  Total: 5
  ✅ Success: 5
  ❌ Failed: 0
  ⏭️ Skipped: 0
series/specials:
  Total: 3
  ✅ Success: 2
  ❌ Failed: 0
  ⏭️ Skipped: 1
extras:
  Total: 2
  ✅ Success: 2
  ❌ Failed: 0
  ⏭️ Skipped: 0
❌ Errors:
─────────
series/season1/broken.srt: Failed to detect encoding
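
The summary above is essentially a tally of per-file results, grouped overall and by directory. A minimal sketch of that bookkeeping follows; the types and the `summarize` function are hypothetical and only illustrate the shape of the data, not SubZilla's internals.

```typescript
type Status = "success" | "failed" | "skipped";

interface FileResult {
  file: string;   // path relative to the search root, using "/" separators
  status: Status;
  error?: string; // set when status is "failed"
}

interface Counts {
  total: number;
  success: number;
  failed: number;
  skipped: number;
}

function emptyCounts(): Counts {
  return { total: 0, success: 0, failed: 0, skipped: 0 };
}

function summarize(results: FileResult[]): {
  overall: Counts;
  byDirectory: Record<string, Counts>;
  errors: string[];
} {
  const overall = emptyCounts();
  const byDirectory: Record<string, Counts> = {};
  const errors: string[] = [];

  for (const r of results) {
    // Directory part of the path; files at the root are grouped under ".".
    const slash = r.file.lastIndexOf("/");
    const dir = slash === -1 ? "." : r.file.slice(0, slash);
    const counts = byDirectory[dir] || (byDirectory[dir] = emptyCounts());

    // Update both the overall tally and the per-directory tally.
    for (const c of [overall, counts]) {
      c.total += 1;
      c[r.status] += 1;
    }
    if (r.status === "failed") errors.push(`${r.file}: ${r.error || "unknown error"}`);
  }
  return { overall, byDirectory, errors };
}

const report = summarize([
  { file: "movies/a.srt", status: "success" },
  { file: "series/season1/broken.srt", status: "failed", error: "Failed to detect encoding" },
]);
console.log(report.overall, report.errors);
```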
# Specify output file (single file conversion)
subzilla convert input.srt -o output.srt
# Get help
subzilla --help
# Get version
subzilla --version
# Get help for specific command
subzilla convert --help
subzilla batch --help
SubZilla supports flexible configuration through YAML files and environment variables. All settings are optional with sensible defaults.
SubZilla looks for configuration files in the following order:
- Path specified via the --config option
- .subzillarc in the current directory
- .subzilla.yml or .subzilla.yaml
- subzilla.config.yml or subzilla.config.yaml
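
The lookup order above amounts to "use the explicit path if given, otherwise take the first candidate file that exists". A minimal sketch, assuming all candidates are resolved against the current directory; `findConfigFile` is a hypothetical name, and the existence check is injectable so the logic can be exercised without touching the disk.

```typescript
import * as fs from "fs";
import * as path from "path";

// File names from the lookup order above, highest priority first.
const CONFIG_FILE_NAMES = [
  ".subzillarc",
  ".subzilla.yml",
  ".subzilla.yaml",
  "subzilla.config.yml",
  "subzilla.config.yaml",
];

function findConfigFile(
  cwd: string,
  explicitPath?: string,
  exists: (p: string) => boolean = fs.existsSync,
): string | undefined {
  // Assumption: an explicit --config path always wins and is returned as-is.
  if (explicitPath) return explicitPath;

  for (const name of CONFIG_FILE_NAMES) {
    const candidate = path.join(cwd, name);
    if (exists(candidate)) return candidate;
  }
  return undefined; // no config file found: defaults apply
}

console.log(findConfigFile(process.cwd()));
```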
Several example configurations are provided in the examples/config directory:
- Full Configuration (.subzillarc):

      input:
        encoding: auto # auto, utf8, utf16le, utf16be, ascii, windows1256
        format: auto # auto, srt, sub, ass, ssa, txt
      output:
        directory: ./converted # Output directory path
        createBackup: true # Create backup of original files
        format: srt # Output format
        encoding: utf8 # Always UTF-8
        bom: false # Add BOM to output files
        lineEndings: lf # lf, crlf, or auto
      # ... and more settings

- Minimal Configuration (minimal.subzillarc):

      input:
        encoding: auto
        format: auto
      output:
        directory: ./converted
        createBackup: true
        format: srt
      strip:
        html: true
        colors: true
        styles: true
      batch:
        recursive: true
        parallel: true
        skipExisting: true
        preserveStructure: true # Maintain directory structure
        chunkSize: 5

- Performance-Optimized (performance.subzillarc):

      output:
        createBackup: false # Skip backups
        overwriteInput: true # Overwrite input files
        overwriteExisting: true # Don't check existing files
      batch:
        parallel: true
        preserveStructure: false # Flat output structure
        chunkSize: 20 # Larger chunks
        retryCount: 0 # No retries
        failFast: true # Stop on first error

- Arabic-Optimized (arabic.subzillarc):

      input:
        encoding: windows1256 # Common Arabic encoding
      output:
        bom: true # Add BOM for compatibility
        lineEndings: crlf # Windows line endings
      batch:
        includeDirectories:
          - arabic
          - مسلسلات
          - أفلام
You can also configure SubZilla using environment variables. Copy .env.example to .env and modify as needed:
# Input Settings
SUBZILLA_INPUT_ENCODING=utf8
SUBZILLA_INPUT_FORMAT=srt
SUBZILLA_INPUT_DEFAULT_LANGUAGE=ar
# Output Settings
SUBZILLA_OUTPUT_DIRECTORY=./output
SUBZILLA_OUTPUT_CREATE_BACKUP=true
# Complex settings use JSON
SUBZILLA_STRIP='{"html":true,"colors":true,"styles":true}'
SUBZILLA_BATCH_INCLUDE_DIRECTORIES='["movies","series"]'
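
One way to read these variables is as a flat encoding of the nested configuration: the first word after the SUBZILLA_ prefix names the section, the remaining words form a camelCase property, and JSON-looking values are parsed. The sketch below implements that reading; the naming rule and the helper names are assumptions inferred from the examples above, not a description of SubZilla's actual loader.

```typescript
type ConfigSection = Record<string, unknown>;

// JSON-looking values (true, 42, {...}, [...]) are parsed; anything else stays a string.
function parseEnvValue(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return raw;
  }
}

// Assumed naming rule: SUBZILLA_OUTPUT_CREATE_BACKUP -> ["output", "createBackup"],
// and SUBZILLA_STRIP -> ["strip"] (the whole section is given as JSON).
function envKeyToPath(key: string): string[] {
  const [section, ...rest] = key.slice("SUBZILLA_".length).toLowerCase().split("_");
  if (rest.length === 0) return [section];
  const property = rest
    .map((word, i) => (i === 0 ? word : word.charAt(0).toUpperCase() + word.slice(1)))
    .join("");
  return [section, property];
}

function configFromEnv(env: Record<string, string | undefined>): Record<string, unknown> {
  const config: Record<string, unknown> = {};
  for (const key of Object.keys(env)) {
    const raw = env[key];
    if (key.indexOf("SUBZILLA_") !== 0 || raw === undefined) continue;

    const [section, property] = envKeyToPath(key);
    const value = parseEnvValue(raw);
    if (property === undefined) {
      config[section] = value; // whole section supplied as JSON
    } else {
      const existing = config[section];
      const target: ConfigSection =
        typeof existing === "object" && existing !== null ? (existing as ConfigSection) : {};
      target[property] = value;
      config[section] = target;
    }
  }
  return config;
}

console.log(configFromEnv({ SUBZILLA_OUTPUT_CREATE_BACKUP: "true", SUBZILLA_INPUT_ENCODING: "utf8" }));
```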
Settings are merged in the following order (later ones override earlier ones):
- Default values.
- Configuration file.
- Environment variables.
- Command-line arguments.
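
The precedence rule can be pictured as a layered merge in which each later source overrides the earlier ones. Below is a minimal sketch; `mergeSettings` is a hypothetical helper, the sample values are made up, and the choice to merge nested objects key by key while replacing arrays wholesale is an assumption.

```typescript
type Settings = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Settings {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Later layers override earlier ones. Nested objects are merged key by key;
// arrays and scalar values are replaced wholesale (an assumption).
function mergeSettings(...layers: Settings[]): Settings {
  const result: Settings = {};
  for (const layer of layers) {
    for (const key of Object.keys(layer)) {
      const current = result[key];
      const incoming = layer[key];
      if (incoming === undefined) continue; // unset values never override
      result[key] =
        isPlainObject(current) && isPlainObject(incoming)
          ? mergeSettings(current, incoming)
          : incoming;
    }
  }
  return result;
}

// Illustrative values only; these are not SubZilla's real defaults.
const defaults = { output: { directory: ".", createBackup: false, bom: false } };
const fileConfig = { output: { directory: "./converted", createBackup: true } };
const envConfig = { output: { directory: "./output" } };
const cliArgs = { output: { createBackup: false } };

const settings = mergeSettings(defaults, fileConfig, envConfig, cliArgs);
console.log(settings);
```

In this example the environment variable decides the output directory, the command line decides createBackup, and bom keeps its default because no later layer sets it.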
Input options:
- encoding: Input file encoding (auto, utf8, utf16le, utf16be, ascii, windows1256).
- format: Input format (auto, srt, sub, ass, ssa, txt).

Output options:
- directory: Output directory path.
- createBackup: Create backup of original files.
- format: Output format.
- encoding: Output encoding (always utf8).
- bom: Add BOM to output files.
- lineEndings: Line ending style (lf, crlf, auto).
- overwriteInput: Overwrite input files.
- overwriteExisting: Overwrite existing files.

Strip options:
- html: Remove HTML tags.
- colors: Remove color codes.
- styles: Remove style tags.
- urls: Replace URLs with [URL].
- timestamps: Replace timestamps with [TIMESTAMP].
- numbers: Replace numbers with #.
- punctuation: Remove punctuation.
- emojis: Replace emojis with [EMOJI].
- brackets: Remove brackets.

Batch options:
- recursive: Process subdirectories.
- parallel: Process files in parallel.
- skipExisting: Skip existing UTF-8 files.
- maxDepth: Maximum directory depth.
- includeDirectories: Only process these directories.
- excludeDirectories: Skip these directories.
- preserveStructure: Maintain directory structure.
- chunkSize: Files per batch.
- retryCount: Number of retry attempts.
- retryDelay: Delay between retries (ms).
- failFast: Stop on first error.
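
To show how chunkSize, retryCount, retryDelay, and failFast could interact, here is a minimal sketch of chunked parallel processing. It assumes that files within a chunk run concurrently, that chunks run one after another, and that failFast stops before the next chunk starts; the function names and the `convert` callback are hypothetical, and SubZilla's real scheduler may behave differently.

```typescript
// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("chunk size must be at least 1");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

interface BatchOptions {
  chunkSize: number;  // files per chunk
  retryCount: number; // extra attempts after the first failure
  retryDelay: number; // milliseconds to wait between attempts
  failFast: boolean;  // stop scheduling new chunks after a failure
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run one task, retrying up to `retryCount` additional times.
async function withRetry<T>(task: () => Promise<T>, retryCount: number, retryDelay: number): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (error) {
      if (attempt >= retryCount) throw error;
      await sleep(retryDelay);
    }
  }
}

async function processInChunks(
  files: string[],
  convert: (file: string) => Promise<void>, // hypothetical per-file conversion
  options: BatchOptions,
): Promise<{ succeeded: string[]; failed: string[] }> {
  const succeeded: string[] = [];
  const failed: string[] = [];

  for (const group of chunk(files, options.chunkSize)) {
    // Files in the same chunk are converted concurrently.
    await Promise.all(
      group.map(async (file) => {
        try {
          await withRetry(() => convert(file), options.retryCount, options.retryDelay);
          succeeded.push(file);
        } catch {
          failed.push(file);
        }
      }),
    );
    if (options.failFast && failed.length > 0) break; // do not start the next chunk
  }
  return { succeeded, failed };
}
```

A larger chunkSize increases concurrency at the cost of more open files and memory at once, which is the trade-off the performance-oriented configuration leans into.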
subzilla/
├── src/
│   ├── cli/      # Command-line interface
│   ├── core/     # Core conversion logic
│   ├── utils/    # Utility functions
│   └── types/    # TypeScript type definitions
├── test/         # Test files
├── dist/         # Compiled output
└── package.json  # Project configuration
- yarn build: Build the project.
- yarn start: Run the CLI.
- yarn dev: Run in development mode.
- yarn test: Run tests.
- yarn lint: Run linter.
- yarn lint:fix: Fix linting issues.
- yarn format: Format code using Prettier.
- yarn format:check: Check code formatting.
- Fork the repository.
- Create your feature branch (git checkout -b feature/amazing-feature).
- Commit your changes (git commit -m 'Add some amazing feature').
- Push to the branch (git push origin feature/amazing-feature).
- Open a Pull Request.
This project is licensed under the ISC License - see the LICENSE file for details.
If you encounter any issues or have questions, please:
- Check the issues page
- Create a new issue if your problem isn't already listed
- Provide as much detail as possible, including:
  - SubZilla version
  - Node.js version
  - Operating system
  - Sample subtitle file (if possible)
- Thanks to all contributors.
- Inspired by the need for better subtitle encoding support.
- Built with TypeScript and Node.js.
Planned improvements and feature additions:
- Enhanced Format Support
  - Add support for .ass and .ssa subtitle formats.
  - Handle multiple subtitle files in batch.
  - Support subtitle format conversion.
- User Interface
  - Add interactive CLI mode.
  - Implement progress bars for large files.
  - Create a web interface.
- Performance Optimization
  - Implement parallel processing for batch operations.
  - Optimize memory usage.
  - Add batch processing progress tracking.
  - Add batch processing statistics and reporting.
  - Add configurable chunk size for parallel processing.
  - Implement retry mechanism for failed conversions.
- Additional Features
  - Add subtitle validation.
  - Implement timing adjustment.
  - Support subtitle merging.
  - Add character encoding preview.
  - Add batch processing statistics and reporting.
  - Add JSON output format for statistics.
  - Add CSV export for batch results.
  - AI translation of subtitles.
- Developer Experience
  - Add comprehensive tests.
  - Improve error messages.
  - Create detailed API documentation.
  - Add GitHub Actions workflow.
  - Add batch processing examples and test cases.
  - Add performance benchmarking tools.
  - Create batch processing configuration files.
Want to contribute to these enhancements? Check our Contributing section!