🎬 Professional AI-powered video presentations from Markdown and PowerPoint files.
Transform your content into stunning video presentations with AI-generated images, premium text-to-speech, and smart content enhancement. Version 2.0 introduces a modern configuration system and professional-grade providers.
- 🖼️ AI Image Generation: DALL-E 3 creates custom images for your slides
- 🎙️ Premium Voices: ElevenLabs delivers studio-quality narration
- 📸 Stock Photos: Pexels & Unsplash integration with API keys
- ⚙️ Configuration System: YAML-based setup with environment variables
- 🎯 Simplified CLI: Clean commands, no more option overload
- 🔄 Smart Fallbacks: Graceful degradation when services are unavailable
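To illustrate the fallback idea, here is a minimal Python sketch that tries providers in order and returns the first result that succeeds. The names `generate_with_fallback`, `dalle3_image`, and `text_image` are hypothetical stand-ins, not part of SlideStream's API; the real implementation may work differently.

```python
from typing import Callable, List, Sequence, Tuple


def generate_with_fallback(
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    prompt: str,
) -> Tuple[str, str]:
    """Try each (name, provider) pair in order; return (name, result) of the first success."""
    errors: List[str] = []
    for name, provider in providers:
        try:
            return name, provider(prompt)
        except Exception as exc:  # a real tool would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def dalle3_image(prompt: str) -> str:
    # Hypothetical stand-in that simulates an unavailable service.
    raise ConnectionError("service unavailable")


def text_image(prompt: str) -> str:
    # Hypothetical stand-in for a provider that needs no network or API key.
    return f"text slide: {prompt}"


used, result = generate_with_fallback(
    [("dalle3", dalle3_image), ("text", text_image)],
    "Welcome to AI-First Development",
)
print(used, "->", result)
```

Ordering the list from most to least capable means the cheapest, always-available option is only used when everything before it has failed.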
# Install with all AI providers (quotes keep shells such as zsh from expanding the brackets)
pip install "slide-stream[all-ai]"
# Or install with specific providers
pip install "slide-stream[openai,elevenlabs]"
# Create configuration file
slide-stream init
# Check available providers
slide-stream providers
Set environment variables for the services you want to use:
# For AI image generation
export OPENAI_API_KEY="your-openai-key"
# For premium text-to-speech
export ELEVENLABS_API_KEY="your-elevenlabs-key"
# For stock photos (optional)
export PEXELS_API_KEY="your-pexels-key"
export UNSPLASH_ACCESS_KEY="your-unsplash-key"
# Create from Markdown
slide-stream create presentation.md output.mp4
# Create from PowerPoint
slide-stream create slides.pptx video.mp4
# Simple creation (uses default config)
slide-stream create slides.md presentation.mp4
# With custom configuration
slide-stream create --config my-config.yaml presentation.pptx video.mp4
# Welcome to AI-First Development
- Build smarter applications with integrated AI
- Learn practical implementation patterns
- Deploy production-ready solutions
# Why Choose AI-First?
- Faster development cycles
- Enhanced user experiences
- Competitive advantage in the market
# Getting Started
- Set up your development environment
- Choose the right AI services
- Build your first AI-powered feature
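The example above follows a simple shape: each `# ` heading starts a slide and each `- ` line is a bullet on it. The sketch below shows one way such a file could be split into slides. `parse_slides` is a hypothetical helper written for illustration and is not SlideStream's actual parser.

```python
from typing import Dict, List


def parse_slides(markdown: str) -> List[Dict]:
    """Split Markdown into slides: a '# ' heading starts a slide, '- ' lines are its bullets."""
    slides: List[Dict] = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("# "):
            slides.append({"title": line[2:].strip(), "bullets": []})
        elif line.startswith("- ") and slides:
            # Bullets that appear before any heading are ignored.
            slides[-1]["bullets"].append(line[2:].strip())
    return slides


example = """\
# Welcome to AI-First Development
- Build smarter applications with integrated AI
- Learn practical implementation patterns

# Why Choose AI-First?
- Faster development cycles
"""

slides = parse_slides(example)
for slide in slides:
    print(slide["title"], len(slide["bullets"]))
```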
SlideStream uses YAML configuration files for maximum flexibility:
# slidestream.yaml
providers:
  llm:
    provider: openai        # Content enhancement
    model: gpt-4o-mini
  images:
    provider: dalle3        # AI-generated images
    fallback: text          # Fallback when DALL-E unavailable
  tts:
    provider: elevenlabs    # Premium text-to-speech
    voice: rachel           # Voice selection
# API Keys (use environment variables for security)
api_keys:
  openai: "${OPENAI_API_KEY}"
  elevenlabs: "${ELEVENLABS_API_KEY}"
  pexels: "${PEXELS_API_KEY}"
  unsplash: "${UNSPLASH_ACCESS_KEY}"
settings:
  video:
    resolution: [1920, 1080]
    fps: 24
    codec: libx264
  cleanup: true
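The `${NAME}` placeholders in the `api_keys` section refer to environment variables. How SlideStream resolves them internally is not documented here, so the following is only a sketch of one plausible approach; `expand_env` is a hypothetical helper, and leaving unknown names untouched is an assumption.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${NAME} where NAME is a valid environment variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} with its value from env; leave unknown names untouched."""
    source = os.environ if env is None else env
    return _PLACEHOLDER.sub(lambda m: source.get(m.group(1), m.group(0)), value)


# A fake environment keeps the example independent of the real one.
fake_env = {"OPENAI_API_KEY": "sk-demo"}
print(expand_env("${OPENAI_API_KEY}", fake_env))
print(expand_env("${PEXELS_API_KEY}", fake_env))
```

Keeping the placeholder when a variable is missing makes an unset key easy to spot, rather than silently turning it into an empty string.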
SlideStream automatically finds your config in this order:
- ./slidestream.yaml (current directory)
- ~/.slidestream.yaml (home directory)
- Built-in defaults
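The lookup order above can be sketched in a few lines of Python. `find_config` is a hypothetical function, not SlideStream's own code; returning `None` to mean "use built-in defaults" is an assumption made for the example.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_config(cwd: Path, home: Path) -> Optional[Path]:
    """Return the first existing config file, or None to signal built-in defaults."""
    candidates = [cwd / "slidestream.yaml", home / ".slidestream.yaml"]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


# Demonstrate the precedence with throwaway directories.
with tempfile.TemporaryDirectory() as tmp:
    cwd = Path(tmp) / "project"
    home = Path(tmp) / "home"
    cwd.mkdir()
    home.mkdir()
    (home / ".slidestream.yaml").write_text("providers: {}\n")
    found_home = find_config(cwd, home)  # only the home config exists
    (cwd / "slidestream.yaml").write_text("providers: {}\n")
    found_cwd = find_config(cwd, home)  # the project config now takes precedence

print(found_home.name, found_cwd.name)
```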
Provider | Description | Requirements |
---|---|---|
dalle3 | AI image generation via DALL-E 3 | OpenAI API key |
pexels | Professional stock photos | Pexels API key |
unsplash | High-quality stock photos | Unsplash API key |
text | Text-based slides (always available) | None |
Provider | Description | Requirements |
---|---|---|
elevenlabs | Premium AI voices with emotion | ElevenLabs API key |
openai | Natural OpenAI TTS voices | OpenAI API key |
gtts | Google Text-to-Speech (free) | None |
Provider | Description | Requirements |
---|---|---|
openai | GPT models for content enhancement | OpenAI API key |
gemini | Google Gemini models | Gemini API key |
claude | Anthropic Claude models | Anthropic API key |
groq | Fast inference with Groq | Groq API key |
ollama | Local models via Ollama | Ollama installation |
# Create video presentation
slide-stream create <input_file> <output_file>
# Generate example configuration
slide-stream init [config_file]
# List available providers and their status
slide-stream providers
# Show help
slide-stream --help
# Basic usage
slide-stream create slides.md presentation.mp4
# With custom config
slide-stream create --config prod.yaml deck.pptx video.mp4
# Check what's available
slide-stream providers
# Create config file
slide-stream init my-config.yaml
- Slide Content: Extracts titles, bullet points, and images
- Speaker Notes: Uses notes for enhanced AI narration
- Layouts: Preserves slide structure and hierarchy
- Content Improvement: LLMs enhance slide text for better flow
- Image Generation: DALL-E 3 creates relevant, professional images
- Voice Selection: Choose from multiple TTS voices and styles
- HD Video: 1920x1080 resolution by default
- Quality Audio: Synchronized speech with proper timing
- Custom Timing: Configurable slide durations and padding
- Visit OpenAI Platform
- Sign up and create an API key
- Add billing method (pay-per-use)
- Visit ElevenLabs
- Create account and get API key
- Choose from 900+ voices
- Visit Pexels API
- Sign up for free API access
- Get your API key
- Visit Unsplash Developers
- Create application
- Get your access key
# Core package only
pip install slide-stream
# With specific AI providers
pip install "slide-stream[openai]"
pip install "slide-stream[elevenlabs]"
pip install "slide-stream[gemini]"
pip install "slide-stream[claude]"
pip install "slide-stream[groq]"
# All AI providers
pip install "slide-stream[all-ai]"
# Development dependencies
pip install "slide-stream[dev]"
- Python: 3.10 or higher
- FFmpeg: For video processing
- Internet: For AI services and stock photos (offline mode available)
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg
# Windows
# Download from https://ffmpeg.org/download.html
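After installing, you can confirm that FFmpeg is reachable on your `PATH`. The short Python check below uses only the standard library; `ffmpeg_path` is a hypothetical helper name chosen for this example.

```python
import shutil
from typing import Optional


def ffmpeg_path() -> Optional[str]:
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


path = ffmpeg_path()
if path is None:
    print("FFmpeg not found; install it before creating videos.")
else:
    print(f"FFmpeg found at {path}")
```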
Image Providers:
- text: Always available, no setup required
- dalle3: Requires OPENAI_API_KEY
- pexels: Requires PEXELS_API_KEY
- unsplash: Requires UNSPLASH_ACCESS_KEY
TTS Providers:
- gtts: Free, always available
- elevenlabs: Requires ELEVENLABS_API_KEY
- openai: Requires OPENAI_API_KEY
LLM Providers:
- none: No content enhancement
- openai: Requires OPENAI_API_KEY
- gemini: Requires GEMINI_API_KEY
- claude: Requires ANTHROPIC_API_KEY
- groq: Requires GROQ_API_KEY
- ollama: Requires local Ollama installation
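The requirements above boil down to a lookup from provider name to environment variable. The sketch below shows how you might check which providers are usable in a given environment. `REQUIRED_ENV` and `available` are hypothetical names, and this is not how `slide-stream providers` is necessarily implemented. `ollama` is left out because it depends on a local installation rather than an API key.

```python
from typing import Dict, List, Mapping, Optional

# Requirements copied from the lists above; None means no API key is needed.
REQUIRED_ENV: Dict[str, Dict[str, Optional[str]]] = {
    "images": {
        "text": None,
        "dalle3": "OPENAI_API_KEY",
        "pexels": "PEXELS_API_KEY",
        "unsplash": "UNSPLASH_ACCESS_KEY",
    },
    "tts": {
        "gtts": None,
        "elevenlabs": "ELEVENLABS_API_KEY",
        "openai": "OPENAI_API_KEY",
    },
    "llm": {
        "none": None,
        "openai": "OPENAI_API_KEY",
        "gemini": "GEMINI_API_KEY",
        "claude": "ANTHROPIC_API_KEY",
        "groq": "GROQ_API_KEY",
    },
}


def available(kind: str, env: Mapping[str, str]) -> List[str]:
    """List providers of one kind that need no key or whose key is set and non-empty."""
    return [
        name
        for name, var in REQUIRED_ENV[kind].items()
        if var is None or env.get(var)
    ]


# A fake environment with only the OpenAI key set.
demo_env = {"OPENAI_API_KEY": "sk-demo"}
for kind in REQUIRED_ENV:
    print(kind, available(kind, demo_env))
```

Pass `os.environ` instead of `demo_env` to check your real environment.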
ElevenLabs Voices:
- rachel: Professional female voice
- adam: Clear male voice
- aria: Expressive female voice
- (See ElevenLabs docs for full list)
OpenAI Voices:
- alloy: Balanced and natural
- echo: Clear and articulate
- fable: Warm and engaging
- nova: Bright and energetic
- onyx: Deep and authoritative
- shimmer: Gentle and soothing
We welcome contributions! See our documentation:
- User Guide - Comprehensive usage examples
- Development Workflow - Setup and testing
- Type Safety - Code quality standards
- 2.0.0: Configuration system, provider architecture, AI image generation
- 1.1.x: PowerPoint support, bug fixes, stability improvements
- 1.0.0: Initial release with Markdown support
MIT License - see LICENSE file for details.
Built with these excellent tools:
- Typer - Modern CLI framework
- Rich - Beautiful terminal output
- MoviePy - Video processing
- OpenAI - AI image generation and LLM
- ElevenLabs - Premium text-to-speech
- PyYAML - Configuration parsing
Ready to create professional presentations? Get started with pip install slide-stream[all-ai] 🚀