Slide Stream 2.0

🎬 Professional AI-powered video presentations from Markdown and PowerPoint files.

Transform your content into stunning video presentations with AI-generated images, premium text-to-speech, and smart content enhancement. Version 2.0 introduces a modern configuration system and professional-grade providers.

✨ What's New in 2.0

🖼️ AI Image Generation: DALL-E 3 creates custom images for your slides
🎙️ Premium Voices: ElevenLabs delivers studio-quality narration
📸 Stock Photos: Pexels & Unsplash integration with API keys
⚙️ Configuration System: YAML-based setup with environment variables
🎯 Simplified CLI: Clean commands, no more option overload
🔄 Smart Fallbacks: Graceful degradation when services are unavailable

🚀 Quick Start

1. Installation

# Install with all AI providers
# (quotes stop shells such as zsh from treating the brackets as a glob pattern)
pip install "slide-stream[all-ai]"

# Or install with specific providers
pip install "slide-stream[openai,elevenlabs]"

2. Setup Configuration

# Create configuration file
slide-stream init

# Check available providers
slide-stream providers

3. Configure API Keys

Set environment variables for the services you want to use:

# For AI image generation
export OPENAI_API_KEY="your-openai-key"

# For premium text-to-speech  
export ELEVENLABS_API_KEY="your-elevenlabs-key"

# For stock photos (optional)
export PEXELS_API_KEY="your-pexels-key"
export UNSPLASH_ACCESS_KEY="your-unsplash-key"

4. Create Your First Video

# Create from Markdown
slide-stream create presentation.md output.mp4

# Create from PowerPoint
slide-stream create slides.pptx video.mp4

🎯 Usage Examples

Basic Video Creation

# Simple creation (uses default config)
slide-stream create slides.md presentation.mp4

# With custom configuration
slide-stream create --config my-config.yaml presentation.pptx video.mp4

Example Markdown File

# Welcome to AI-First Development

- Build smarter applications with integrated AI
- Learn practical implementation patterns
- Deploy production-ready solutions

# Why Choose AI-First?

- Faster development cycles
- Enhanced user experiences  
- Competitive advantage in the market

# Getting Started

- Set up your development environment
- Choose the right AI services
- Build your first AI-powered feature
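
As a rough illustration of how a file like the one above maps onto slides, the sketch below starts a new slide at each top-level heading and collects the bullets that follow it. The function name `split_slides` and the dictionary layout are hypothetical; this is not Slide Stream's actual parser, only a minimal model of the heading-plus-bullets structure.

```python
def split_slides(markdown_text):
    """Split Markdown into slides: one per '# ' heading, with its '- ' bullets."""
    slides = []
    for raw_line in markdown_text.splitlines():
        line = raw_line.strip()
        if line.startswith("# "):
            # A top-level heading starts a new slide.
            slides.append({"title": line[2:].strip(), "bullets": []})
        elif line.startswith("- ") and slides:
            # Bullets attach to the most recent heading.
            slides[-1]["bullets"].append(line[2:].strip())
    return slides


example = """\
# Welcome to AI-First Development

- Build smarter applications with integrated AI
- Learn practical implementation patterns

# Why Choose AI-First?

- Faster development cycles
"""

for slide in split_slides(example):
    print(slide["title"], len(slide["bullets"]))
```

Keeping each slide to one heading and a few short bullets, as in the example file, gives every slide a clear title and a compact amount of text to narrate.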

⚙️ Configuration

Slide Stream uses YAML configuration files for maximum flexibility:

# slidestream.yaml
providers:
  llm:
    provider: openai        # Content enhancement
    model: gpt-4o-mini
    
  images:
    provider: dalle3        # AI-generated images
    fallback: text         # Fallback when DALL-E unavailable
    
  tts:
    provider: elevenlabs   # Premium text-to-speech
    voice: rachel          # Voice selection

# API Keys (use environment variables for security)
api_keys:
  openai: "${OPENAI_API_KEY}"
  elevenlabs: "${ELEVENLABS_API_KEY}"
  pexels: "${PEXELS_API_KEY}"
  unsplash: "${UNSPLASH_ACCESS_KEY}"

settings:
  video:
    resolution: [1920, 1080]
    fps: 24
    codec: libx264
  cleanup: true
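
The `${OPENAI_API_KEY}`-style placeholders under `api_keys` are resolved from environment variables. A minimal sketch of that kind of substitution is shown below. The helper `expand_env` is hypothetical, and replacing an unset variable with an empty string is an assumption; the source does not say how Slide Stream treats missing variables.

```python
import os
import re

# Matches ${NAME} where NAME is a valid environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace ${NAME} placeholders with values from env; unset names become ''."""
    env = os.environ if env is None else env
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), ""), value)


api_keys = {
    "openai": "${OPENAI_API_KEY}",
    "pexels": "${PEXELS_API_KEY}",
}
fake_env = {"OPENAI_API_KEY": "sk-example"}  # stand-in for os.environ
resolved = {name: expand_env(raw, fake_env) for name, raw in api_keys.items()}
print(resolved)
```

Keeping only placeholders in the YAML file means the file itself can be committed or shared without exposing any secrets.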

Configuration Discovery

Slide Stream automatically finds your config in this order:

1. ./slidestream.yaml (current directory)
2. ~/.slidestream.yaml (home directory)
3. Built-in defaults
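
The lookup order above can be sketched as a short search over candidate paths. The function `find_config` is a hypothetical name used for illustration, not part of Slide Stream's API; returning `None` stands in for "use the built-in defaults".

```python
from pathlib import Path


def find_config(cwd=None, home=None):
    """Return the first existing config path, or None to signal built-in defaults."""
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    candidates = [
        cwd / "slidestream.yaml",    # 1. current directory
        home / ".slidestream.yaml",  # 2. home directory
    ]
    for path in candidates:
        if path.is_file():
            return path
    return None  # 3. fall back to built-in defaults


print(find_config())
```

A project-local file taking precedence over the home-directory file lets a single presentation override personal defaults without editing them.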

🔧 Available Providers

Image Providers

Provider	Description	Requirements
`dalle3`	AI image generation via DALL-E 3	OpenAI API key
`pexels`	Professional stock photos	Pexels API key
`unsplash`	High-quality stock photos	Unsplash API key
`text`	Text-based slides (always available)	None

Text-to-Speech Providers

Provider	Description	Requirements
`elevenlabs`	Premium AI voices with emotion	ElevenLabs API key
`openai`	Natural OpenAI TTS voices	OpenAI API key
`gtts`	Google Text-to-Speech (free)	None

LLM Providers

Provider	Description	Requirements
`openai`	GPT models for content enhancement	OpenAI API key
`gemini`	Google Gemini models	Gemini API key
`claude`	Anthropic Claude models	Anthropic API key
`groq`	Fast inference with Groq	Groq API key
`ollama`	Local models via Ollama	Ollama installation

📋 CLI Commands

Core Commands

# Create video presentation
slide-stream create <input_file> <output_file>

# Generate example configuration
slide-stream init [config_file]

# List available providers and their status
slide-stream providers

# Show help
slide-stream --help

Examples

# Basic usage
slide-stream create slides.md presentation.mp4

# With custom config
slide-stream create --config prod.yaml deck.pptx video.mp4

# Check what's available
slide-stream providers

# Create config file
slide-stream init my-config.yaml
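
For rendering several decks in one go, the `create` command can be driven from a small script. The sketch below only uses the command shapes shown above (`slide-stream create [--config FILE] <input_file> <output_file>`); the helper names `build_create_command` and `render_all` are hypothetical, and the script defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess


def build_create_command(input_file, output_file, config=None):
    """Build the argument list for `slide-stream create`."""
    command = ["slide-stream", "create"]
    if config is not None:
        command += ["--config", config]
    command += [input_file, output_file]
    return command


def render_all(jobs, config=None, dry_run=True):
    """Render each (input, output) pair; with dry_run, only print the commands."""
    for input_file, output_file in jobs:
        command = build_create_command(input_file, output_file, config)
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)  # raises if the CLI exits non-zero


render_all(
    [("slides.md", "presentation.mp4"), ("deck.pptx", "video.mp4")],
    config="prod.yaml",
)
```

Passing `dry_run=False` runs the real CLI, which requires `slide-stream` to be installed and on the `PATH`.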

🎨 Advanced Features

PowerPoint Integration

Slide Content: Extracts titles, bullet points, and images
Speaker Notes: Uses notes for enhanced AI narration
Layouts: Preserves slide structure and hierarchy

AI Enhancement

Content Improvement: LLMs enhance slide text for better flow
Image Generation: DALL-E 3 creates relevant, professional images
Voice Selection: Choose from multiple TTS voices and styles

Professional Output

HD Video: 1920x1080 resolution by default
Quality Audio: Synchronized speech with proper timing
Custom Timing: Configurable slide durations and padding

🔑 Getting API Keys

OpenAI (for DALL-E 3 & GPT)

Visit OpenAI Platform
Sign up and create an API key
Add billing method (pay-per-use)

ElevenLabs (for Premium TTS)

Visit ElevenLabs
Create account and get API key
Choose from 900+ voices

Pexels (for Stock Photos)

Visit Pexels API
Sign up for free API access
Get your API key

Unsplash (for Stock Photos)

Visit Unsplash Developers
Create application
Get your access key

📦 Installation Options

# Core package only
pip install slide-stream

# With specific AI providers
# (quotes stop shells such as zsh from treating the brackets as a glob pattern)
pip install "slide-stream[openai]"
pip install "slide-stream[elevenlabs]"
pip install "slide-stream[gemini]"
pip install "slide-stream[claude]"
pip install "slide-stream[groq]"

# All AI providers
pip install "slide-stream[all-ai]"

# Development dependencies
pip install "slide-stream[dev]"

📋 Requirements

Python: 3.10 or higher
FFmpeg: For video processing
Internet: For AI services and stock photos (offline mode available)

Installing FFmpeg

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg

# Windows
# Download from https://ffmpeg.org/download.html
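
Before a first render, it can help to confirm that the two local requirements are met. The sketch below checks the Python version and looks for an `ffmpeg` executable on the `PATH` using only the standard library; `check_requirements` is a hypothetical helper, not something Slide Stream provides.

```python
import shutil
import sys


def check_requirements(min_python=(3, 10)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found {found}"
        )
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


for problem in check_requirements():
    print("Missing:", problem)
```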

🔧 Configuration Reference

Provider Options

Image Providers:

text: Always available, no setup required
dalle3: Requires OPENAI_API_KEY
pexels: Requires PEXELS_API_KEY
unsplash: Requires UNSPLASH_ACCESS_KEY

TTS Providers:

gtts: Free, always available
elevenlabs: Requires ELEVENLABS_API_KEY
openai: Requires OPENAI_API_KEY

LLM Providers:

none: No content enhancement
openai: Requires OPENAI_API_KEY
gemini: Requires GEMINI_API_KEY
claude: Requires ANTHROPIC_API_KEY
groq: Requires GROQ_API_KEY
ollama: Requires local Ollama installation
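
To make the relationship between API keys and the `fallback` setting concrete, the sketch below models the image providers listed above: a provider counts as available when it needs no key or its environment variable is set, and the fallback is used otherwise. The names `IMAGE_PROVIDER_KEYS`, `is_available`, and `pick_image_provider` are hypothetical, and the selection logic is an assumption about how such a fallback could work, not Slide Stream's actual implementation.

```python
import os

# Environment variable each image provider needs (None = no key required),
# taken from the provider list above.
IMAGE_PROVIDER_KEYS = {
    "text": None,
    "dalle3": "OPENAI_API_KEY",
    "pexels": "PEXELS_API_KEY",
    "unsplash": "UNSPLASH_ACCESS_KEY",
}


def is_available(provider, env=None):
    """A provider counts as available if it needs no key or its key is set."""
    env = os.environ if env is None else env
    key = IMAGE_PROVIDER_KEYS[provider]
    return key is None or bool(env.get(key))


def pick_image_provider(preferred, fallback="text", env=None):
    """Use the preferred provider when available, otherwise the fallback."""
    return preferred if is_available(preferred, env) else fallback


print(pick_image_provider("dalle3", env={}))
print(pick_image_provider("dalle3", env={"OPENAI_API_KEY": "sk-example"}))
```

Because `text` needs no key, using it as the fallback means a render can still complete when no API keys are configured.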

Voice Options

ElevenLabs Voices:

rachel: Professional female voice
adam: Clear male voice
aria: Expressive female voice
(See ElevenLabs docs for full list)

OpenAI Voices:

alloy: Balanced and natural
echo: Clear and articulate
fable: Warm and engaging
nova: Bright and energetic
onyx: Deep and authoritative
shimmer: Gentle and soothing

🤝 Contributing

We welcome contributions! See our documentation:

User Guide - Comprehensive usage examples
Development Workflow - Setup and testing
Type Safety - Code quality standards

🆕 Version History

2.0.0: Configuration system, provider architecture, AI image generation
1.1.x: PowerPoint support, bug fixes, stability improvements
1.0.0: Initial release with Markdown support

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

Built with these excellent tools:

Typer - Modern CLI framework
Rich - Beautiful terminal output
MoviePy - Video processing
OpenAI - AI image generation and LLM
ElevenLabs - Premium text-to-speech
PyYAML - Configuration parsing

Ready to create professional presentations? Get started with pip install slide-stream[all-ai] 🚀
