Orchestrator Framework


Overview

Orchestrator is a powerful, flexible AI pipeline orchestration framework that simplifies the creation and execution of complex AI workflows. By combining YAML-based configuration with intelligent model selection and automatic ambiguity resolution, Orchestrator makes it easy to build sophisticated AI applications without getting bogged down in implementation details.

Key Features

  • 🎯 YAML-Based Pipelines: Define complex workflows in simple, readable YAML with full template variable support
  • πŸ€– Multi-Model Support: Seamlessly work with OpenAI, Anthropic, Google, Ollama, and HuggingFace models
  • 🧠 Intelligent Model Selection: Automatically choose the best model based on task requirements
  • πŸ”„ Automatic Ambiguity Resolution: Use <AUTO> tags to let AI resolve configuration ambiguities
  • πŸ“¦ Modular Architecture: Extend with custom models, tools, and control systems
  • πŸ›‘οΈ Production Ready: Built-in error handling, retries, checkpointing, and monitoring
  • ⚑ Parallel Execution: Efficient resource management and parallel task execution
  • 🐳 Sandboxed Execution: Secure code execution in isolated environments
  • πŸ’Ύ Lazy Model Loading: Models are downloaded only when needed, saving disk space
  • πŸ”§ Reliable Tool Execution: Guaranteed execution of file operations with LangChain structured outputs
  • πŸ“ Advanced Templates: Support for nested variables, filters, and Jinja2-style templates

Quick Start

Installation

pip install py-orc

For additional features:

pip install py-orc[ollama]      # Ollama model support
pip install py-orc[cloud]        # Cloud model providers
pip install py-orc[dev]          # Development tools
pip install py-orc[all]          # Everything

Basic Usage

  1. Create a simple pipeline (hello_world.yaml):
id: hello_world
name: Hello World Pipeline
description: A simple example pipeline

steps:
  - id: greet
    action: generate_text
    parameters:
      prompt: "Say hello to the world in a creative way!"
      
  - id: translate
    action: generate_text
    parameters:
      prompt: "Translate this greeting to Spanish: {{ greet.result }}"
    dependencies: [greet]

outputs:
  greeting: "{{ greet.result }}"
  spanish: "{{ translate.result }}"

  2. Run the pipeline:
# Using the CLI script
python scripts/run_pipeline.py hello_world.yaml

# With inputs
python scripts/run_pipeline.py hello_world.yaml -i name=World -i language=Spanish

# From a JSON file
python scripts/run_pipeline.py hello_world.yaml -f inputs.json -o output_dir/

# Or programmatically
import orchestrator as orc

# Initialize models (auto-detects available models)
orc.init_models()

# Compile and run the pipeline
pipeline = orc.compile("hello_world.yaml")
result = pipeline.run()

print(result)
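
Inputs can also be passed to run() as keyword arguments, mirroring the CLI's -i flags. A minimal sketch, assuming the pipeline declares name and language inputs (hello_world.yaml above declares none, so adapt the names to your own pipeline):

result = pipeline.run(
    name="World",        # maps to the pipeline input "name"
    language="Spanish",  # maps to the pipeline input "language"
)
print(result)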

Using AUTO Tags

Orchestrator's <AUTO> tags let AI decide configuration details:

steps:
  - id: analyze_data
    action: analyze
    parameters:
      data: "{{ input_data }}"
      method: <AUTO>Choose the best analysis method for this data type</AUTO>
      visualization: <AUTO>Decide if we should create a chart</AUTO>

Model Configuration

Configure available models in models.yaml:

models:
  # Local models (via Ollama) - downloaded on first use
  - source: ollama
    name: llama3.1:8b
    expertise: [general, reasoning, multilingual]
    size: 8b
    
  - source: ollama
    name: qwen2.5-coder:7b
    expertise: [code, programming]
    size: 7b

  # Cloud models
  - source: openai
    name: gpt-4o
    expertise: [general, reasoning, code, analysis, vision]
    size: 1760b  # Estimated

defaults:
  expertise_preferences:
    code: qwen2.5-coder:7b
    reasoning: deepseek-r1:8b
    fast: llama3.2:1b

Models are downloaded only when first used, saving disk space and initialization time.
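
Because loading is lazy, initialization stays cheap even with many models configured. A minimal sketch of the flow, using only the calls shown above and assuming models.yaml is picked up from the working directory:

import orchestrator as orc

# Registers the models declared in models.yaml; nothing is downloaded yet
orc.init_models()

# Compilation selects a registered model for each step's requirements
pipeline = orc.compile("hello_world.yaml")

# The first run that invokes a local (Ollama/HuggingFace) model triggers
# its download; cloud models are simply called via their APIs
result = pipeline.run()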

Advanced Example

Here's a more complex example showing model requirements and parallel execution:

id: research_pipeline
name: AI Research Pipeline
description: Research a topic and create a comprehensive report

inputs:
  - name: topic
    type: string
    description: Research topic
    
  - name: depth
    type: string
    default: <AUTO>Determine appropriate research depth</AUTO>

steps:
  # Parallel research from multiple sources
  - id: web_search
    action: search_web
    parameters:
      query: "{{ topic }} latest research 2025"
      count: <AUTO>Decide how many results to fetch</AUTO>
    requires_model:
      expertise: [research, web]
      
  - id: academic_search
    action: search_academic
    parameters:
      query: "{{ topic }}"
      filters: <AUTO>Set appropriate academic filters</AUTO>
    requires_model:
      expertise: [research, academic]
      
  # Analyze findings with specialized model
  - id: analyze_findings
    action: analyze
    parameters:
      web_results: "{{ web_search.results }}"
      academic_results: "{{ academic_search.results }}"
      analysis_focus: <AUTO>Determine key aspects to analyze</AUTO>
    dependencies: [web_search, academic_search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 20b  # Require large model for complex analysis
      
  # Generate report
  - id: write_report
    action: generate_document
    parameters:
      topic: "{{ topic }}"
      analysis: "{{ analyze_findings.result }}"
      style: <AUTO>Choose appropriate writing style</AUTO>
      length: <AUTO>Determine optimal report length</AUTO>
    dependencies: [analyze_findings]
    requires_model:
      expertise: [writing, general]

outputs:
  report: "{{ write_report.document }}"
  summary: "{{ analyze_findings.summary }}"
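
As with the basic example, this pipeline is compiled and run from Python. A sketch, assuming the YAML above is saved as research_pipeline.yaml (depth can be omitted because its <AUTO> default lets the framework choose):

import orchestrator as orc

orc.init_models()
pipeline = orc.compile("research_pipeline.yaml")

# "depth" is optional here thanks to its <AUTO> default
result = pipeline.run(topic="protein folding")

print(result)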

Complete Example: Research Report Generator

Here's a fully functional pipeline that generates research reports:

# research_report.yaml
id: research_report
name: Research Report Generator
description: Generate comprehensive research reports with citations

inputs:
  - name: topic
    type: string
    description: Research topic
  - name: instructions
    type: string
    description: Additional instructions for the report

outputs:
  - pdf: <AUTO>Generate appropriate filename for the research report PDF</AUTO>

steps:
  - id: search
    name: Web Search
    action: search_web
    parameters:
      query: <AUTO>Create effective search query for {topic} with {instructions}</AUTO>
      max_results: 10
    requires_model:
      expertise: fast
      
  - id: compile_notes
    name: Compile Research Notes
    action: generate_text
    parameters:
      prompt: |
        Compile comprehensive research notes from these search results:
        {{ search.results }}
        
        Topic: {{ topic }}
        Instructions: {{ instructions }}
        
        Create detailed notes with:
        - Key findings
        - Important quotes
        - Source citations
        - Relevant statistics
    dependencies: [search]
    requires_model:
      expertise: [analysis, reasoning]
      min_size: 7b
      
  - id: write_report
    name: Write Report
    action: generate_document
    parameters:
      content: |
        Write a comprehensive research report on "{{ topic }}"
        
        Research notes:
        {{ compile_notes.result }}
        
        Requirements:
        - Professional academic style
        - Include introduction, body sections, and conclusion
        - Cite sources properly
        - {{ instructions }}
      format: markdown
    dependencies: [compile_notes]
    requires_model:
      expertise: [writing, general]
      min_size: 20b
      
  - id: create_pdf
    name: Create PDF
    action: convert_to_pdf
    parameters:
      markdown: "{{ write_report.document }}"
      filename: "{{ outputs.pdf }}"
    dependencies: [write_report]

Run it with:

import orchestrator as orc

# Initialize models
orc.init_models()

# Compile pipeline
pipeline = orc.compile("research_report.yaml")

# Run with inputs
result = pipeline.run(
    topic="quantum computing applications in medicine",
    instructions="Focus on recent breakthroughs and future potential"
)

print(f"Report saved to: {result}")

Documentation

Comprehensive documentation is available at orc.readthedocs.io.

Available Models

Orchestrator supports a wide range of models:

Local Models (via Ollama)

  • Gemma3 27B: Google's powerful general-purpose model
  • Llama 3.x: General purpose, multilingual support
  • DeepSeek-R1: Advanced reasoning and coding
  • Qwen2.5-Coder: Specialized for code generation
  • Mistral: Fast and efficient general purpose

Cloud Models

  • OpenAI: GPT-4.1 (latest)
  • Anthropic: Claude Sonnet 4 (claude-sonnet-4-20250514)
  • Google: Gemini 2.5 Flash (gemini-2.5-flash)

HuggingFace Models

  • Mistral 7B Instruct v0.3: High-quality instruction-following model
  • Llama, Qwen, Phi, and many more
  • Automatically downloaded on first use

Requirements

  • Python 3.8+
  • Optional: Ollama for local model execution
  • Optional: API keys for cloud providers (OpenAI, Anthropic, Google); see the sketch below
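
Cloud providers are typically authenticated through environment variables. A hedged sketch that checks for the providers' conventional variable names before initializing (whether Orchestrator reads these exact variables is an assumption; consult the documentation for the authoritative list):

import os

import orchestrator as orc

# Conventional API-key variables for each provider (assumed names)
for var in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY"):
    if not os.environ.get(var):
        print(f"warning: {var} is not set; that provider will be unavailable")

orc.init_models()  # auto-detects available models, per the Basic Usage example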

Contributing

We welcome contributions! Please see our Contributing Guide for details.

Support

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use Orchestrator in your research, please cite:

@software{orchestrator2025,
  title = {Orchestrator: AI Pipeline Orchestration Framework},
  author = {Manning, Jeremy R. and {Contextual Dynamics Lab}},
  year = {2025},
  url = {https://github.com/ContextLab/orchestrator},
  organization = {Dartmouth College}
}

Acknowledgments

Orchestrator is developed and maintained by the Contextual Dynamics Lab at Dartmouth College.


Built with ❤️ by the Contextual Dynamics Lab
