Backend Synchronization Guide
This guide covers the bidirectional synchronization capabilities between the Cloudflare and SQLite-vec backends in MCP Memory Service. These tools enable hybrid deployment strategies that combine the speed of local storage with the global availability of cloud storage.
┌─────────────────────────────────────────┐
│ MCP Memory Service │
│ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ SQLite-vec │←→│ Sync Engine │ │
│ │ (Local) │ │ │ │
│ └─────────────┘ └─────────────┘ │
│ ↑ ↓ │
│ │ │ │
│ Fast Access Bidirectional │
│ (5ms reads) Sync │
│ │ │ │
│ ↓ ↓ │
│ ┌─────────────────────────────────┐ │
│ │ Cloudflare Backend │ │
│ │ (D1 + Vectorize + R2) │ │
│ │ Global Distribution │ │
│ └─────────────────────────────────┘ │
└─────────────────────────────────────────┘
Cloud-primary with local backup:
- Primary: Cloudflare for global access
- Backup: Local SQLite-vec for offline capability
- Benefit: Resilient, always-available memory service

Development-to-production workflow:
- Development: Local SQLite-vec for fast iteration
- Production: Cloudflare for scalability
- Benefit: Seamless dev→prod workflow

Multi-device sync:
- Machine A: Local work with periodic sync
- Machine B: Pull shared memories from cloud
- Benefit: Consistent memory across devices

Disaster recovery:
- Regular backups: Automated Cloudflare→SQLite sync
- Recovery: Quick restore from local backup
- Benefit: Zero data loss, minimal downtime
Both backends must be configured:
- SQLite-vec: Default local storage
- Cloudflare: Requires API token and resource IDs
Environment files:
# .env (Cloudflare configuration)
CLOUDFLARE_API_TOKEN=your-token
CLOUDFLARE_ACCOUNT_ID=your-account
CLOUDFLARE_D1_DATABASE_ID=your-d1-id
CLOUDFLARE_VECTORIZE_INDEX=your-index
MCP_MEMORY_STORAGE_BACKEND=cloudflare

# .env.sqlite (SQLite configuration)
MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
MCP_MEMORY_SQLITE_PATH=/path/to/sqlite_vec.db
The sync utilities are located in the scripts/ directory:
# Navigate to project root
cd mcp-memory-service
# Verify sync tools are present
ls scripts/sync_memory_backends.py
ls scripts/claude_sync_commands.py
ls scripts/memory_service_manager.sh
# Using main sync script
python scripts/sync_memory_backends.py --status
# Using convenience wrapper
python scripts/claude_sync_commands.py status
# Example output:
# === Memory Sync Status ===
# Cloudflare memories: 750
# SQLite-vec memories: 745
# Cloudflare configured: True
# SQLite-vec file exists: True
# Last check: 2024-01-15T10:30:00
Always preview before syncing:
# See what would be synced
python scripts/sync_memory_backends.py --dry-run
# Output shows:
# - Memories to add
# - Memories to skip (duplicates)
# - No actual changes made
# Backup cloud memories to local
python scripts/sync_memory_backends.py --direction cf-to-sqlite
# Or using wrapper
python scripts/claude_sync_commands.py backup
# Restore local memories to cloud
python scripts/sync_memory_backends.py --direction sqlite-to-cf
# Or using wrapper
python scripts/claude_sync_commands.py restore
# Sync both directions (merge)
python scripts/sync_memory_backends.py --direction bidirectional
# Or using wrapper
python scripts/claude_sync_commands.py sync
The memory_service_manager.sh script provides comprehensive service management:
# Start with specific backend
./scripts/memory_service_manager.sh start-cloudflare
./scripts/memory_service_manager.sh start-sqlite
# Check status
./scripts/memory_service_manager.sh status
# Integrated sync operations
./scripts/memory_service_manager.sh sync-backup
./scripts/memory_service_manager.sh sync-restore
./scripts/memory_service_manager.sh sync-both
# Stop service
./scripts/memory_service_manager.sh stop
Set up automated daily backups:
# Edit crontab
crontab -e
# Add daily backup at 2 AM
0 2 * * * cd /path/to/mcp-memory-service && python scripts/sync_memory_backends.py --direction cf-to-sqlite >> /var/log/memory-sync.log 2>&1
# Add hourly bidirectional sync
0 * * * * cd /path/to/mcp-memory-service && python scripts/sync_memory_backends.py --direction bidirectional >> /var/log/memory-sync.log 2>&1
# Specify custom SQLite path
python scripts/sync_memory_backends.py --sqlite-path /custom/path/backup.db --status
# Verbose logging
python scripts/sync_memory_backends.py --verbose --direction bidirectional
The sync engine uses content-based hashing to prevent duplicates (a minimal sketch follows this list):
- Content Hash: SHA256 hash of content + metadata
- Comparison: Check both backends for existing hashes
- Skip Logic: Skip if hash exists in target backend
- Metadata Preservation: All tags, timestamps preserved
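
The following is a minimal Python sketch of the dedup check, assuming a memory record with content and metadata fields; the names and helpers are illustrative, not the actual sync_memory_backends.py API:

import hashlib
import json

def content_hash(content: str, metadata: dict) -> str:
    """SHA256 over content plus canonicalized metadata (illustrative)."""
    canonical = json.dumps(metadata, sort_keys=True)
    return hashlib.sha256((content + canonical).encode("utf-8")).hexdigest()

def plan_sync(source_memories: list, target_hashes: set) -> list:
    """Keep only memories whose hash is absent from the target backend."""
    to_add = []
    for mem in source_memories:
        h = content_hash(mem["content"], mem.get("metadata", {}))
        if h not in target_hashes:
            to_add.append(mem)  # new memory: copied with tags and timestamps intact
        # else: hash already exists in target, so the memory is skipped
    return to_add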
Conflict resolution:
- No conflicts: Different memories are merged
- Duplicates: Skipped based on content hash
- Timestamps: Preserved from original creation
- Tags: All tags maintained
Performance:
- Batch Processing: Memories synced in batches (see the sketch below)
- Caching: Embedding cache for efficiency
- Network Optimization: Minimal API calls
- Large Datasets: Handles 1000+ memories efficiently
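
To illustrate the batching approach, a simplified sync loop might look like this; the batch size of 100 and the target.upsert() call are assumptions for the sketch, not the script's actual interface:

def batched(items: list, size: int = 100):
    """Yield fixed-size slices so large datasets never go through one API call."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def sync_batches(memories: list, target) -> None:
    """Push memories to the target backend one batch at a time."""
    for batch in batched(memories):
        target.upsert(batch)  # hypothetical backend client method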
Before syncing, validate your configuration:
# Run configuration validator
python scripts/validate_config.py
# Example output:
# 🔍 MCP Memory Service Configuration Validation
# ==================================================
#
# 1. Environment Configuration Check:
# ✅ .env file has Cloudflare backend configured
#
# 2. Claude Code Global Configuration Check:
# ✅ Found 1 Cloudflare memory configurations
#
# 3. Project-Level Configuration Check:
# ✅ No local .mcp.json found (good - using global configuration)
#
# 4. Cloudflare Credentials Check:
# ✅ All required Cloudflare environment variables found in .env
#
# 🎉 Configuration validation PASSED!
Problem: One backend not initialized.
Solution:
# Initialize both backends
MCP_MEMORY_STORAGE_BACKEND=cloudflare uv run memory server &
sleep 5 && kill %1
MCP_MEMORY_STORAGE_BACKEND=sqlite_vec uv run memory server &
sleep 5 && kill %1
Problem: Invalid Cloudflare credentials.
Solution:
# Verify credentials in .env
grep CLOUDFLARE .env
# Test with curl
curl -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/d1/database
Problem: Content hash mismatch.
Solution:
# Run deduplication on SQLite
python scripts/find_duplicates.py --execute
# Verify sync status
python scripts/sync_memory_backends.py --status
Problem: Large dataset without optimization.
Solution:
# Use verbose mode to identify bottlenecks
python scripts/sync_memory_backends.py --verbose --dry-run
# Consider splitting large syncs
# First sync recent memories only
Sync schedule:
- Daily Cloudflare → SQLite backup
- Weekly full bidirectional sync
- Monthly verification of sync integrity

Safety:
- Always run --dry-run first
- Check sync status before major operations
- Validate configuration regularly

Monitoring:
- Log sync operations
- Monitor memory counts
- Track sync duration trends

Testing:
- Test sync on development data first
- Verify metadata preservation
- Confirm search functionality post-sync
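
Sync can also be automated from CI. A scheduled GitHub Actions workflow like the one below runs a bidirectional sync every six hours: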
name: Memory Sync

on:
  schedule:
    - cron: '0 */6 * * *'  # Every 6 hours
  workflow_dispatch:

jobs:
  sync:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Install dependencies
        run: |
          pip install uv
          uv pip install -e .
      - name: Sync memories
        env:
          CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
          CLOUDFLARE_ACCOUNT_ID: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
          CLOUDFLARE_D1_DATABASE_ID: ${{ secrets.CLOUDFLARE_D1_DATABASE_ID }}
          CLOUDFLARE_VECTORIZE_INDEX: ${{ secrets.CLOUDFLARE_VECTORIZE_INDEX }}
        run: |
          python scripts/sync_memory_backends.py --direction bidirectional
Planned improvements for sync functionality:
- Selective Sync: Sync by tags or time ranges
- Incremental Sync: Only sync changes since last run
- Conflict Resolution UI: Interactive merge tool
- Multi-Backend Support: Sync with ChromaDB, PostgreSQL
- Compression: Reduce bandwidth for large syncs
- Encryption: End-to-end encryption for sensitive memories
For issues with sync functionality:
- Check configuration with python scripts/validate_config.py
- Review logs in /tmp/memory-sync.log
- Open an issue with sync debug output
- Join discussions for community support
Last updated: January 2025
Sync tools version: 1.0.0