The following cloud hosting services offer resources for AI inference and fine-tuning:

Service | Focus | GPU Types | Pricing / Scaling | Additional Features |
---|---|---|---|---|
Banana.dev | High-throughput inference | Not specified | Scales to zero | Deploys from GitHub repo |
Fal.ai | Inference, image and audio generation | A100s, A10Gs, T4s | Scales to zero | Playgrounds and shared endpoints |
Replicate.com | Text, image, music, speech generation | A100s, A40s, T4s | Scales to zero | Preconfigured models and playgrounds |
Lepton.ai | Hosting and inference via CLI | Not specified | Scales to zero | Deploys Hugging Face models |
TitanML.co | CLI and enterprise inference deployments | Not specified | Not specified | Not specified |
Anyscale.com | Inference and fine-tuning | A10s, A100s, H100s (coming soon) | Not specified | Shared endpoints for various models |
Together.ai | Inference and fine-tuning | A40s, A100s, H100s | Scales to zero | 68 shared endpoints and playgrounds |
Brev.dev | Hosting and CLI tools | A100s, H100s, others | Varies; free when using an existing cloud account | Deploys to AWS and GCP |
Gradient.ai | Fine-tuning via CLI/Python/Node.js | Not specified | Not specified | Online console for job management |
Fireworks.ai | CLI inference platform | Not specified | Varies | Shared endpoints, supports LoRA add-ons |
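
To shortlist services from the table by GPU type and scale-to-zero billing, the rows can be encoded as data and filtered. The snippet below is only a sketch: the `Service` class and `shortlist` function are hypothetical names introduced for illustration, the values are transcribed from the table rather than fetched from any provider API, and GPUs marked as upcoming or "others" are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """One table row (hypothetical structure, for illustration only)."""
    name: str
    gpus: tuple[str, ...]   # empty tuple: the table says "Not specified"
    scales_to_zero: bool    # True only if the table lists "Scales to zero"


# Transcribed from the table above. False means the table does not list
# scale-to-zero, not that the service is known to lack it.
SERVICES = [
    Service("Banana.dev", (), True),
    Service("Fal.ai", ("A100", "A10G", "T4"), True),
    Service("Replicate.com", ("A100", "A40", "T4"), True),
    Service("Lepton.ai", (), True),
    Service("TitanML.co", (), False),
    Service("Anyscale.com", ("A10", "A100"), False),  # upcoming H100s omitted
    Service("Together.ai", ("A40", "A100", "H100"), True),
    Service("Brev.dev", ("A100", "H100"), False),     # "others" omitted
    Service("Gradient.ai", (), False),
    Service("Fireworks.ai", (), False),
]


def shortlist(gpu: str, require_scale_to_zero: bool = True) -> list[str]:
    """Return names of services that list `gpu`, in table order."""
    return [
        s.name
        for s in SERVICES
        if gpu in s.gpus and (s.scales_to_zero or not require_scale_to_zero)
    ]


if __name__ == "__main__":
    print(shortlist("A100"))
    print(shortlist("H100", require_scale_to_zero=False))
```

Services whose GPU types are not specified in the table never match a GPU filter here, so a shortlist built this way is a starting point for checking each provider's current documentation, not a definitive answer.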