Multi-cloud arbitrage in practice (with a worked example)
Real worked example for a mid-sized SaaS: stateless tier on Hetzner, managed Postgres on AWS RDS, CDN on Cloudflare. About 60% cheaper than single-cloud AWS at list prices, and about 53% cheaper than AWS with a 3-year Savings Plan.
"Multi-cloud" as a strategy gets a lot of grief, mostly because the worst-case interpretation — running every workload on every cloud for portability — is genuinely a waste of money. The version that works is narrower: route each workload to the provider that prices its dominant cost factor most cheaply.
Here's a real worked example from a SaaS I helped re-architect in 2024-2025.
The starting point: pure AWS
The workload:
- Stateless API tier: ~20 EC2 instances (`m6i.large`) behind an ALB
- Background workers: ~30 EC2 instances (`c6i.xlarge`)
- Postgres primary + 2 replicas: RDS `db.r6i.4xlarge`, 2 TB gp3
- Redis: ElastiCache 3-node cluster (`cache.r6g.xlarge`)
- Object storage: 50 TB in S3
- Egress: ~30 TB/month (lots of image-heavy responses)
- Data transfer between services and AZs: hard to measure directly, but substantial
Monthly AWS bill, list prices:
| Component | Monthly USD |
|---|---|
| EC2 API tier (20 × m6i.large) | $1,400 |
| EC2 workers (30 × c6i.xlarge) | $3,720 |
| EBS (gp3 + io2 for DB) | $1,100 |
| RDS r6i.4xlarge primary + 2 replicas (multi-AZ) | $2,800 |
| ElastiCache 3 × cache.r6g.xlarge | $580 |
| NAT gateway × 3 AZs + data processing | $220 |
| ALB + WAF | $60 |
| S3 storage (50 TB) | $1,150 |
| Egress (30 TB) | $2,500 |
| CloudWatch Logs + monitoring | $450 |
| Inter-AZ traffic | $180 |
| Total | $14,160 |
With a 3-year Compute Savings Plan covering compute (about 27% off): about $11,800/month.
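To see which line items dominate the list-price bill before any discount, the table can be totalled and ranked in a few lines. This is a minimal sketch: the figures are copied from the table above, and the dictionary keys and the `ranked_shares` helper are names chosen here for illustration.

```python
# Line items copied from the list-price table above (USD per month).
aws_list_bill = {
    "EC2 API tier": 1400,
    "EC2 workers": 3720,
    "EBS": 1100,
    "RDS": 2800,
    "ElastiCache": 580,
    "NAT gateway": 220,
    "ALB + WAF": 60,
    "S3 storage": 1150,
    "Egress": 2500,
    "CloudWatch": 450,
    "Inter-AZ traffic": 180,
}


def ranked_shares(bill):
    """Return (name, usd, share_of_total) tuples, largest line item first."""
    total = sum(bill.values())
    rows = [(name, usd, usd / total) for name, usd in bill.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, usd, share in ranked_shares(aws_list_bill):
        print(f"{name:<18} ${usd:>6,} {share:6.1%}")
```

Workers, RDS, and egress come out on top and together make up a little under two thirds of the bill, which is why the rest of the article concentrates on compute and bandwidth.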
The arbitrage architecture
The principle: keep the painful-to-replace pieces on AWS (managed Postgres, IAM, Secrets Manager). Move bandwidth-heavy and compute-bulk pieces to a cheaper provider. Put a CDN in front of egress.
What moved to Hetzner
- API tier: 8 × `CCX13` (4 vCPU AMD EPYC, 16 GB RAM, dedicated) = €54.40/month total
- Background workers: 12 × `CCX23` (8 vCPU, 32 GB RAM) = €270/month total
- Both run k3s with a small control plane on 3 × `CPX31` in HA = €43.50/month
The workers do fewer concurrent jobs per instance but the dedicated EPYC cores are faster per-vCPU than the shared c6i.xlarge, so total throughput roughly matches.
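A rough way to sanity-check "total throughput roughly matches" is to compare vCPU counts before and after. The sketch below assumes throughput scales linearly with vCPU count and that the AWS fleet was fully used; both are simplifications. The AWS vCPU counts are the published instance specs (2 for `m6i.large`, 4 for `c6i.xlarge`), the Hetzner counts are the ones quoted above, and `required_speedup` is a helper name invented for this example.

```python
def required_speedup(old_count, old_vcpu, new_count, new_vcpu):
    """Per-vCPU speedup the new fleet needs to match the old fleet's total vCPU capacity."""
    return (old_count * old_vcpu) / (new_count * new_vcpu)


# c6i.xlarge (4 vCPU) x 30  ->  CCX23 (8 vCPU, as quoted above) x 12
workers = required_speedup(old_count=30, old_vcpu=4, new_count=12, new_vcpu=8)

# m6i.large (2 vCPU) x 20  ->  CCX13 (4 vCPU, as quoted above) x 8
api = required_speedup(old_count=20, old_vcpu=2, new_count=8, new_vcpu=4)

print(f"workers need {workers:.2f}x per vCPU, API tier needs {api:.2f}x")
# prints "workers need 1.25x per vCPU, API tier needs 1.25x"
```

Under those assumptions both tiers need each Hetzner vCPU to do about 25% more work than an AWS vCPU. If the original fleet had idle headroom, the real requirement is lower.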
What stayed on AWS
- RDS Postgres primary + 2 replicas. Painful to self-host with the same operational guarantees.
- ElastiCache. Could move to self-hosted Redis, but the team explicitly chose to avoid running stateful infra.
- Secrets Manager + IAM + SES (transactional email).
- S3 for archival data and large file uploads.
What moved to Cloudflare
- CDN for all public traffic (free at this volume, Pro plan $20/month for additional features).
- R2 for hot static assets (zero egress, $0.015/GB-month) — about 5 TB of frequently-accessed assets moved out of S3.
- Workers for edge auth, simple API routes (~$5/month).
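The storage side of the R2 move is simple arithmetic. As a hedged sketch: the S3 per-GB rate below is not quoted anywhere in the article, it is only implied by the baseline table ($1,150 for 50 TB), and the tables appear to use decimal terabytes (1 TB = 1,000 GB), which is assumed here. `storage_cost` is an illustrative helper.

```python
GB_PER_TB = 1000  # assumption: decimal TB, which reproduces the article's $75 for 5 TB on R2

R2_USD_PER_GB_MONTH = 0.015                     # quoted in the list above
S3_USD_PER_GB_MONTH = 1150 / (50 * GB_PER_TB)   # implied by the baseline table, not a quoted price


def storage_cost(tb, usd_per_gb_month):
    """Monthly storage cost in USD for `tb` terabytes at a flat per-GB rate."""
    return tb * GB_PER_TB * usd_per_gb_month


hot_tb = 5
print(f"5 TB on S3: ${storage_cost(hot_tb, S3_USD_PER_GB_MONTH):,.0f}/month")
print(f"5 TB on R2: ${storage_cost(hot_tb, R2_USD_PER_GB_MONTH):,.0f}/month")
print(f"45 TB left on S3: ${storage_cost(45, S3_USD_PER_GB_MONTH):,.0f}/month")
```

The storage saving on its own is modest. The larger effect is that reads of those hot assets no longer incur AWS egress charges.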
How the two clouds talk
Tailscale subnet routers on both sides. Hetzner workers reach RDS via Tailscale, with cross-cloud latency around 5-8 ms (Hetzner Falkenstein to AWS Frankfurt). For high-throughput batch jobs that need lots of data from RDS, we use a read replica on AWS and the workers reach that.
Cross-cloud egress: Hetzner-to-AWS isn't free, but it's a small fraction of total egress. The bulk of public-facing egress is now served from Cloudflare's edge with no charge.
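Before trusting a latency figure like the 5-8 ms above, it is worth measuring it from a worker node. The following is a minimal sketch using only the standard library. It times TCP connection setup, which approximates one network round trip and is not the same as query latency. The function name and defaults are choices made for this example, not part of the original setup.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, samples=5, timeout=2.0):
    """Median time in milliseconds to open a TCP connection to host:port."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection performs the TCP handshake; we close immediately.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

Run from a Hetzner worker against the database endpoint as reachable over the tunnel, for example `tcp_connect_ms("db.internal.example", 5432)` (the hostname is a placeholder), this gives a quick baseline to compare against when debugging cross-cloud slowness.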
The new bill
| Component | Monthly USD |
|---|---|
| Hetzner API tier (8 × CCX13) | $59 |
| Hetzner workers (12 × CCX23) | $293 |
| Hetzner k3s control plane | $48 |
| Hetzner load balancers (2) | $12 |
| RDS r6i.4xlarge primary + 2 replicas | $2,800 |
| ElastiCache 3 × cache.r6g.xlarge | $580 |
| S3 storage (45 TB after moving 5 TB to R2) | $1,035 |
| R2 storage (5 TB) | $75 |
| AWS egress (now only DB replication + cross-cloud) | $300 |
| Cloudflare CDN + Workers | $25 |
| Tailscale (team plan) | $60 |
| Reduced CloudWatch (less compute on AWS) | $120 |
| NAT + ALB (much smaller scope) | $80 |
| Total | $5,487 |
From $11,800/month (AWS with 3-year Savings Plan) to $5,487/month. About 53% reduction. Annual saving: ~$76,000.
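The headline numbers follow directly from the two totals. A short sketch of the arithmetic, using only figures from the tables above (the helper names are illustrative):

```python
AWS_LIST = 14_160          # pure AWS, list prices (USD/month)
AWS_SAVINGS_PLAN = 11_800  # pure AWS with the 3-year Savings Plan
SPLIT = 5_487              # Hetzner + AWS + Cloudflare


def reduction(before, after):
    """Fractional cost reduction going from `before` to `after`."""
    return (before - after) / before


def annual_saving(before, after):
    """Yearly saving in USD given two monthly totals."""
    return (before - after) * 12


print(f"vs Savings Plan: {reduction(AWS_SAVINGS_PLAN, SPLIT):.1%}, "
      f"${annual_saving(AWS_SAVINGS_PLAN, SPLIT):,}/year")
print(f"vs list prices:  {reduction(AWS_LIST, SPLIT):.1%}")
```

Against the Savings Plan baseline this gives roughly 53% and about $76,000 a year. Against list prices the reduction is roughly 61%.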
What it cost to get there
- ~6 engineer-weeks of work to set up Hetzner infra (Terraform, k3s, monitoring).
- ~2 weeks to migrate stateless services, including building multi-arch container images.
- ~1 week of Cloudflare CDN + R2 cutover.
- Some operational learning curve: handling Hetzner outages (rare but they happen), debugging cross-cloud network issues.
Payback: about 5 weeks of savings to cover the engineering cost. After that, pure margin.
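Payback depends on what an engineer-week costs, which the article does not state. To plug in your own rate, a sketch like the following works. The effort and saving figures come from the text above, `cost_per_engineer_week_usd` is deliberately left as a required input, and `payback_weeks` is a hypothetical helper.

```python
ENGINEER_WEEKS = 6 + 2 + 1            # Hetzner setup + service migration + Cloudflare cutover
MONTHLY_SAVING_USD = 11_800 - 5_487   # Savings Plan baseline minus the new bill


def payback_weeks(engineer_weeks, cost_per_engineer_week_usd, monthly_saving_usd):
    """Weeks of savings needed to cover a one-off engineering cost."""
    weekly_saving = monthly_saving_usd * 12 / 52
    return engineer_weeks * cost_per_engineer_week_usd / weekly_saving
```

Call it as `payback_weeks(ENGINEER_WEEKS, your_rate, MONTHLY_SAVING_USD)` with your own loaded cost per engineer-week. The model ignores the ongoing operational overhead mentioned in the list above, so treat the result as a lower bound.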
What we'd do differently
- Self-host Postgres earlier. RDS at $2,800/month is dominant in the new bill. A `CCX43` (16 vCPU, 64 GB RAM, 480 GB NVMe) on Hetzner runs €107/month. With a sensible replica setup and Barman for PITR, full Postgres HA for <$200/month all-in is achievable. The team's hesitation was operational, not financial.
- Move Redis to self-hosted earlier. Same reasoning.
- Don't try to make this work below ~$1500/month total spend. The engineering overhead isn't worth it for small workloads.
When NOT to do this
- You're regulated and need every component on a single audited provider.
- Your team is small (<5 engineers) and you genuinely don't have the bandwidth for cross-cloud ops.
- Your total spend is <$2K/month — the engineering cost dominates the savings.
- You're heavily invested in cloud-specific services (Lambda, Step Functions, DynamoDB) and migration cost is large.
The general pattern
Bulk-compute and bandwidth-heavy workloads → cheap provider. Stateful managed services → premium provider where the operational maturity is worth the money. Edge / CDN → Cloudflare. Savings of 50-60% show up reliably when the workload shape matches.
To model your own: pick the candidate split, plug each component into the TCO calculator, and compare the result against a pure-hyperscaler baseline (see AWS vs Hetzner). The crossover happens earlier than most people expect.