Cloud cost engineering

Multi-cloud arbitrage in practice (with a worked example)

Real worked example for a mid-sized SaaS: stateless tier on Hetzner, managed Postgres on AWS RDS, CDN on Cloudflare. About 60% cheaper than single-cloud AWS at list prices, and about 53% cheaper than AWS with a 3-year Savings Plan.

cloudprice editorial

"Multi-cloud" as a strategy gets a lot of grief, mostly because the worst-case interpretation — running every workload on every cloud for portability — is genuinely a waste of money. The version that works is narrower: route each workload to the provider that prices its dominant cost factor most cheaply.

Here's a real worked example from a SaaS I helped re-architect in 2024-2025.

The starting point: pure AWS

The workload:

  • Stateless API tier: ~20 EC2 instances (m6i.large) behind an ALB
  • Background workers: ~30 EC2 instances (c6i.xlarge)
  • Postgres primary + 2 replicas: RDS db.r6i.4xlarge, 2 TB gp3
  • Redis: ElastiCache 3-node cluster (cache.r6g.xlarge)
  • Object storage: 50 TB in S3
  • Egress: ~30 TB/month (lots of image-heavy responses)
  • Data transfer between services + AZs: hard to measure, plenty

Monthly AWS bill, list prices:

| Component | Monthly USD |
|---|---|
| EC2 API tier (20 × m6i.large) | $1,400 |
| EC2 workers (30 × c6i.xlarge) | $3,720 |
| EBS (gp3 + io2 for DB) | $1,100 |
| RDS r6i.4xlarge primary + 2 replicas (multi-AZ) | $2,800 |
| ElastiCache 3 × cache.r6g.xlarge | $580 |
| NAT gateway × 3 AZs + data processing | $220 |
| ALB + WAF | $60 |
| S3 storage (50 TB) | $1,150 |
| Egress (30 TB) | $2,500 |
| CloudWatch Logs + monitoring | $450 |
| Inter-AZ traffic | $180 |
| Total | $14,160 |
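To see which lines dominate, it helps to rank the components by their share of the total. A minimal sketch using the list-price figures from the table; the shortened component names and the `cost_shares` helper are illustrative, not part of any tool:

```python
# Monthly list-price figures copied from the table above (USD).
aws_bill = {
    "EC2 API tier": 1400,
    "EC2 workers": 3720,
    "EBS": 1100,
    "RDS": 2800,
    "ElastiCache": 580,
    "NAT gateway": 220,
    "ALB + WAF": 60,
    "S3 storage": 1150,
    "Egress": 2500,
    "CloudWatch": 450,
    "Inter-AZ traffic": 180,
}


def cost_shares(bill):
    """Return (component, usd, share_of_total) tuples, largest first."""
    total = sum(bill.values())
    ranked = sorted(bill.items(), key=lambda item: item[1], reverse=True)
    return [(name, usd, usd / total) for name, usd in ranked]


total = sum(aws_bill.values())
print(f"Total: ${total:,}")
for name, usd, share in cost_shares(aws_bill)[:4]:
    print(f"{name:<14} ${usd:>6,}  {share:6.1%}")
```

The four largest lines are the worker fleet, RDS, egress, and the API tier. EC2 plus egress is a little over half the bill, which is why those are the pieces the rest of the article moves.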

With a 3-year Compute Savings Plan covering compute (about 27% off): about $11,800/month.

The arbitrage architecture

The principle: keep the painful-to-replace pieces on AWS (managed Postgres, IAM, Secrets Manager). Move bandwidth-heavy and compute-bulk pieces to a cheaper provider. Put a CDN in front of egress.

What moved to Hetzner

  • API tier: 8 × CCX13 (4 vCPU AMD EPYC, 16 GB RAM, dedicated) = €54.40/month total
  • Background workers: 12 × CCX23 (8 vCPU, 32 GB RAM) = €270/month total
  • Both run k3s with a small control plane on 3 × CPX31 in HA = €43.50/month

The Hetzner worker fleet has fewer vCPUs in total (12 × 8 = 96 versus 30 × 4 = 120 on AWS), but the dedicated EPYC cores are faster per vCPU than the c6i.xlarge ones, so total throughput roughly matches.
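The throughput claim can be sanity-checked by asking how much faster each Hetzner vCPU has to be for the smaller fleet to keep up. A rough sketch, assuming throughput scales linearly with vCPU count (a simplification) and using the vCPU counts quoted in this article for the Hetzner plans; the `Fleet` class and `required_speedup` function are illustrative names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fleet:
    instances: int
    vcpus_per_instance: int

    @property
    def total_vcpus(self) -> int:
        return self.instances * self.vcpus_per_instance


def required_speedup(old: Fleet, new: Fleet) -> float:
    """Per-vCPU speedup the new fleet needs to match the old fleet,
    assuming throughput is proportional to total vCPUs."""
    return old.total_vcpus / new.total_vcpus


# m6i.large has 2 vCPUs and c6i.xlarge has 4 (AWS instance specs).
# CCX13 = 4 vCPUs and CCX23 = 8 vCPUs are the figures quoted above.
api = required_speedup(Fleet(20, 2), Fleet(8, 4))
workers = required_speedup(Fleet(30, 4), Fleet(12, 8))
print(f"API tier needs {api:.2f}x per vCPU, workers need {workers:.2f}x")
```

Under these assumptions both tiers need about 1.25x per-vCPU performance to break even, so it is worth benchmarking your own workload before committing to the smaller instance counts.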

What stayed on AWS

  • RDS Postgres primary + 2 replicas. Painful to self-host with the same operational guarantees.
  • ElastiCache. Could move to self-hosted Redis, but the team explicitly chose to avoid running stateful infra.
  • Secrets Manager + IAM + SES (transactional email).
  • S3 for archival data and large file uploads.

What moved to Cloudflare

  • CDN for all public traffic (free at this volume, Pro plan $20/month for additional features).
  • R2 for hot static assets (zero egress, $0.015/GB-month) — about 5 TB of frequently-accessed assets moved out of S3.
  • Workers for edge auth, simple API routes (~$5/month).
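The R2 move pays off mainly through egress rather than storage. A small sketch of the comparison, using per-GB rates implied by the AWS table earlier ($1,150 for 50 TB stored, $2,500 for 30 TB of egress) and the R2 storage rate quoted above. The amount of egress attributable to the hot assets is not given in the article, so `HOT_EGRESS_TB` is a hypothetical placeholder, and request charges are ignored:

```python
GB_PER_TB = 1000  # decimal terabytes, matching the article's round numbers

# Rates implied by the article's figures (USD per GB).
S3_STORAGE_PER_GB = 1150 / 50_000   # about 0.023 per GB-month
AWS_EGRESS_PER_GB = 2500 / 30_000   # about 0.083 per GB
R2_STORAGE_PER_GB = 0.015
R2_EGRESS_PER_GB = 0.0


def monthly_cost(stored_tb, egress_tb, storage_rate, egress_rate):
    """Storage plus egress cost in USD per month (request fees ignored)."""
    return (stored_tb * storage_rate + egress_tb * egress_rate) * GB_PER_TB


HOT_TB = 5           # hot assets moved to R2, from the article
HOT_EGRESS_TB = 10   # hypothetical: egress served from those assets

s3 = monthly_cost(HOT_TB, HOT_EGRESS_TB, S3_STORAGE_PER_GB, AWS_EGRESS_PER_GB)
r2 = monthly_cost(HOT_TB, HOT_EGRESS_TB, R2_STORAGE_PER_GB, R2_EGRESS_PER_GB)
print(f"S3: ${s3:,.0f}/month, R2: ${r2:,.0f}/month")
```

With zero egress the storage-only gap is small (about $115 versus $75 for 5 TB); the difference grows with every terabyte served.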

How the two clouds talk

Tailscale subnet routers on both sides. Hetzner workers reach RDS via Tailscale, with cross-cloud latency around 5-8 ms (Hetzner Falkenstein to AWS Frankfurt). For high-throughput batch jobs that need lots of data from RDS, we use a read replica on AWS and the workers reach that.

Cross-cloud egress: Hetzner-to-AWS isn't free, but it's a small fraction of total egress. The bulk of public-facing egress is now served from Cloudflare's edge with no charge.
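Cross-cloud latency matters most for code paths that issue many queries one after another. A back-of-the-envelope sketch, assuming the 5-8 ms figure is a round trip and that each sequential query costs one round trip; the function name is illustrative:

```python
def added_latency_ms(sequential_queries: int, rtt_ms: float) -> float:
    """Extra wall-clock time from cross-cloud hops, assuming one round
    trip per query and no pipelining or batching."""
    return sequential_queries * rtt_ms


for queries in (1, 5, 20):
    low = added_latency_ms(queries, 5)
    high = added_latency_ms(queries, 8)
    print(f"{queries:>2} sequential queries: +{low:.0f} to +{high:.0f} ms")
```

A handler that makes 20 sequential queries would pick up 100 to 160 ms under these assumptions, which is the kind of path worth batching or pointing at a closer replica.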

The new bill

| Component | Monthly USD |
|---|---|
| Hetzner API tier (8 × CCX13) | $59 |
| Hetzner workers (12 × CCX23) | $293 |
| Hetzner k3s control plane | $48 |
| Hetzner load balancers (2) | $12 |
| RDS r6i.4xlarge primary + 2 replicas | $2,800 |
| ElastiCache 3 × cache.r6g.xlarge | $580 |
| S3 storage (45 TB after moving 5 TB to R2) | $1,035 |
| R2 storage (5 TB) | $75 |
| AWS egress (now only DB replication + cross-cloud) | $300 |
| Cloudflare CDN + Workers | $25 |
| Tailscale (team plan) | $60 |
| Reduced CloudWatch (less compute on AWS) | $120 |
| NAT + ALB (much smaller scope) | $80 |
| Total | $5,487 |

From $11,800/month (AWS with 3-year Savings Plan) to $5,487/month. That is about a 53% reduction, or about 61% against the $14,160 list-price bill. Annual saving against the Savings Plan baseline: ~$76,000.
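The arithmetic behind these figures, as a short sketch. The component names are shortened from the table, and the two baselines are the list-price and Savings Plan totals quoted earlier:

```python
# New monthly bill, copied from the table above (USD).
new_bill = {
    "Hetzner API tier": 59,
    "Hetzner workers": 293,
    "Hetzner k3s control plane": 48,
    "Hetzner load balancers": 12,
    "RDS": 2800,
    "ElastiCache": 580,
    "S3 storage": 1035,
    "R2 storage": 75,
    "AWS egress": 300,
    "Cloudflare CDN + Workers": 25,
    "Tailscale": 60,
    "CloudWatch": 120,
    "NAT + ALB": 80,
}

LIST_PRICE = 14_160         # pure AWS at list prices
WITH_SAVINGS_PLAN = 11_800  # pure AWS with the 3-year Savings Plan


def reduction(before: float, after: float) -> float:
    """Fractional saving when moving from `before` to `after`."""
    return (before - after) / before


new_total = sum(new_bill.values())
annual_saving = (WITH_SAVINGS_PLAN - new_total) * 12

print(f"New total:       ${new_total:,}")
print(f"vs Savings Plan: {reduction(WITH_SAVINGS_PLAN, new_total):.2%}")
print(f"vs list price:   {reduction(LIST_PRICE, new_total):.2%}")
print(f"Annual saving:   ${annual_saving:,}")
```

Which baseline you quote matters: the headline percentage is larger against list prices than against a bill that already has a commitment discount applied.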

What it cost to get there

  • ~6 engineer-weeks of work to set up Hetzner infra (Terraform, k3s, monitoring).
  • ~2 weeks to migrate stateless services, including building multi-arch container images.
  • ~1 week of Cloudflare CDN + R2 cutover.
  • Some operational learning curve: handling Hetzner outages (rare but they happen), debugging cross-cloud network issues.

Payback: about 5 weeks of savings to cover the engineering cost. After that, pure margin.
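Payback is sensitive to what an engineer-week costs, which the article does not state. A minimal sketch for plugging in your own number, treating the three effort items above as 9 engineer-weeks in total (an assumption, since the last two may be calendar weeks) and using hypothetical loaded rates:

```python
def payback_weeks(engineer_weeks, cost_per_engineer_week, monthly_saving):
    """Weeks of savings needed to cover a one-off engineering cost."""
    weekly_saving = monthly_saving * 12 / 52
    return engineer_weeks * cost_per_engineer_week / weekly_saving


ENGINEER_WEEKS = 6 + 2 + 1          # setup + migration + CDN/R2 cutover
MONTHLY_SAVING = 11_800 - 5_487     # Savings Plan baseline minus new bill

# Hypothetical loaded costs per engineer-week in USD; substitute your own.
for rate in (1_000, 2_500, 4_000):
    weeks = payback_weeks(ENGINEER_WEEKS, rate, MONTHLY_SAVING)
    print(f"${rate:,}/engineer-week -> payback in {weeks:.1f} weeks")
```

The result scales linearly with the rate, so doubling the assumed cost of an engineer-week doubles the payback period.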

What we'd do differently

  • Self-host Postgres earlier. RDS at $2,800/month is the largest line in the new bill. A CCX43 (16 vCPU, 64 GB RAM, 480 GB NVMe) on Hetzner runs €107/month, so a primary plus one replica is about €214/month before extra storage volumes. With Barman for PITR, full Postgres HA for a few hundred dollars a month all-in is achievable. The team's hesitation was operational, not financial.
  • Move Redis to self-hosted earlier. Same reasoning.
  • Don't try to make this work below ~$1500/month total spend. The engineering overhead isn't worth it for small workloads.

When NOT to do this

  • You're regulated and need every component on a single audited provider.
  • Your team is small (<5 engineers) and you genuinely don't have the bandwidth for cross-cloud ops.
  • Your total spend is <$2K/month — the engineering cost dominates the savings.
  • You're heavily invested in cloud-specific services (Lambda, Step Functions, DynamoDB) and migration cost is large.
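The list above can be turned into a quick pre-flight check. This is only a sketch: the `Org` fields and the `blockers` function are hypothetical names, and the thresholds are the ones stated in the bullets:

```python
from dataclasses import dataclass


@dataclass
class Org:
    monthly_spend_usd: float
    engineers: int
    single_provider_compliance: bool   # regulated, one audited provider
    heavy_proprietary_services: bool   # e.g. Lambda, Step Functions, DynamoDB


def blockers(org: Org) -> list[str]:
    """Return the reasons from the list above that apply to this org."""
    reasons = []
    if org.single_provider_compliance:
        reasons.append("compliance requires a single audited provider")
    if org.engineers < 5:
        reasons.append("team too small for cross-cloud ops")
    if org.monthly_spend_usd < 2_000:
        reasons.append("spend under $2K/month, engineering cost dominates")
    if org.heavy_proprietary_services:
        reasons.append("deep use of cloud-specific services")
    return reasons


# Spend is the article's list-price bill; the team size is hypothetical.
print(blockers(Org(14_160, 12, False, False)) or "no blockers")
```

An empty result does not mean the migration is worthwhile, only that none of the listed disqualifiers apply.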

The general pattern

Bulk-compute and bandwidth-heavy workloads → cheap provider. Stateful managed services → premium provider where the operational maturity is worth the money. Edge / CDN → Cloudflare. The 50-60% savings shows up reliably when the workload shape matches.
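The pattern amounts to a lookup from each workload's dominant cost factor to a provider tier. A toy sketch, with the `CostFactor` categories and the `place` helper invented for illustration and the workload names taken from this article's example:

```python
from enum import Enum


class CostFactor(Enum):
    COMPUTE = "bulk compute"
    BANDWIDTH = "bandwidth"
    MANAGED_STATE = "managed stateful service"
    EDGE = "edge / CDN"


# Placement rule from the paragraph above.
PLACEMENT = {
    CostFactor.COMPUTE: "cheap provider (Hetzner here)",
    CostFactor.BANDWIDTH: "cheap provider (Hetzner here)",
    CostFactor.MANAGED_STATE: "premium provider (AWS here)",
    CostFactor.EDGE: "Cloudflare",
}


def place(workloads: dict[str, CostFactor]) -> dict[str, str]:
    """Map each workload to a provider tier by its dominant cost factor."""
    return {name: PLACEMENT[factor] for name, factor in workloads.items()}


plan = place({
    "API tier": CostFactor.BANDWIDTH,
    "Background workers": CostFactor.COMPUTE,
    "Postgres": CostFactor.MANAGED_STATE,
    "Static assets": CostFactor.EDGE,
})
for name, provider in plan.items():
    print(f"{name:<20} -> {provider}")
```

The hard part in practice is the classification step, deciding which single factor dominates each workload's bill, not the lookup itself.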

To model your own: pick the candidate split, plug each component into the TCO calculator, and compare the result against the pure-hyperscaler baseline in the AWS vs Hetzner comparison. The crossover happens earlier than most people expect.
