Burst credit instances (T-family, e2, B-series): when they save money, when they hurt

AWS t3, GCP e2, Azure B-series — the burst-credit pricing model is brilliant for some workloads and a hidden tax for others. Worked examples both ways.

cloudprice editorial ~4 min read

AWS T-family, GCP e2 (shared-core), and Azure B-series all share a pricing model that confuses people: you pay for a "baseline" CPU percentage and accumulate credits when idle, then burn credits when you need more CPU. Run out of credits and either (a) your performance drops to baseline, or (b) you start paying a per-vCPU-hour "unlimited" surcharge.

This model is great for some workloads. Catastrophic for others. The difference is whether your traffic actually idles enough to refill the credit bucket.

How credit pricing actually works

Take AWS t3.medium: 2 vCPU, 4 GB RAM, $0.0416/hour. Baseline performance is 20% per vCPU, so 0.4 vCPU sustained. Above that, you burn credits at the rate of 1 credit per vCPU-minute. Credits accrue at a fixed 24/hour, exactly enough to cover baseline usage, so the balance only grows when you run below baseline. The maximum balance is 576 (24 hours' worth).

T3 launches in "unlimited" mode by default. If you run out of credits in that mode, you pay $0.05 per vCPU-hour for CPU used above baseline. Two vCPUs at full throttle for an hour is 1.6 vCPU-hours above the 0.4 vCPU baseline, or $0.08. Sounds trivial. Until it runs 24/7.

A t3.medium running constantly at 80% CPU (4x baseline) earns 24 credits/hour and burns 96 credits/hour (2 vCPUs x 0.8 x 60 minutes). Net burn: 72 credits/hour. A full 576-credit balance lasts 8 hours, then you're in unlimited mode paying the surcharge.

That surcharge on a sustained 80%-CPU t3.medium works out to about $43.80/month (72 surplus credits/hour is 1.2 vCPU-hours, or $0.06/hour, over 730 hours) on top of the $30.37 base. The surcharge more than doubles the bill. And your performance is now the same as a fixed-allocation instance.
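
The credit arithmetic for a constant load can be sketched in a few lines of Python. The inputs are the t3.medium figures quoted in this article (24 credits/hour earned, 576 maximum balance, $0.05 per vCPU-hour surcharge). The function name, the result keys, and the 730-hour month are assumptions made for illustration, and real prices vary by region and operating system.

```python
def steady_state_credits(vcpus, utilisation, earn_per_hour, max_balance,
                         surcharge_per_vcpu_hour=0.05, hours_per_month=730):
    """Credit arithmetic for a burstable instance under constant CPU load.

    utilisation is the average across all vCPUs (0.0 to 1.0).
    One credit is one vCPU at 100% for one minute.
    """
    burn_per_hour = vcpus * utilisation * 60             # credits spent per hour
    net_burn = max(0.0, burn_per_hour - earn_per_hour)   # credits lost per hour
    if net_burn == 0:
        # At or below baseline: the balance never drains and no surcharge applies.
        return {"burn_per_hour": burn_per_hour, "net_burn_per_hour": 0.0,
                "hours_to_empty": None, "surcharge_per_month": 0.0}
    # Surplus credits are vCPU-minutes; convert to vCPU-hours before pricing.
    surcharge_per_hour = net_burn / 60 * surcharge_per_vcpu_hour
    return {
        "burn_per_hour": burn_per_hour,
        "net_burn_per_hour": net_burn,
        "hours_to_empty": max_balance / net_burn,   # starting from a full balance
        "surcharge_per_month": surcharge_per_hour * hours_per_month,
    }


# t3.medium at a sustained 80% CPU, using the figures quoted in this article.
result = steady_state_credits(vcpus=2, utilisation=0.80,
                              earn_per_hour=24, max_balance=576)
print(result)
```

With those inputs the sketch gives a net burn of 72 credits/hour, 8 hours to drain a full balance, and roughly $43.80/month in surcharge. It assumes perfectly constant load; real traffic is lumpier, so treat the result as a rough estimate.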

Where burst credit is the right answer

  • Web servers with peaky traffic but low baseline. A small blog or marketing site doing 5% average CPU but 90% during the daily traffic spike. T3 nails this — credits accumulate overnight, burn during peak, total cost is well under a fixed-allocation equivalent.
  • Cron / batch boxes that idle 22 hours a day. The box that runs a 30-minute job at 3am and idles the rest. Credits refill between runs, so the balance never depletes and unlimited mode never bills a surcharge.
  • Bastion / jump hosts. 99% of the time idle. Occasional SSH sessions burst the CPU briefly. T3.nano is perfect.
  • Dev environments. Most dev boxes are idle most of the time but need bursts during compilation. T3 covers this.
  • Postgres / MySQL read replicas with modest load. Reads spike but tail off. Idle credits absorb the spikes.

Where burst credit hurts

  • Background workers consuming jobs from a queue. If the queue is busy, you're at 80%+ CPU 24/7. Credits never recover. You're paying the unlimited surcharge constantly. At a sustained 80%, a t3.medium costs $0.0416 plus about $0.06/hour in surcharge, roughly $0.10/hour. A fixed-allocation c6i.large at $0.085/hour is cheaper, and the t3.medium is slower as well.
  • ML inference servers running steady traffic. Same issue. Steady CPU load is the worst case.
  • Database primaries with sustained write load. Don't.
  • Anything where consistent latency matters. In standard mode, the moment credits run out and you fall back to baseline, your tail latency explodes. p99 latency on a credit-depleted T3 can be several times that of the same instance with credits available.

The "T3 unlimited mode" silent bill

AWS launches T3 with "unlimited" mode by default. This means if your workload accidentally exceeds baseline for sustained periods, the per-vCPU surcharge silently appears on your bill under CPUCredits:t3. I have seen accounts where this hit several thousand dollars/month before anyone noticed.

The fix: either configure T3 in "standard" mode (no surcharge, performance falls to baseline if credits exhaust) or set up a CloudWatch alarm on CPUSurplusCreditsCharged to catch the bleed early.
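
Both fixes can be scripted. The sketch below builds the arguments for a CloudWatch alarm on CPUSurplusCreditsCharged and shows the EC2 call that switches an instance to standard mode. It assumes boto3 is installed and AWS credentials are configured. The function names, the alarm name, the instance ID, and the SNS topic ARN are hypothetical placeholders, and the threshold and period are starting points rather than recommendations.

```python
def surplus_credit_alarm_params(instance_id, sns_topic_arn):
    """Keyword arguments for CloudWatch put_metric_alarm.

    Fires as soon as any surplus credits are charged for the instance.
    """
    return {
        "AlarmName": f"t3-surplus-credits-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUSurplusCreditsCharged",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 300,                 # credit metrics are reported every 5 minutes
        "EvaluationPeriods": 1,
        "Threshold": 0.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }


def create_surplus_alarm(instance_id, sns_topic_arn):
    """Create the alarm. Requires boto3 and AWS credentials."""
    import boto3  # third-party; imported here so the builder above has no dependencies
    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**surplus_credit_alarm_params(instance_id, sns_topic_arn))


def switch_to_standard_mode(instance_id):
    """Switch a burstable instance to standard mode (no surcharge, throttles at zero credits)."""
    import boto3
    ec2 = boto3.client("ec2")
    ec2.modify_instance_credit_specification(
        InstanceCreditSpecifications=[
            {"InstanceId": instance_id, "CpuCredits": "standard"},
        ]
    )


# Placeholder identifiers, for illustration only.
params = surplus_credit_alarm_params(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:example-topic",
)
print(params["MetricName"], params["Threshold"])
```

Keeping the parameter builder separate from the API call makes it easy to review or unit-test the alarm definition without touching an AWS account.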

GCP e2-micro / e2-small specifics

GCP e2 family is similar but the "burst" mechanism is implicit — the shared-core instances (e2-micro, e2-small, e2-medium) have a baseline and can burst above it, but GCP doesn't charge a separate surcharge. Instead, when the host is busy, your CPU just slows down. This is more honest but also less predictable for production.

For production workloads on GCP, jump straight to e2-standard-* or n2-standard-* rather than the shared-core e2-micro/small. The pricing per vCPU is similar but you get fixed allocation.

Azure B-series specifics

Azure B-series uses an explicit credit model very similar to AWS T3. Credits accrue at a fixed rate per VM size; you can monitor and alert. There is no "unlimited" mode: if you run out of credits, you fall back to baseline, period.

This is more predictable than T3 unlimited but means you really have to be sure your workload fits the burst profile, because there's no escape valve.

The decision rule

If average daily CPU utilisation across the instance is <20%, burst credit is almost always cheapest. If it's 20-50%, do the math — sometimes T3 unlimited is still cheaper than a dedicated instance, sometimes not. If it's >50% sustained, never use burst credit; jump to a fixed-allocation instance.
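
"Doing the math" for the 20-50% band can be sketched as a break-even calculation. The version below assumes unlimited mode, a long-running steady load (so the stored credit balance no longer matters), and the t3.medium and c6i.large prices quoted earlier in this article. The function names are made up for illustration, and the sketch ignores differences in CPU generation, memory, and network between the two instance types.

```python
def burstable_hourly_cost(base_price, vcpus, baseline, utilisation,
                          surcharge_per_vcpu_hour=0.05):
    """Long-run hourly cost of an unlimited-mode burstable instance.

    baseline and utilisation are per-vCPU fractions (0.0 to 1.0).
    Only CPU used above baseline is billed as surplus.
    """
    surplus_vcpus = max(0.0, vcpus * (utilisation - baseline))
    return base_price + surplus_vcpus * surcharge_per_vcpu_hour


def break_even_utilisation(base_price, vcpus, baseline, fixed_price,
                           surcharge_per_vcpu_hour=0.05):
    """Average utilisation at which the burstable instance costs the same as the fixed one.

    A result above 1.0 means the burstable instance is cheaper at any load.
    """
    return baseline + (fixed_price - base_price) / (vcpus * surcharge_per_vcpu_hour)


# t3.medium ($0.0416/hour, 2 vCPU, 20% baseline) vs c6i.large ($0.085/hour).
threshold = break_even_utilisation(0.0416, 2, 0.20, 0.085)
print(f"break-even average CPU: {threshold:.1%}")

for u in (0.30, 0.50, 0.80):
    cost = burstable_hourly_cost(0.0416, 2, 0.20, u)
    print(f"{u:.0%} average CPU: ${cost:.4f}/hour")
```

At these prices the break-even lands at roughly 63% average CPU. Below that the t3.medium is cheaper on paper, above it the c6i.large wins, and the price comparison says nothing about the latency risk described earlier.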

What "20% average" actually looks like

People underestimate sustained load. A typical Rails / Django web server pushing 100 requests/second can easily average 30-40% CPU even at "quiet" hours, because of background workers, scheduled jobs, and base OS overhead. A typical Postgres read replica with light traffic can average 20-30% CPU on WAL replay and background maintenance alone (autovacuum itself runs on the primary, but the replica has to replay its work). A K8s node with a dozen pods running idle services often sits at 25-35% just on the kubelet + container-runtime overhead.

Before picking T3 / B / e2-shared, look at actual CloudWatch / Cloud Monitoring (formerly Stackdriver) metrics. If you don't have any, run a fixed-allocation instance for two weeks, gather the data, then switch.
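
Once you have the data, one way to check whether a workload fits is to replay it through a simple credit-bucket model. The sketch below does this for hourly average CPU samples in standard mode, using the t3.medium figures from this article as defaults. The function name and the sample traces are invented for illustration, and hourly averages hide short spikes, so treat the output as a rough screen, not a guarantee.

```python
def simulate_credit_balance(hourly_cpu, vcpus=2, earn_per_hour=24.0,
                            max_balance=576.0, start_balance=0.0):
    """Replay hourly average CPU (0.0 to 1.0 across all vCPUs) through a credit bucket.

    Returns (balance after each hour, number of hours in which demand
    exceeded the available credits and the instance would have been throttled).
    """
    balance = start_balance
    history = []
    throttled_hours = 0
    for utilisation in hourly_cpu:
        spend = vcpus * utilisation * 60       # credits needed this hour
        balance += earn_per_hour - spend
        if balance < 0:
            throttled_hours += 1               # demand exceeded available credits
            balance = 0.0
        balance = min(balance, max_balance)    # the bucket has a cap
        history.append(balance)
    return history, throttled_hours


# Invented traces: a peaky site (22 quiet hours, 2 busy) and a steady worker.
peaky_day = [0.05] * 22 + [0.90] * 2
steady_day = [0.80] * 24

_, peaky_throttled = simulate_credit_balance(peaky_day)
_, steady_throttled = simulate_credit_balance(steady_day, start_balance=576.0)
print("peaky site throttled hours:", peaky_throttled)
print("steady worker throttled hours:", steady_throttled)
```

The peaky trace never runs dry, while the steady trace drains a full bucket within the first working day and stays throttled afterwards. Feeding in two weeks of real samples instead of the invented traces gives a first-pass answer on whether a burstable instance is viable.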

The cheapest viable production tier — what we use

For small production sites: t3.small in standard mode + a CloudWatch alarm on CPUCreditBalance dropping below 50. If credits drop, the alarm fires and either we scale up or we know to tune.

For workers / background tasks at any sustained load: c6i.large or m6i.large fixed allocation. The hourly rate is higher (the c6i.large price quoted above is roughly double a t3.medium's), but the bill and the performance are predictable.

To compare bursty vs fixed-allocation pricing per instance, see the cloudprice catalogue. The AWS provider page separates T-family from M/C/R families explicitly. Also look at Hetzner where there are no burst-credit instances — everything is dedicated and pricing is consistent.

External: AWS T-instance docs, GCP e2 docs.
