AI infra · CapEx alternative · Free calculator

AI / GPU Cloud Cost Calculator

Estimate monthly GPU cloud spend on AWS, GCP, Azure, Lambda, CoreWeave, RunPod, and Modal. Models per-GPU hourly rate, utilization, storage, egress, and on-demand vs reserved pricing.

Disclaimer: Cloud GPU pricing changes constantly and varies by region, contract terms, and discount tier. Use this calculator for budgeting and vendor comparison — confirm specific pricing in your provider's quote before committing.

Number of GPUs: 8

$/hour per GPU: $2.50

H100 on-demand ≈ $2.50–$4.50, A100 ≈ $1.50–$3, L40S ≈ $1–$1.80, RTX 4090 ≈ $0.40–$0.80.

Utilization (% of month GPUs are running): 70%

On-demand jobs rarely hit 100%. Reserved/committed-use capacity often does.

Storage (TB): 5

Datasets + checkpoints + model weights. NVMe ~$0.10/GB/mo, object storage $0.02/GB/mo.

Storage $/GB/month: $0.08

Egress (TB/month): 2

AWS/GCP/Azure egress ≈ $0.08–$0.12/GB. Cloudflare/some GPU clouds = $0.

Egress $/GB: $0.09

Reserved / committed-use discount %: 0%

1-yr commit typically saves 30–40%, 3-yr 50–65%.

Monthly GPU cloud spend

$10,800

$129,600/year · $2.50/effective GPU-hr · $0 reserved savings

Running 8 GPUs at $2.50/hr × 511 effective hours/month = $10,220 gross GPU cost. After a 0% reserved discount you pay $10,220 for compute, $400 for 5 TB of storage, and $180 for 2 TB of egress: $10,800/month total, $129,600/year.

Three quiet budget killers most teams underestimate:

• Idle time: GPUs left provisioned overnight or on weekends bill at 100% of on-demand; auto-shutdown alone often saves 30–50%.
• Egress: pulling models or datasets across regions or out to the public internet can rival the GPU bill on data-heavy workloads.
• Checkpoint storage: a 100 GB checkpoint every 30 minutes for 30 days adds up.

CoreWeave, Lambda, RunPod, and Modal typically undercut hyperscalers by 40–70% per GPU-hour at the cost of fewer managed services.
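To put a number on the checkpoint-storage point, here is a minimal Python sketch. It assumes every checkpoint is retained (no pruning) and prices the result at the $0.02/GB/month object-storage rate quoted in the storage input hint; the function name is illustrative and not part of the calculator.

```python
def retained_checkpoint_gb(interval_minutes, checkpoint_gb, days):
    """Total GB on disk if every checkpoint written over `days` is kept."""
    checkpoints_per_day = (24 * 60) / interval_minutes
    return checkpoints_per_day * days * checkpoint_gb


# One 100 GB checkpoint every 30 minutes for 30 days, nothing deleted.
total_gb = retained_checkpoint_gb(interval_minutes=30, checkpoint_gb=100, days=30)
monthly_cost = total_gb * 0.02  # assumption: object-storage rate of $0.02/GB/month

print(f"{total_gb / 1000:.0f} TB retained, about ${monthly_cost:,.0f}/month at $0.02/GB")
```

Under those assumptions the run ends the month holding 144 TB, roughly $2,880/month if nothing is pruned. Keeping only the last few checkpoints shrinks this to a rounding error.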

Monthly total: $10,800

GPU compute (after discount): $10,220

Storage: $400

Egress: $180

Effective $/GPU-hour: $2.50

Reserved savings vs on-demand: $0

Annual run-rate: $129,600

Formula used

GPU cloud cost formula

GPUs are the headline number, but storage and egress regularly add 10–30% on top. Reserved or committed-use pricing reshapes the entire bill — model both before signing a multi-year contract.

Monthly = (GPUs × $/hr × 730 × Util%) × (1 − Reserved%) + Storage + Egress
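As a rough sketch, the formula can be written as a small Python function. The function and parameter names are illustrative rather than RevenueLab's actual implementation, and the 1 TB = 1,000 GB conversion is an assumption inferred from the default result ($400 for 5 TB at $0.08/GB).

```python
HOURS_PER_MONTH = 730   # average hours in a month, as used in the formula
GB_PER_TB = 1000        # assumption: decimal terabytes


def monthly_gpu_cloud_cost(gpus, rate_per_gpu_hour, utilization,
                           storage_tb, storage_rate_per_gb,
                           egress_tb, egress_rate_per_gb,
                           reserved_discount=0.0):
    """Return a cost breakdown in dollars; utilization and discount are fractions (0-1)."""
    gross_compute = gpus * rate_per_gpu_hour * HOURS_PER_MONTH * utilization
    compute = gross_compute * (1 - reserved_discount)
    storage = storage_tb * GB_PER_TB * storage_rate_per_gb
    egress = egress_tb * GB_PER_TB * egress_rate_per_gb
    total = compute + storage + egress
    return {
        "compute": compute,
        "storage": storage,
        "egress": egress,
        "reserved_savings": gross_compute - compute,
        "total": total,
        "annual": total * 12,
    }


# Default inputs from the calculator: 8 GPUs, $2.50/hr, 70% utilization,
# 5 TB storage at $0.08/GB, 2 TB egress at $0.09/GB, no reserved discount.
result = monthly_gpu_cloud_cost(8, 2.50, 0.70, 5, 0.08, 2, 0.09)
print({name: round(value) for name, value in result.items()})
```

With the default inputs this reproduces the $10,800/month and $129,600/year figures shown in the results. Passing `reserved_discount=0.35` models a typical 1-year commit.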

H100 on-demand: $2.50–$4.50/hr

Reserved discount: 30–65%

Hyperscaler egress: $0.08–$0.12/GB

Backlink-friendly embed

Embed this calculator

Free to embed on any site. Inputs preserved, link back to RevenueLab. Each format trades polish for SEO juice.

<iframe src="https://revenuelab.fyi/embed/ai-gpu-cloud-cost-calculator?gpuCount=8&hourlyPerGpu=2.5&utilizationPct=70&storageTb=5&storagePricePerGb=0.08&egressTb=2&egressPricePerGb=0.09&reservedDiscountPct=0" width="100%" height="680" style="border:0;border-radius:12px;max-width:100%" loading="lazy" title="AI / GPU Cloud Cost Calculator"></iframe>
<p style="font:12px/1.4 system-ui;color:#666;margin:6px 0 0">Calculator by <a href="https://revenuelab.fyi/ai-gpu-cloud-cost-calculator?gpuCount=8&hourlyPerGpu=2.5&utilizationPct=70&storageTb=5&storagePricePerGb=0.08&egressTb=2&egressPricePerGb=0.09&reservedDiscountPct=0" target="_blank" rel="noopener">RevenueLab</a></p>

Easiest to install — passes referral traffic and a referring-domain signal.

Cite this calculator

Writing about this topic? Grab a citation — every link helps keep these tools free.

APA

RevenueLab. (2026). AI / GPU Cloud Cost Calculator. Retrieved from https://revenuelab.fyi/ai-gpu-cloud-cost-calculator

HTML

<p>Source: <a href="https://revenuelab.fyi/ai-gpu-cloud-cost-calculator" target="_blank" rel="noopener">AI / GPU Cloud Cost Calculator — RevenueLab</a> (2026).</p>

Markdown

Source: [AI / GPU Cloud Cost Calculator — RevenueLab](https://revenuelab.fyi/ai-gpu-cloud-cost-calculator) (2026).

On-demand vs reserved vs spot

On-demand is the most expensive way to buy GPU-hours. 1-year commits typically save 30–40%; 3-year commits save 50–65%. Spot/preemptible instances can shave another 50–70% on top but can be evicted mid-run, so they are only viable for fault-tolerant training and batch inference. Most production inference clusters land on 1-year reserved for the steady-state baseline plus on-demand burst for traffic spikes.

• On-demand: zero commitment, full hourly rate. Best for evaluation and bursty workloads.
• Reserved: 30–65% discount in exchange for 1–3 year commitment. Best for steady inference.
• Spot/preemptible: 50–70% additional discount, eviction risk. Best for distributed training with checkpointing.
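The "reserved baseline plus on-demand burst" pattern can be sketched as a blended monthly compute cost. This is a minimal illustration: the function name, the fleet size, and the burst hours are hypothetical, and the reserved baseline is assumed to bill for every hour of the month.

```python
HOURS_PER_MONTH = 730


def blended_monthly_compute(baseline_gpus, burst_gpu_hours,
                            on_demand_rate, reserved_discount):
    """Reserved baseline billed for the whole month plus on-demand burst hours."""
    reserved_rate = on_demand_rate * (1 - reserved_discount)
    baseline_cost = baseline_gpus * reserved_rate * HOURS_PER_MONTH
    burst_cost = burst_gpu_hours * on_demand_rate
    return baseline_cost + burst_cost


# Hypothetical fleet: 8 reserved GPUs at 35% off a $2.50 list rate,
# plus 500 GPU-hours of on-demand burst during the month.
cost = blended_monthly_compute(8, 500, 2.50, 0.35)
print(f"${cost:,.0f}/month")
```

Varying `burst_gpu_hours` shows how quickly on-demand spikes erode the reserved discount, which is a useful check before sizing the baseline.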

Why neoclouds keep winning workloads

CoreWeave, Lambda Labs, Crusoe, RunPod, Modal, and Together typically charge 40–70% less per GPU-hour than AWS/GCP/Azure for the same hardware. They win on price-per-hour and lose on the managed-service surface area (no Bedrock, no Vertex, fewer compliance attestations). For pure GPU workloads, the math usually pencils in their favor; for tightly-integrated multi-service apps, the hyperscaler premium can still be worth it.

Rex's Notes

GPU pricing is the most opaque corner of cloud infrastructure. The same H100 can cost $2.49/hr on Lambda, $4.20/hr on other neoclouds, $8.39/hr on AWS on-demand, and $3.50/hr on a 1-year AWS reserved commit, all for identical silicon. This calculator translates GPU type, utilization, and provider into a real monthly bill, then layers reserved discounts and egress so you can compare apples to apples before signing a $200k contract.

What each input means

Get these inputs right and the output is reliable. Get them wrong and the calculator just multiplies bad assumptions.

GPU count

Active GPUs you need running concurrently at peak. Estimate as (training jobs × replicas) + serving fleet.

Typical range: 1–8 for fine-tuning; 8–64 for serious training; 4–32 for production inference clusters.

Hourly price per GPU

On-demand list price. Reserved/spot are separate inputs. H100 is the modern reference point.

Typical range: H100: $1.99–$3 (Lambda/neoclouds), $3.50–$5 (1yr reserved hyperscaler), $7–$12 (on-demand hyperscaler). A100: 40–60% of H100. L40S: $1.10–$2.50.

Utilization %

Share of paid hours the GPU is actually crunching. Idle GPUs still bill at full rate.

Typical range: 30–60% for serving with bursty traffic; 70–95% for training; <20% means you bought too many.

Reserved discount %

Discount from list price for 1–3yr commitments. Spot pricing can be deeper but with eviction risk.

Typical range: 1yr reserved: 30–45%. 3yr: 50–65%. Spot: 60–90% off but interruptions kill long training jobs.

Egress GB/month

Data leaving the cloud — model artifacts, dataset downloads, inference responses going to other clouds.

Typical range: 100GB–10TB for inference apps; 50–500TB during data prep for training.

Worked examples

Real scenarios with the math walked through line by line.

Inference cluster, 8× H100 on a neocloud

Scenario: 8 H100s at $2.49/hr on Lambda or similar, 55% utilization, no reserved commit, 5TB egress.

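Math: The cluster stays provisioned all month, so every hour bills regardless of utilization. Compute = 8 × $2.49 × 730h = $14,542/mo. Egress = 5,000 GB × $0.08/GB = $400. Total ≈ $15k/mo. Per useful GPU-hour (at 55% util) = $2.49 ÷ 0.55 = $4.53.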

Outcome: Solid floor for an inference cluster. Moving to AWS on-demand would push this to ~$50k/mo for the same silicon.
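The arithmetic for this scenario can be checked with a few lines of Python. This is a sketch under the scenario's own numbers; the $8.39/hr AWS on-demand rate used for the comparison is taken from Rex's Notes and is an assumption, not a quote.

```python
HOURS_PER_MONTH = 730

gpus = 8
rate = 2.49            # $/GPU-hour on the neocloud (from the scenario)
utilization = 0.55
egress_gb = 5_000
egress_rate = 0.08     # $/GB

# The GPUs stay provisioned all month, so every hour is billed.
compute = gpus * rate * HOURS_PER_MONTH
egress = egress_gb * egress_rate
total = compute + egress

# Only 55% of billed hours do useful work, so each useful hour costs more.
cost_per_useful_hour = rate / utilization

# Same fleet at an assumed hyperscaler on-demand rate.
aws_on_demand_compute = gpus * 8.39 * HOURS_PER_MONTH

print(f"compute ${compute:,.0f}, egress ${egress:,.0f}, total ${total:,.0f}")
print(f"per useful GPU-hour ${cost_per_useful_hour:.2f}")
print(f"AWS on-demand compute ${aws_on_demand_compute:,.0f}")
```

The per-useful-hour figure is the one to compare across providers, since a cheaper hourly rate can still lose if utilization is poor.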

Training run on hyperscaler with 1yr reserved

Scenario: 16 H100s at $4.50/hr (1yr reserved AWS), 88% utilization, 80TB egress for evals.

Math: Compute = 16 × $4.50 × 730h = $52,560/mo. Egress = 80,000 GB × $0.05/GB (committed tier) = $4,000. Total = $56,560/mo, about $679k/yr.

Outcome: Acceptable if this is a production training pipeline running 70%+ of the year. If it's two big runs per year, switch to on-demand or spot — reservations punish part-time use.
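To see why reservations punish part-time use, the sketch below compares the reserved compute bill with a hypothetical alternative of two 4-week on-demand runs per year. The run lengths are invented for illustration, and the $8.39/hr on-demand rate is borrowed from Rex's Notes as an assumption.

```python
HOURS_PER_MONTH = 730

gpus = 16
reserved_rate = 4.50        # $/GPU-hour, 1-year reserved (from the scenario)
egress_gb = 80_000
egress_rate = 0.05          # $/GB, committed tier (from the scenario)

# A reservation bills every hour of the term, used or not.
reserved_compute_monthly = gpus * reserved_rate * HOURS_PER_MONTH
reserved_total_monthly = reserved_compute_monthly + egress_gb * egress_rate
reserved_compute_annual = reserved_compute_monthly * 12

# Hypothetical part-time alternative: two 4-week runs per year on demand.
on_demand_rate = 8.39       # assumption: AWS on-demand figure from Rex's Notes
run_hours = 2 * 4 * 7 * 24  # 1,344 hours per year
on_demand_compute_annual = gpus * on_demand_rate * run_hours

print(f"reserved: ${reserved_total_monthly:,.0f}/mo, ${reserved_compute_annual:,.0f}/yr compute")
print(f"on-demand, two runs: ${on_demand_compute_annual:,.0f}/yr compute")
```

Under these assumptions the on-demand path costs well under a third of the reserved compute bill despite the higher hourly rate, because it only pays for hours actually used.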

Common mistakes

Where this calculation usually goes wrong in the real world.

• Comparing GPUs on $/hour without normalizing for throughput. An H100 doing 2.5x the tokens/sec of an A100 at 1.6x the price is cheaper per useful unit of work.
• Forgetting egress. AWS egress at $0.09/GB can add $9k/month to a 100TB pipeline, sometimes more than compute on training jobs.
• Over-reserving. Reservation breakeven is usually 60–70% sustained utilization across the term. Bursty workloads lose money on long commits.
• Counting on spot without checkpointing. A 72-hour training run on spot will get evicted at least once; without checkpoints you restart from scratch.
• Ignoring storage and networking line items. Premium IOPS for dataset access and InfiniBand for multi-node training often add 10–20% to compute.
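Two of the mistakes above reduce to one-line calculations. The sketch below is illustrative (the function names are not from the calculator) and assumes that on-demand instances are shut down when idle, so on-demand spend scales with utilization while a reservation bills every hour.

```python
def relative_cost_per_unit(price_ratio, throughput_ratio):
    """Cost per unit of work of GPU B relative to GPU A (below 1.0 means B is cheaper)."""
    return price_ratio / throughput_ratio


def breakeven_utilization(reserved_discount):
    """Utilization at which a reservation matches on-demand spend.

    Reserved cost per hour of the term is rate * (1 - discount), billed always.
    On-demand cost is rate * utilization if idle instances are shut down.
    The two are equal when utilization = 1 - discount.
    """
    return 1 - reserved_discount


# H100 at 1.6x the price and 2.5x the throughput of an A100.
h100_vs_a100 = relative_cost_per_unit(price_ratio=1.6, throughput_ratio=2.5)
print(f"H100 cost per token relative to A100: {h100_vs_a100:.2f}")

# Typical 1-year discounts of 30-40%.
for discount in (0.30, 0.40):
    print(f"{discount:.0%} discount -> breakeven at {breakeven_utilization(discount):.0%} utilization")
```

The H100 works out to about 0.64x the A100's cost per token, and a 30–40% discount implies the 60–70% breakeven quoted in the list.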

When to use this calculator

• Choosing between hyperscaler reserved, neocloud on-demand, and bare metal for a new training pipeline.
• Building a make-vs-buy case for self-hosted open-source models vs. hosted LLM APIs.
• Negotiating committed-use discounts: model breakeven before the sales call.
• Sizing a Series A budget request for the AI compute line item.
• Comparing serving cost across H100, A100, L40S, and H200 for the same model.

Glossary

On-demand

Pay-as-you-go GPU pricing with no commitment. Highest hourly rate but full flexibility.

Reserved instance

1–3 year commitment to a specific GPU type for 30–65% off on-demand. Charged whether you use it or not.

Spot / preemptible

Deeply discounted GPUs (60–90% off) that the provider can reclaim with 30s–2min notice. Great for fault-tolerant batch work.

Neocloud

GPU-specialist providers (Lambda, CoreWeave, Crusoe, Together) that undercut hyperscalers by 40–70% on H100/H200 hourly rates.

Egress

Data transferred out of the cloud. Charged per GB; varies 5–20x across providers and destinations.

Related guides

Long-form playbooks on the same topic, written by the RevenueLab editorial team.

Guide · 11 min read

LLM Token Costs in 2026: Pricing Every Model, Hidden Multipliers, and Margin Math

Input vs output token pricing across GPT, Claude, and Gemini, the context-window cost trap, how caching and batching cut bills 40–80%, and the real per-user margin most AI apps miss.

Methodology last reviewed: 2026-05 by the RevenueLab editorial team.

FAQ

How much does an H100 actually cost per hour?

On-demand: $4–$5/hr on AWS/GCP/Azure, $2.50–$3.50/hr on CoreWeave/Lambda/RunPod, $1.90–$2.50/hr on lesser-known neoclouds. Reserved 1-year drops hyperscaler pricing to ~$2.50–$3/hr; 3-year to ~$1.80–$2.20/hr.

Should I train on the cloud or buy GPUs?

Buying becomes cheaper than cloud reserved pricing at roughly 18–30 months of continuous utilization, depending on GPU and electricity costs. Most teams stay on cloud because they need elasticity, don't want to manage hardware, and lack the colo footprint.

How big is egress on real workloads?

On a hyperscaler, 10 TB/month of egress at $0.09/GB = $900/month. Cross-region data transfer (e.g., training in us-east-1, serving in eu-west-1) can dwarf the GPU bill on data-heavy pipelines. Cloudflare R2, Backblaze B2, and several neoclouds offer free or near-free egress as a deliberate counter to hyperscaler pricing.

What's a healthy GPU utilization target?

Inference: 60–80% steady-state utilization is excellent; under 30% means you're over-provisioned. Training: 90%+ wall-clock during the run, with auto-shutdown when the job completes. Always tag instances and set hard auto-stop policies — orphaned GPUs are the #1 cause of mystery cloud bills.

How this calculator is built

Independently maintained

Written by Sam Doshi and the RevenueLab editorial team. We don't sell the data feeds this tool is built on.

Sourced from primary data

Benchmarks come from public AdSense / Stripe / IRS disclosures and reader-submitted data, never third-party "$X per view" claims.

Last reviewed

July 2026. We re-check every figure on the platform on a rolling quarterly cycle.

Editorial standards

See our editorial policy and disclaimer. Results are estimates, not advice.
