AI Agent Unit Economics Calculator

Calculate cost per task, gross margin, and break-even price for AI agents. Models multi-step token spend, retries, tool calls, infra overhead, and human-in-the-loop review cost.

Disclaimer: Cost-per-task estimates depend heavily on retry rate, tool pricing, and human review accuracy. Instrument your own production traces to derive real numbers before pricing decisions.

Tokens per task: 25,000
Multi-step agents commonly burn 20k–200k tokens including reasoning, tool args, and retries.

Blended price per 1M tokens: $8.00
Weighted average of input and output prices across the models you call. Frontier ≈ $8–$20, Mini/Haiku ≈ $0.5–$2.

Tool calls per task: 6
Search, scrapers, vector DB, third-party APIs called by the agent.

Average cost per tool call: $0.02

Retry rate: 12%
Hallucinations, tool failures, validation rejections: the fraction of tasks that require a re-run.

Human review minutes per task: 2
Human hourly rate: $35.00

Infra overhead per task: $0.02
Vector DB, observability, queues, hosting amortized per task.

Price per task: $2.00
Tasks per month: 20,000
Formula used

AI agent unit economics

AI agents look cheap at the per-token level and expensive when you include retries, tools, and human-in-the-loop review. The retry multiplier is the most under-modeled line item.

Cost/task = (LLM + Tools + Human + Infra) × (1 + Retry%)
Healthy gross margin: 60–80%
First-try success target: ≥80%
Retry cost multiplier: 1.05–1.5×
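
The headline formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual source: the mapping from tokens and blended price to LLM cost, and all field names, are assumptions. The values are the page's displayed defaults.

```python
from dataclasses import dataclass


@dataclass
class AgentInputs:
    tokens_per_task: float          # total tokens across all steps of one task
    blended_price_per_m: float      # $ per 1M tokens, weighted input + output
    tool_calls_per_task: float
    avg_tool_cost: float            # $ per tool call
    retry_rate: float               # fraction of tasks re-run, e.g. 0.12
    human_review_minutes: float
    human_hourly_rate: float        # $ per hour, fully loaded
    infra_overhead_per_task: float  # $ per task
    price_per_task: float           # $ charged per task
    tasks_per_month: int


def cost_per_task(x: AgentInputs) -> float:
    """Cost/task = (LLM + Tools + Human + Infra) x (1 + Retry%)."""
    llm = x.tokens_per_task / 1_000_000 * x.blended_price_per_m  # assumed mapping
    tools = x.tool_calls_per_task * x.avg_tool_cost
    human = x.human_review_minutes / 60 * x.human_hourly_rate
    base = llm + tools + human + x.infra_overhead_per_task
    return base * (1 + x.retry_rate)


def gross_margin(x: AgentInputs) -> float:
    return (x.price_per_task - cost_per_task(x)) / x.price_per_task


defaults = AgentInputs(25_000, 8.0, 6, 0.02, 0.12, 2, 35.0, 0.02, 2.0, 20_000)
print(f"cost/task    = ${cost_per_task(defaults):.2f}")   # cost/task    = $1.69
print(f"gross margin = {gross_margin(defaults):.1%}")     # gross margin = 15.6%
```

Under these assumptions the break-even price is simply the cost per task (about $1.69), so the default $2.00 price sits well below the 60–80% margin band, and human review (about $1.17 of the pre-retry cost) is the dominant term.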
Cite this calculator

Writing about this topic? Grab a citation — every link helps keep these tools free.

APA
RevenueLab. (2026). AI Agent Unit Economics Calculator. Retrieved from https://revenuelab.fyi/ai-agent-unit-economics-calculator
HTML
<p>Source: <a href="https://revenuelab.fyi/ai-agent-unit-economics-calculator" target="_blank" rel="noopener">AI Agent Unit Economics Calculator — RevenueLab</a> (2026).</p>
Markdown
Source: [AI Agent Unit Economics Calculator — RevenueLab](https://revenuelab.fyi/ai-agent-unit-economics-calculator) (2026).

Why per-task economics matter more than per-token

Token pricing is a vanity metric for agentic products. What customers buy is a completed task: a resolved ticket, a sourced lead, a closed PR. Token costs can vary 10× within the same product depending on prompt design and how often the agent retries. Building the dashboard around $/successful-task (not $/token, not $/request) is the single most important unit-economics move an AI product team can make.

  • Cost per task = total LLM + tools + human + infra, divided by tasks delivered to customer.
  • Always include retries and failed-then-recovered runs in the cost numerator.
  • Segment by task type — a 'hard' task may cost 10× a 'normal' one and skew the blended number.
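
As a rough sketch of the three points above, the snippet below computes cost per delivered task from a run log, counting failed runs in the numerator and segmenting by task type. The log format, field names, and dollar figures are hypothetical, chosen only to illustrate the calculation.

```python
from collections import defaultdict

# Hypothetical run log: one row per agent run, including failed runs.
runs = [
    {"task_type": "normal", "cost": 0.40, "delivered": True},
    {"task_type": "normal", "cost": 0.35, "delivered": False},  # failed, then re-run
    {"task_type": "normal", "cost": 0.45, "delivered": True},
    {"task_type": "hard", "cost": 4.00, "delivered": True},
    {"task_type": "hard", "cost": 3.50, "delivered": False},    # failed, then re-run
]


def cost_per_delivered_task(rows):
    """Total spend on all runs divided by the number of delivered tasks."""
    total_cost = sum(r["cost"] for r in rows)
    delivered = sum(1 for r in rows if r["delivered"])
    return total_cost / delivered if delivered else float("inf")


def by_task_type(rows):
    """Same metric, computed separately for each task type."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["task_type"]].append(r)
    return {task_type: cost_per_delivered_task(g) for task_type, g in groups.items()}


blended = cost_per_delivered_task(runs)
segments = by_task_type(runs)
print(f"blended: ${blended:.2f}")
for task_type, cost in segments.items():
    print(f"{task_type}: ${cost:.2f}")
```

With this made-up data the blended figure is $2.90 per delivered task, which hides a $0.60 "normal" segment and a $7.50 "hard" segment.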

Pricing models that actually work for agents

Per-task pricing aligns cost and revenue but loses on price discrimination. Per-seat ignores variable cost entirely and risks runaway usage. Usage-based with caps (e.g., 500 tasks/month included, $X per overage) is the most common production pattern — it gives buyers a predictable bill while protecting your margin against power users. Outcome-based pricing (e.g., per qualified lead, per resolved ticket) commands the highest prices but requires very high first-try success rates to be safe.
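
To see how usage-based pricing with a cap behaves, here is a minimal sketch. The base fee, overage price, and cost per task are hypothetical placeholders; only the 500-task inclusion comes from the example above.

```python
def monthly_bill(tasks_used: int, base_fee: float, included_tasks: int,
                 overage_price: float) -> float:
    """Base fee covers `included_tasks`; each extra task is billed at `overage_price`."""
    overage = max(0, tasks_used - included_tasks)
    return base_fee + overage * overage_price


def monthly_margin(tasks_used: int, base_fee: float, included_tasks: int,
                   overage_price: float, cost_per_task: float) -> float:
    """Gross margin for one customer in one month."""
    revenue = monthly_bill(tasks_used, base_fee, included_tasks, overage_price)
    cost = tasks_used * cost_per_task
    return (revenue - cost) / revenue if revenue else float("-inf")


# Hypothetical plan: $1,000/month, 500 tasks included, $2.50 overage, $1.00 cost/task.
plan = dict(base_fee=1000.0, included_tasks=500, overage_price=2.50, cost_per_task=1.00)
for used in (200, 500, 2000):
    print(used, f"{monthly_margin(used, **plan):.1%}")
# 200 80.0%
# 500 50.0%
# 2000 57.9%
```

In this sketch margin bottoms out at the cap (50%) and then recovers toward 1 - cost/overage price (60%) as overage revenue accumulates, which is the protection against power users described above.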

Rex's Notes

Agent demos look magical and unit economics look horrifying. A single 'AI agent that handles a customer ticket' often makes 15 LLM calls, retries 3 times, calls 4 tools, and still needs a human to approve the final action. This calculator surfaces the fully-loaded cost per completed task — including retries, tool calls, and human-in-the-loop overhead — so you find out whether the agent makes money before you scale it.

What each input means

Get these inputs right and the output is reliable. Get them wrong and the calculator just multiplies bad assumptions.

LLM calls per task

Average completions used to finish one task end-to-end, including planning + tool use + reflection.

Typical range: 3–8 for simple agents; 10–30 for research/coding agents; 50+ for autonomous loops.

Avg cost per LLM call ($)

Blended input + output cost across the agent's calls. Reasoning models bias this much higher.

Typical range: $0.005–$0.05 for mini-tier loops; $0.05–$0.50 for frontier; $1+ for reasoning-heavy.

Retry rate %

Share of tasks that need full or partial re-runs due to tool failures, JSON parse errors, or quality misses.

Typical range: 10–25% in production; 40%+ in beta until you harden tool schemas.

Tool / API cost per task ($)

External services the agent hits — search APIs, scrapers, calendar, payment, email.

Typical range: $0–$0.10 for cheap APIs; $0.50–$5 for premium data or document processing.

Human review minutes per task

Time a human spends reviewing, correcting, or approving agent output.

Typical range: 0 for fully-autonomous; 1–5 min for supervised; 10+ min if quality is poor enough that the agent is net-negative.

Human hourly cost ($)

Fully-loaded cost of the reviewer — salary + benefits + management overhead.

Typical range: $25–$45 for offshore ops; $60–$120 for US ops; $150+ for specialists.

Retail price per task ($)

What the customer pays per completed task, or your imputed value if internal.

Typical range: $0.10–$2 for consumer; $5–$50 for B2B ops automation; $100+ for specialist agents.

Worked examples

Real scenarios with the math walked through line by line.

Example

B2B support agent, mostly autonomous

Scenario: 8 LLM calls × $0.015, 15% retry rate, $0.05 tool cost, 2 min human review at $50/hr, $3 retail price.

Math: LLM = 8 × $0.015 × 1.15 = $0.138. Tools = $0.05. Human = 2/60 × $50 = $1.67. Total cost = $1.86. Gross profit/task = $1.14 (38% margin).

Outcome: The margin works but is thin. Human review is about 90% of cost; cutting review time from 2 min to 30 s would lower cost to about $0.60 and push margin to roughly 80%.

Example

Research agent on reasoning models, supervised

Scenario: 20 LLM calls × $0.25, 20% retry, $1.50 in search/scraping, 8 min review at $90/hr, $35 retail price.

Math: LLM = 20 × $0.25 × 1.2 = $6.00. Tools = $1.50. Human = 8/60 × $90 = $12.00. Total = $19.50. Gross profit = $15.50 (44% margin).

Outcome: Defensible. But you need to either raise price toward $50 or get review time under 4 min before competition compresses margins.
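
The arithmetic in both examples can be checked with a short script. Note that the worked examples apply the retry multiplier to the LLM term only, whereas the headline formula multiplies the whole cost; the sketch below follows the worked examples. Function and parameter names are illustrative.

```python
def example_cost(llm_calls: int, cost_per_call: float, retry_rate: float,
                 tool_cost: float, review_minutes: float, hourly_rate: float) -> float:
    """Cost per task as computed in the worked examples (retries scale LLM cost only)."""
    llm = llm_calls * cost_per_call * (1 + retry_rate)
    human = review_minutes / 60 * hourly_rate
    return llm + tool_cost + human


def margin(price: float, cost: float) -> float:
    return (price - cost) / price


support = example_cost(8, 0.015, 0.15, 0.05, 2, 50)
research = example_cost(20, 0.25, 0.20, 1.50, 8, 90)

print(f"support:  ${support:.2f}, margin {margin(3, support):.0%}")     # $1.85, margin 38%
print(f"research: ${research:.2f}, margin {margin(35, research):.0%}")  # $19.50, margin 44%
```

The support total prints as $1.85 because the unrounded sum is $1.8547; the example's $1.86 comes from rounding each term before adding. Changing `review_minutes` is the quickest way to test how sensitive each margin is to human review time.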

Common mistakes

Where this calculation usually goes wrong in the real world.

  • Quoting cost per LLM call instead of cost per completed task. The two diverge by 5–20x once retries and tool use are included.
  • Pricing the human review at base salary instead of fully-loaded cost. Real cost is 1.4–1.8x base when you include benefits and management.
  • Assuming retry rate will drop fast. It usually doesn't: production agents often settle at 15–25% retries unless tool schemas and retry logic are deliberately hardened.
  • Ignoring eval and prompt-maintenance overhead. Agents need ongoing eng investment that should be amortized into cost-per-task.
  • Promising 'fully autonomous' to customers when your current review rate is 80%. Quote supervised pricing until you have data proving otherwise.
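
Two of these mistakes are easy to correct numerically. The sketch below converts a base salary into a fully-loaded hourly rate and amortizes ongoing engineering spend into a per-task figure. The 1.6 load factor is taken from the middle of the 1.4–1.8x range above; the 2,080 working hours per year and all dollar amounts are assumptions for illustration.

```python
def loaded_hourly_rate(base_salary: float, load_factor: float = 1.6,
                       hours_per_year: float = 2080) -> float:
    """Fully-loaded hourly cost: base salary scaled for benefits and management."""
    return base_salary * load_factor / hours_per_year


def amortized_eng_cost_per_task(monthly_eng_cost: float, tasks_per_month: int) -> float:
    """Spread eval and prompt-maintenance spend across the month's tasks."""
    return monthly_eng_cost / tasks_per_month


# Hypothetical reviewer on a $62,400 base salary ($30/hr before loading).
rate = loaded_hourly_rate(62_400)
# Hypothetical $8,000/month of eval and prompt maintenance over 20,000 tasks.
eng = amortized_eng_cost_per_task(8_000, 20_000)

print(f"loaded rate: ${rate:.2f}/hr")      # loaded rate: $48.00/hr
print(f"eng overhead: ${eng:.2f}/task")    # eng overhead: $0.40/task
```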

When to use this calculator

  • Pricing a new AI agent product before launch.
  • Deciding whether an internal automation project actually saves money vs. the team it replaces.
  • Comparing 'use agents' vs. 'hire offshore ops team' for the same workflow.
  • Justifying the engineering work needed to reduce retry rate or review time.
  • Building an enterprise pricing tier where per-task volume matters.

Glossary

Cost per completed task (CPCT)

Total cost (LLM + tools + human review + infra, including retries) to finish one billable unit of agent work. The only metric that matters for agent unit economics.

Retry rate

Share of tasks requiring re-runs due to tool failure, format error, or quality miss. Multiplies effective cost.

Human-in-the-loop (HITL)

Workflow where a human approves or edits agent output before it ships. Adds cost but raises quality and customer trust.

Tool call

Agent action that invokes an external function or API. Often the highest-variance cost line.

Autonomy ratio

Share of tasks completed with no human intervention. Drives the difference between $20/hr ops cost and pure LLM cost.

More questions answered

What gross margin should an AI agent product target?

55–75% gross is the realistic band for first-generation agents with human-in-the-loop. Below 50% you can't fund support, sales, and product without subsidizing growth. Above 80% usually means you're either fully autonomous (rare and risky) or under-counting human review time. Pure SaaS-margin (85%+) agents almost don't exist yet outside of narrow classification workloads.

How do I bring retry rate down?

In rough order of impact: (1) tighten tool input schemas with Zod or JSON Schema validation; (2) add retry logic that re-prompts with the specific error rather than blind retry; (3) constrain output format with structured-output / function calling instead of free-text parsing; (4) add a cheap classifier ahead of the agent to filter requests it can't handle. Most production agents converge to 10–15% retries with this stack.
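
Steps (1) and (2) can be sketched together: validate the tool arguments against a schema, and on failure re-prompt with the specific error rather than retrying blindly. This is a standard-library stand-in for Zod or JSON Schema validation; the schema, the `model` callable, and the prompt wording are all hypothetical.

```python
import json
from typing import Callable

REQUIRED_FIELDS = {"query": str, "max_results": int}  # hypothetical tool schema


def validate_tool_args(raw: str) -> dict:
    """Parse and check tool arguments; raise ValueError with a specific message."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc.msg}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in args:
            raise ValueError(f"missing field '{name}'")
        if not isinstance(args[name], expected):
            raise ValueError(f"field '{name}' must be {expected.__name__}")
    return args


def call_with_feedback(model: Callable[[str], str], prompt: str,
                       max_attempts: int = 3) -> tuple[dict, int]:
    """Re-prompt with the validation error; return the valid args and attempts used."""
    current = prompt
    for attempt in range(1, max_attempts + 1):
        raw = model(current)
        try:
            return validate_tool_args(raw), attempt
        except ValueError as err:
            current = (f"{prompt}\n\nYour previous output was rejected: {err}. "
                       "Return corrected JSON only.")
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call: corrects its output once it sees the error.
    if "rejected" in prompt:
        return '{"query": "agent pricing", "max_results": 5}'
    return '{"query": "agent pricing"}'


args, attempts = call_with_feedback(fake_model, "Search for agent pricing.")
print(args, attempts)  # {'query': 'agent pricing', 'max_results': 5} 2
```

Returning the attempt count alongside the result makes it straightforward to log retries per task, which is the number the retry-rate input needs.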

When should I charge per-task vs. per-seat?

Per-task when usage varies wildly across customers (research agents, document processing). Per-seat when usage is bounded by human throughput anyway (a sales rep can only review so many leads/day). Hybrid — base seat + per-task overage — is the most defensible model for B2B and the only way to protect margin when a customer suddenly 10x's volume.

Methodology last reviewed: 2026-05 by the RevenueLab editorial team.

FAQ

What gross margin should an AI agent product target?

60–80% is the sustainable range for an AI-native product. Below 50%, you have a thin-margin reseller business; above 85%, you're probably under-counting retries, human review, or eval engineering time.

How do I include human-in-the-loop cost?

(Average review minutes ÷ 60) × loaded hourly cost of the reviewer. Don't forget triage time on rejected outputs: a 2-minute review of a bad agent run usually triggers another full agent run.
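
One way to fold rejected outputs into the human cost is sketched below. It assumes every rejection triggers a fresh agent run plus another review, and that rejections are independent with a fixed probability, so the expected number of attempts per accepted task is 1 / (1 - rejection rate). Both the model and the numbers are assumptions, not the calculator's method.

```python
def human_cost_per_review(review_minutes: float, hourly_rate: float) -> float:
    """Review minutes converted to hours, times the loaded hourly rate."""
    return review_minutes / 60 * hourly_rate


def expected_cost_per_accepted_task(agent_run_cost: float, review_minutes: float,
                                    hourly_rate: float, rejection_rate: float) -> float:
    """Expected agent + review cost until one output is accepted."""
    if not 0 <= rejection_rate < 1:
        raise ValueError("rejection_rate must be in [0, 1)")
    per_attempt = agent_run_cost + human_cost_per_review(review_minutes, hourly_rate)
    expected_attempts = 1 / (1 - rejection_rate)  # geometric-distribution mean
    return per_attempt * expected_attempts


# Hypothetical: $0.50 agent run, 2 min review at $60/hr, 20% of outputs rejected.
cost = expected_cost_per_accepted_task(0.50, 2, 60, 0.20)
print(f"${cost:.3f} per accepted task")  # $3.125 per accepted task
```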

Should I price per task, per seat, or per outcome?

Most early AI agents land on usage-based pricing with caps (per task with monthly inclusions, then overage). Seat pricing only works when usage is naturally capped (e.g., one ticket per support rep at a time). Outcome pricing is highest-ROI but requires very high reliability.

How fast can margins improve over time?

Model prices have dropped ~10× per year for equivalent capability since 2023. Combined with prompt optimization, caching, and model routing, mature AI agent products often improve gross margin by 15–25 percentage points within 12–18 months of launch — without raising prices.