Your AI Agency Can't Tell You What a Lead Costs (2026)

Matt Payne · 7 min read
Key Takeaway

AI agencies hide $450-$1,400/month in run costs under their retainer: LLM tokens, enrichment, proxies, and retries. Demand a per-booked-meeting cost forecast in the SOW before signing. StoryPros puts this model in every contract.


The Bill You Don't See Until Month Two

Here's how it usually goes. An AI agency quotes you $5,000 to build an outbound agent and $1,500/month to manage it. You sign. The agent goes live. Then the invoices from OpenAI, Clay, Apollo, and your proxy provider start hitting.

Nobody told you about those.

Anthropic just announced that starting June 15, 2026, automated agent workloads get a separate credit pool capped between $20 and $200/month per user. When credits run out, your agent stops. No queue. No fallback. Just dead. That's a structural change to how agent costs work, and most AI agencies haven't updated their pricing models to account for it.

Claude Opus 4.8 regular mode runs $5 per million input tokens and $25 per million output tokens. GPT-5.5 costs $5/$30. DeepSeek V4-Pro comes in at $0.435/$0.87. The model your agency picks can swing your monthly token bill by roughly 30x on output tokens and more than 10x on input tokens. Nobody's putting that in the contract.

The AI agency cost per lead isn't what's on the invoice they send you. It's what's on the five invoices they don't.

The Real Bill of Materials

Here's what actually goes into running an AI sales agent for a month. These are real numbers from current vendor pricing pages.

LLM tokens. An outbound agent that researches a prospect, writes a personalized email, handles a reply, and books a meeting uses roughly 3,000–5,000 tokens per prospect. At Claude Opus 4.8 rates ($5 input / $25 output per million tokens), processing 1,000 prospects costs about $40–$75 in tokens, depending on how those tokens split between input and output. Use GPT-5.5 and it's closer to $50–$90. Use DeepSeek V4-Pro and it's under $5. Your agency probably doesn't tell you which model they're using. Or they use the expensive one because it's easier to prompt.
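The token math is easy to check yourself. Here's a minimal sketch using the per-million prices quoted in this article. The 45% output share is an assumption made for illustration (the article doesn't state a split), and `token_cost` is a hypothetical helper, not any vendor's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    name: str
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices as quoted in this article.
PRICES = [
    ModelPrice("Claude Opus 4.8", 5.00, 25.00),
    ModelPrice("GPT-5.5", 5.00, 30.00),
    ModelPrice("DeepSeek V4-Pro", 0.435, 0.87),
]

# ASSUMPTION: 45% of all tokens are billed as output. Change this to match
# your own agent's logs; it moves the result a lot.
OUTPUT_SHARE = 0.45


def token_cost(price: ModelPrice, prospects: int, tokens_per_prospect: int,
               output_share: float = OUTPUT_SHARE) -> float:
    """Return the USD token cost of processing `prospects` prospects."""
    total_tokens = prospects * tokens_per_prospect
    output_tokens = total_tokens * output_share
    input_tokens = total_tokens - output_tokens
    return (input_tokens * price.input_per_million
            + output_tokens * price.output_per_million) / 1_000_000


for price in PRICES:
    low = token_cost(price, prospects=1_000, tokens_per_prospect=3_000)
    high = token_cost(price, prospects=1_000, tokens_per_prospect=5_000)
    print(f"{price.name}: ${low:.2f} to ${high:.2f} per 1,000 prospects")
```

Ask your agency for the real input/output split from their logs. Output tokens cost 5x to 6x more than input on the US models, so that one assumption drives most of the bill.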

Enrichment. Clay charges per credit. Apollo charges per seat plus per-export. A typical enrichment waterfall — company data, contact info, technographics, intent signals — runs $0.15–$0.50 per record. At 2,000 leads/month, that's $300–$1,000 just to build your list.

Email verification. Every address needs verification before sending or you'll torch your domain. Services like ZeroBounce and NeverBounce charge $0.003–$0.008 per verification. Cheap per unit. But at 10,000 verifications/month, it's $30–$80.

Scraping and proxies. If your agent pulls LinkedIn data, company websites, or job postings, you're paying for residential proxies. That's $8–$15 per GB. An agent scraping 2,000 company sites per month burns 5–10 GB easily. That's $40–$150.

Retries and monitoring. API calls fail. Emails bounce. Webhook payloads get dropped. An n8n or Make workflow needs error handling, retry logic, and monitoring. Anthropic's Managed Agents charge $0.08/hour for execution time. An agent running 8 hours/day for 30 days costs $19.20 in infrastructure alone, before tokens.

Add it all up. A typical AI outbound agent running 2,000 leads/month carries variable costs of $450–$1,400/month on top of whatever your agency charges for build and management.
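You can rebuild that range from the line items above. This is a rough sketch, not an audit: it assumes token cost scales linearly from 1,000 to 2,000 prospects and takes each line item's low and high figure at the volume the article quotes.

```python
# (low, high) monthly USD figures from the line items above.
line_items = {
    # $40-$75 per 1,000 prospects, doubled for 2,000 (ASSUMES linear scaling)
    "llm_tokens": (80.00, 150.00),
    # $0.15-$0.50 per record x 2,000 records
    "enrichment": (300.00, 1000.00),
    # $0.003-$0.008 per check x 10,000 checks
    "email_verification": (30.00, 80.00),
    # 5-10 GB of residential proxy traffic at $8-$15 per GB
    "proxies": (40.00, 150.00),
    # 8 hours/day x 30 days x $0.08/hour
    "agent_runtime": (19.20, 19.20),
}

low_total = sum(low for low, _ in line_items.values())
high_total = sum(high for _, high in line_items.values())

print(f"Variable run cost: ${low_total:,.2f} to ${high_total:,.2f} per month")
```

Under these assumptions the total lands close to the $450–$1,400 range, with enrichment making up most of it at both ends.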

What a Run-Cost Model Should Look Like

Your SOW should include a table that looks something like this:

| Line Item | Unit | Unit Cost | Monthly Volume | Monthly Cost |
|---|---|---|---|---|
| LLM tokens (Claude Opus 4.8) | Per 1M tokens | $5 input / $25 output | ~8M tokens | $65 |
| Lead enrichment (Clay) | Per credit | $0.25 avg | 2,000 | $500 |
| Email verification | Per address | $0.005 | 4,000 | $20 |
| Proxy/scraping | Per GB | $10 | 7 GB | $70 |
| Workflow hosting (n8n cloud) | Per month | $50 | 1 | $50 |
| Retries/error handling | ~15% of token cost | n/a | n/a | $10 |
| Total variable run cost | | | | $715 |
| Cost per lead processed | | | 2,000 leads | $0.36 |
| Cost per qualified lead (20% qualification rate) | | | 400 leads | $1.79 |
| Cost per meeting booked (5% of qualified) | | | 20 meetings | $35.75 |
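The bottom three rows are just division, and it's worth seeing how sensitive they are to the funnel rates. Here's a small sketch that reproduces the table's figures. The `unit_costs` function and its field names are illustrative, not part of any tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    name: str
    monthly_cost: float  # USD


# Monthly costs copied from the table above.
ITEMS = [
    LineItem("LLM tokens (Claude Opus 4.8)", 65.0),
    LineItem("Lead enrichment (Clay)", 500.0),
    LineItem("Email verification", 20.0),
    LineItem("Proxy/scraping", 70.0),
    LineItem("Workflow hosting (n8n cloud)", 50.0),
    LineItem("Retries/error handling", 10.0),
]


def unit_costs(items, leads, qualification_rate, booking_rate):
    """Turn a monthly bill of materials into per-unit costs down the funnel."""
    total = sum(item.monthly_cost for item in items)
    qualified = leads * qualification_rate
    meetings = qualified * booking_rate
    return {
        "total": total,
        "per_lead": total / leads,
        "per_qualified_lead": total / qualified,
        "per_meeting": total / meetings,
    }


costs = unit_costs(ITEMS, leads=2_000, qualification_rate=0.20, booking_rate=0.05)
for label, value in costs.items():
    print(f"{label}: ${value:,.2f}")
```

Halve the booking rate to 2.5% and the same $715 bill buys 10 meetings at $71.50 each. That's why the rates belong in the SOW next to the prices.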

That last number, $35.75 per booked meeting, is what actually matters. Not the build cost. Not the retainer. The cost of the output.

If your AI agency can't produce a table like this, they either don't know their own costs or they don't want you to.

Why This Matters More Now Than Six Months Ago

The pricing ground is shifting fast. In the last 30 days alone:

Anthropic split automated agent usage into a separate credit pool with hard limits. OpenAI launched a $100/month Pro plan with usage caps. GitHub moved Copilot to token-metered billing. DeepSeek made its 75% discount permanent, locking output at $0.87 per million tokens, while US labs moved prices in the opposite direction.

SaaStr just showed what's possible on the upside. Their GTM operation runs on 3 humans and 21 AI agents. It generated $2M in revenue and booked 614 meetings. Questex closed $1M in 90 days with two AI sales agents. Owner.com's BDR team produces $1.44M per rep per year.

Those results are real. But so are their costs. The winners know exactly what every meeting costs them.

A run-cost model is the difference between an AI agent that's profitable and one that's eating your margin from below the waterline.

What to Put in Your SOW

Demand these five things from any AI agency before you sign:

1. A per-unit cost forecast. Cost per lead enriched, per lead qualified, per meeting booked, per asset published. Not a range. A specific number based on stated assumptions.

2. Named models and stated token assumptions. Which LLM? What's the average token count per operation? What happens if the model vendor changes pricing? Anthropic has changed agent pricing three times this year already.

3. A cost cap or alert threshold. If variable costs exceed 120% of forecast, you should know immediately. Not at month-end.

4. Enrichment and verification line items. Every API they call on your behalf should be listed with its per-unit cost: Clay credits, Apollo exports, email verifications, proxy bandwidth.

5. A retry and failure rate assumption. API calls fail 5–15% of the time depending on the service. If each failed call is retried once, that means roughly 5–15% more token spend. If they don't account for retries, their cost model is wrong from day one.
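Items 3 and 5 are simple enough to sketch in a few lines. This version assumes spend accrues evenly through the month (so the forecast is pro-rated by day) and that each failed call is retried once. Both are simplifications, and the function names are hypothetical.

```python
def retry_buffer(token_cost: float, failure_rate: float) -> float:
    """Extra token spend if every failed call is retried once (ASSUMPTION)."""
    return token_cost * failure_rate


def spend_alert(spend_to_date: float, monthly_forecast: float, day_of_month: int,
                days_in_month: int = 30, threshold: float = 1.20):
    """Return (alert, ratio) comparing actual spend with the pro-rated forecast.

    ASSUMPTION: spend accrues linearly, so by day 10 of 30 you expect to have
    used one third of the monthly forecast.
    """
    if monthly_forecast <= 0:
        raise ValueError("monthly_forecast must be positive")
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be between 1 and days_in_month")
    expected_to_date = monthly_forecast * day_of_month / days_in_month
    ratio = spend_to_date / expected_to_date
    return ratio > threshold, ratio


# A 15% failure rate on a $65 token line adds about $10.
print(f"Retry buffer: ${retry_buffer(65.0, 0.15):.2f}")

# Day 10 of a $715/month forecast with $300 already spent.
alert, ratio = spend_alert(spend_to_date=300.0, monthly_forecast=715.0, day_of_month=10)
print(f"Spend is {ratio:.0%} of forecast to date; alert={alert}")
```

Run a check like this daily against the vendor dashboards. By the time a month-end invoice shows the overage, the money is already spent.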

At StoryPros, we put all of this in the SOW. Not because we love spreadsheets. Because we've built 100+ AI automations and learned that the build is the easy part. The monthly bill is where trust breaks down.

Most AI agencies are engineers who connected some APIs. They don't think about unit economics because they've never run a P&L. We came from sales and marketing. We think about cost per meeting the way a CFO thinks about cost per acquisition. The tool matters less than the cost architecture around it.

FAQ

What is a good cost per lead from an AI agent?

StoryPros benchmarks a well-built AI outbound agent at $0.25–$0.50 per lead processed and $1.50–$3.00 per qualified lead, using mid-tier LLM pricing (Claude Opus 4.8 at $5/$25 per million tokens) and standard enrichment costs. If your AI agency cost per lead exceeds $5 for qualified leads, the model needs rework — either the enrichment stack is too expensive or the qualification rate is too low.

How much does it cost to run an AI sales agent per month?

Variable run costs for a typical AI sales agent processing 2,000 leads per month range from $450 to $1,400, covering LLM tokens, lead enrichment, email verification, scraping proxies, and workflow hosting. This doesn't include the agency's build fee or management retainer. The biggest variable is enrichment — Clay credits alone can run $300–$1,000/month depending on your waterfall depth.

What should an AI agency include in the SOW for cost transparency?

A proper SOW from an AI agency should include a run-cost model with per-unit cost forecasts (cost per lead, per meeting, per asset), named LLM models with token-count assumptions, enrichment API pricing per lookup, a retry/failure rate buffer of 10–15%, and a cost cap or alert threshold. If the SOW only lists a flat monthly retainer with no variable cost breakdown, you're signing a blank check.

How do LLM pricing changes affect AI agent run costs?

LLM pricing shifts directly impact your monthly bill. Anthropic's June 15, 2026 change moves automated agent usage to a separate credit pool capped at $20–$200/month per user. When it runs out, your agent stops working. DeepSeek V4-Pro at $0.435/$0.87 per million tokens is roughly 29x cheaper than Claude Opus 4.8 on output tokens ($0.87 versus $25) and about 11x cheaper on input ($0.435 versus $5). Your agency should specify which model they use and what happens if the vendor raises prices mid-contract.

How do you automate lead generation with AI while controlling costs?

Start with your unit economics target: how much can you spend per booked meeting and still be profitable? Work backward from there. Choose LLM models based on cost-per-output, not benchmarks. Use enrichment waterfalls that check cheap sources first (Apollo) before expensive ones (Clay). Verify every email address before sending. Build retry logic that caps at 3 attempts per API call. Owner.com generates $1.44M per BDR per year with this kind of structured approach — the tool matters less than the cost architecture around it.
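Two of those controls, the cheap-first waterfall and the 3-attempt retry cap, fit in one short sketch. Everything here is hypothetical: the source names and prices are placeholders, `TransientError` is an illustrative exception, and the billing rule (you pay for every completed lookup, hit or miss) is an assumption you should confirm with each vendor.

```python
import time
from typing import Callable, Optional


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def call_with_retry(fn: Callable[[], Optional[dict]], max_attempts: int = 3,
                    base_delay: float = 0.5) -> Optional[dict]:
    """Call fn, retrying on TransientError up to max_attempts with backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # cap reached: stop spending on this call
            time.sleep(base_delay * 2 ** (attempt - 1))
    return None  # only reached if max_attempts < 1


def enrich(lead: dict, sources: list, max_attempts: int = 3,
           base_delay: float = 0.5):
    """Try sources from cheapest to most expensive; return (record, spent)."""
    spent = 0.0
    for _name, unit_cost, lookup in sorted(sources, key=lambda s: s[1]):
        try:
            record = call_with_retry(lambda: lookup(lead), max_attempts, base_delay)
        except TransientError:
            continue  # source still failing after the cap; try the next one
        spent += unit_cost  # ASSUMPTION: billed per completed lookup
        if record is not None:
            return record, spent  # stop before touching pricier sources
    return None, spent


# Stub sources with placeholder prices, standing in for real vendor calls.
sources = [
    ("expensive_source", 0.40, lambda lead: {"email": "jane@example.com"}),
    ("cheap_source", 0.05, lambda lead: None),  # cheap, but no match here
]
record, spent = enrich({"domain": "example.com"}, sources, base_delay=0)
print(record, f"${spent:.2f}")
```

The return value carries the spend, so every enriched lead arrives with its own cost attached. That's the number you roll up into cost per meeting.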

AI Answer

How much do hidden run costs add up to for an AI sales agent per month?

Variable run costs for an AI sales agent processing 2,000 leads per month range from $450 to $1,400. This covers LLM tokens, lead enrichment, email verification, scraping proxies, and workflow hosting. The agency build fee and retainer are separate.

AI Answer

What should a cost per booked meeting look like from an AI outbound agent?

A properly modeled AI outbound agent should book meetings at roughly $36 per meeting, based on $0.36 per lead processed, a 20% qualification rate, and a 5% booking rate from qualified leads. If your agency cannot give you a specific per-meeting cost before you sign, their cost model is incomplete.

AI Answer

What should an AI agency include in the SOW besides the build fee?

The SOW must include a run-cost table with per-unit costs for LLM tokens, enrichment, email verification, and proxies. It should name the specific LLM model used, state token-count assumptions, list a 10-15% retry buffer, and include a cost cap or alert if variable costs exceed 120% of forecast.