Most cost breakdowns either lowball the model bill or hide the change-management drag. Here is the real ledger from twelve recent SMB engagements: where the money goes, where it disappears, and the three sinks that swallow most of the budget. We map it to four common project archetypes — RAG over docs, AI in the CRM, AI underwriting, internal copilot — so you can self-locate before the first sales call.

There are two kinds of AI cost article on the open web. The first is written by a vendor and lowballs everything except the line item the vendor sells. The second is written by a tier-one consultancy and quotes ranges so wide they collapse into noise. Neither helps a 200-person company decide whether to spend €60k or €600k on its first serious AI integration.

What follows is the ledger we've actually seen across recent SMB engagements — what the line items are, where the surprises live, and which sinks swallow the budget. We're naming categories, not specific vendors, because the categories outlive the vendors.

The four archetypes

Every SMB AI integration we've shipped maps to one of four shapes. They have different cost profiles, different failure modes, and different break-even horizons. Knowing which archetype you're in is the first decision worth making.

  • RAG over docs. Retrieve over a corpus the team already owns; generate. Lowest floor cost; highest hidden retrieval-engineering cost. Typical band: €40-120k for production-grade.
  • AI in the CRM. Email drafts, lead scoring, call summaries, coaching. Higher integration tax (Salesforce/HubSpot adapters) but the data already exists and is mostly clean. €80-180k.
  • AI underwriting / scoring / decisioning. Output goes to a regulator, an examiner, or a counterparty. Eval and audit cost dominates. €150-400k.
  • Internal copilot. Slack/desktop assistant grounded in your wiki + tools. Usage-cost ceiling is real. €60-160k for the build, then ongoing inference.

Most SMB AI projects get priced on the model bill and lose the budget on the data layer.

The line items nobody tells you about

When a vendor shows you a cost slide, the line items are usually licences, model API spend, and a small bucket called 'integration'. The real cost has at least nine line items, and three of them swallow most of the budget.

1. Data preparation (the largest sink, every time)

In every engagement we have led, the data layer ate more time than every other workstream combined. Schema drift across regions. Identifiers that mean three different things in three different tables. Half-deduplicated records. PII redaction. The work is unglamorous, the team always thinks they have less of it than they actually do, and the AI integration cannot ship without it. Plan 35-45% of total budget here. If your vendor is quoting you 10%, ask them when they last shipped a system into a real warehouse.

2. Eval + observability

If the output goes to a customer, a regulator, or an internal decision-maker, you need a way to know it has not silently degraded. That means a golden dataset, regression tests, an observability stack, and someone who actually looks at the dashboards. Plan 15-20% here. If you skip it, your first prod incident is the one that surfaces the thing you should have built. We've watched teams discover a 4% drop in summary accuracy two weeks after it happened — every deal closed in those two weeks ran on the worse output.

3. Change management

The third sink is harder to put on a slide. The model can be perfect and the integration elegant, and the team can still fail to adopt it because the workflow asks them to do something subtly wrong. Training, internal documentation, the Slack channel where people share what is working, a human in the loop on the first 100 outputs. Budget 10-15%. The teams that skip this line are the same teams that complain six months later that "AI did not deliver value", without noticing that nobody is using it.

The other six line items

  • Model + inference cost. 8-15% of budget for most archetypes; up to 30% for high-volume internal copilots.
  • Vendor licences (vector store, eval tooling, observability platform). 4-10%. Often the line vendors talk about most.
  • Integration adapters (your CRM, your data warehouse, your auth). 8-15%.
  • Security review + procurement. 3-8%. Underestimated systematically. Adds 3-6 weeks of calendar time.
  • Internal time (your engineers, your domain experts, your project lead). Usually unbudgeted. 15-25% of total cost in real hours.
  • Production hardening (prompt caching, cost guardrails, retry logic, fallback model routing). 5-10%.
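The nine bands above can be sanity-checked in a few lines. Here is a toy sketch (ours, not any vendor's tooling; the names are made up) that takes the midpoint of each band and sums them. The total lands well over 100%, which is the point: the percentages overlap a nominal budget because internal time is usually never budgeted as cash at all.

```python
# Midpoints of the budget bands quoted above, as % of total project budget.
LINE_ITEMS = {
    "data_preparation":   (35, 45),
    "eval_observability": (15, 20),
    "change_management":  (10, 15),
    "model_inference":    (8, 15),   # up to 30 for high-volume copilots
    "vendor_licences":    (4, 10),
    "integration":        (8, 15),
    "security_review":    (3, 8),
    "internal_time":      (15, 25),  # real hours, usually left off the slide
    "prod_hardening":     (5, 10),
}

def midpoint_total(items):
    """Sum the midpoint of each (low, high) percentage band."""
    return sum((lo + hi) / 2 for lo, hi in items.values())

total = midpoint_total(LINE_ITEMS)
print(f"midpoint total: {total:.1f}% of nominal budget")  # 133.0%
```

That 33-point overshoot is roughly the unbudgeted internal-time line plus the optimism baked into the rest of the slide.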

Self-locating before the first sales call

A buyer who walks into a vendor pitch knowing their archetype, knowing the three sinks, and knowing the line items has a different conversation than one who does not. They can ask why the eval line is missing. They can ask which integration adapters are included and which are billed extra. They can ask what the data preparation assumption is — and notice when the answer is "your team will handle that".

Three questions worth getting answered before you sign anything: who owns the data layer, who owns the eval harness, and who is on the hook when output drifts in production. If two of three answers are "you" and the price still feels expensive, the price is probably right but the deliverable is not the one you thought you were buying.

A working envelope

For the median SMB AI integration we have shipped — somewhere between RAG and CRM-AI, with a 50-200 person company that has clean-enough data — the all-in cost lands at €120-220k for the first production system. Twelve to sixteen weeks. Two engineers, half a domain expert, and a project lead from the client. Ongoing inference costs of €600-2,000/month after that.
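For first-year planning, the envelope above reduces to a one-line sum: the one-off build plus twelve months of inference. A back-of-envelope sketch using the figures quoted (the function name is ours):

```python
def first_year_cost(build_eur, monthly_inference_eur, months=12):
    """All-in first-year cost: one-off build plus ongoing inference."""
    return build_eur + monthly_inference_eur * months

low  = first_year_cost(120_000, 600)    # optimistic end of the envelope
high = first_year_cost(220_000, 2_000)  # pessimistic end
print(f"first-year envelope: EUR {low:,.0f} - {high:,.0f}")
# first-year envelope: EUR 127,200 - 244,000
```

Note that inference is noise at this scale: even the pessimistic €24k/year is about a tenth of the build. The build, not the model bill, is the number to negotiate.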

If your vendor is quoting you €40k for the same shape of work, the difference is not their efficiency — it is the line items they have left off the slide. Read the slide carefully. Ask the three questions. Then make the decision.

Or skip ahead and talk it through directly.