Building AI Products: Cost vs Client Expectations

I recently worked on a feature that reminded me how different engineering priorities can be from product priorities.

The project involves AI-powered matching.

A user submits a request, the system analyzes it, finds suitable matches, and then shows a preview before asking the user to pay to unlock the full report.

Simple enough on the surface.

The real challenge started when we began thinking about cost.

Every AI request costs money. And in products with previews, there’s always the possibility that users repeatedly test the system without ever converting into paying customers. If AI runs on every preview request, token usage can grow very quickly.

So my first approach was simple.

For the preview stage, I avoided AI completely.

Instead, I built a manual, rule-based matching system that generated a compatibility score from predefined logic. It was fast, predictable, and most importantly, cheap to run.
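To make "predefined logic and rules" concrete, here is a minimal sketch of what a rule-based score could look like. The post does not describe the real rules or the data being matched, so the `Profile` fields, the three rules, and their weights are all hypothetical and only illustrate the idea: fixed rules, no AI call, same input always gives the same score.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    # Hypothetical fields; the real product's matching criteria are not known.
    skills: frozenset[str]
    location: str
    budget: int


def compatibility_score(request: Profile, candidate: Profile) -> int:
    """Return a deterministic 0-100 score from fixed, hand-written rules."""
    score = 0.0
    # Rule 1 (weight 60): share of the requested skills the candidate covers.
    if request.skills:
        overlap = len(request.skills & candidate.skills) / len(request.skills)
        score += 60 * overlap
    # Rule 2 (weight 25): both sides are in the same location.
    if request.location == candidate.location:
        score += 25
    # Rule 3 (weight 15): the candidate fits within the requested budget.
    if candidate.budget <= request.budget:
        score += 15
    return round(score)


request = Profile(frozenset({"react", "vue"}), "Lagos", 100)
candidate = Profile(frozenset({"react"}), "Lagos", 80)
print(compatibility_score(request, candidate))  # 70
```

Because nothing here touches a model, a user can refresh the preview as often as they like without adding to the token bill.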

The AI only kicked in once the user paid. At that point, the system would generate the detailed analysis and full report.

From an engineering perspective, this made perfect sense:

reduce unnecessary AI calls
protect operational costs
avoid wasting tokens on non-paying users
make scaling financially sustainable

But then another issue appeared.

The manual system generated its own score.
The AI-generated report also produced a score.

Sometimes they matched closely.
Sometimes they didn’t.

Technically, neither score was “wrong.” They were simply generated differently.

But from a user’s perspective, it created confusion.

A user could see one score before payment and another after payment.

That naturally raises questions:

Which score is correct?
Was the preview accurate?
Can the system be trusted?

So I adjusted the architecture.

I removed AI scoring entirely and standardized everything around the manual scoring system.

Now:

preview scoring used the manual system
final reports used the same manual score
AI only generated explanations and detailed analysis
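The adjusted architecture can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: `build_preview`, `build_report`, and the `explain` callable are hypothetical names, and `explain` stands in for whatever AI call produces the written analysis. The point is that the score is computed once by the manual system and passed through, so the AI never gets a chance to produce a competing number.

```python
from typing import Callable


def build_preview(score: int) -> dict:
    """Unpaid preview: shows the manual score, makes no AI call."""
    return {"score": score, "locked": True}


def build_report(score: int, explain: Callable[[int], str]) -> dict:
    """Paid report: reuses the same score; the AI only writes the text."""
    # `explain` is a placeholder for the real AI call (assumed interface).
    return {"score": score, "locked": False, "analysis": explain(score)}


def fake_explain(score: int) -> str:
    # Stand-in for the AI so the sketch runs without network access.
    return f"Detailed analysis for a compatibility score of {score}."


manual_score = 70  # produced once by the manual scoring system
preview = build_preview(manual_score)
report = build_report(manual_score, fake_explain)
print(preview["score"] == report["score"])  # True
```

Passing the score into the explanation step, rather than asking the AI to score again, is what keeps the number identical before and after payment.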

To me, this solved both problems:

consistent user experience
controlled infrastructure cost

The user would see one score throughout the journey, while the business avoided burning tokens on users who never intended to pay.

But the client still didn’t agree.

They wanted the entire flow to be AI-driven from the beginning, including the unpaid preview.

Their argument was that if the product is marketed as AI-powered matching, then the preview itself should also come from AI. They didn’t want users seeing what they considered a “manual” result before payment.

And this is where I strongly disagreed.

Because from my perspective, this is exactly the kind of decision that quietly destroys margins in AI products.

The reality is that users abuse previews.

People test systems repeatedly.
People refresh requests.
People experiment without converting.
Some users are simply curious and never intend to pay.

In traditional software, that may not matter much.

But in AI products, every request has a direct operational cost attached to it.

That changes the equation completely.

One of the biggest engineering mistakes I think teams make right now is treating AI calls as if they are free. They are not.

If your business model depends on users paying to unlock value, then spending money before the user commits financially should be carefully controlled.

Especially at scale.

For me, the manual matching system was not “fake AI.”

It was an intentional engineering optimization:

cheaper
faster
more scalable
more predictable

And after standardizing the scoring system, the consistency issue had already been solved.

At that point, I felt the remaining concern was more about perception than actual product quality.

But at the end of the day, engineering decisions in client work are not made by engineers alone.

Part of professional software engineering is presenting tradeoffs clearly, explaining risks, and recommending what you believe is the strongest technical approach.

After that, the final business decision belongs to the client.

So eventually, we moved to a fully AI-driven flow:

AI generates the preview
AI generates the final report
AI runs before payment

Which means:

token costs increase
some AI calls are guaranteed to be wasted
non-paying users still consume resources
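A rough way to put numbers on that tradeoff is to ask how much AI spend sits behind each paying user. The function below is a back-of-the-envelope sketch; every figure in the example (cost per call, previews per visitor, conversion rate) is made up for illustration and does not come from the project.

```python
def ai_cost_per_paying_user(
    preview_cost: float,
    report_cost: float,
    previews_per_visitor: float,
    conversion_rate: float,
    ai_preview: bool,
) -> float:
    """Expected AI spend attributable to each paying user."""
    if not 0 < conversion_rate <= 1:
        raise ValueError("conversion_rate must be in (0, 1]")
    # With a rule-based preview, previews cost nothing in tokens.
    preview_spend = preview_cost * previews_per_visitor if ai_preview else 0.0
    # Every visitor's previews are paid for, but only converters bring in
    # revenue, so preview spend is divided by the conversion rate.
    return preview_spend / conversion_rate + report_cost


# Illustrative numbers only: $0.01 per preview, $0.05 per report,
# 3 previews per visitor, 2% of visitors convert.
ai_flow = ai_cost_per_paying_user(0.01, 0.05, 3, 0.02, ai_preview=True)
rule_flow = ai_cost_per_paying_user(0.01, 0.05, 3, 0.02, ai_preview=False)
print(f"AI preview: ${ai_flow:.2f}, rule-based preview: ${rule_flow:.2f}")
```

With these assumed inputs, the fully AI-driven flow costs about $1.55 per paying user against $0.05 for the rule-based preview, and almost all of the difference is spent on visitors who never pay. The lower the conversion rate, the wider that gap gets.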

Do I think it was the optimal engineering decision?
No.

But I do think it was a valuable reminder that software development is rarely just about writing the best technical solution.

Sometimes you are balancing:

engineering principles
business perception
client expectations
operational cost
user trust

And those things do not always point in the same direction.
