AI Operations  ·  IMA AI

AI Gets Better.
Humans Still Sleep.
That Is Why I Moved My Team to API.

I was paying USD 65 per staff member per month for three AI subscriptions: Claude, ChatGPT, Gemini. Every month. Whether the team used them or not. Then came a month packed with public holidays, and I finally did the math I should have done earlier.

Published: June 2026
By: Chin Qi Yong, CEO, IMA AI
© 2026 Chin Qi Yong
Read time: ~5 min

The moment I started questioning it

Last month Malaysia had more public holidays than usual. Long weekends stacked. The team took leave. Projects paused between holiday breaks. It was a good month for rest.

Then the billing came through.

Same amount as every other month. Three AI subscriptions per staff. Claude Pro, ChatGPT Plus, Gemini Advanced. Flat monthly charge. No discount for the week of Wesak Day. No adjustment for the long weekends. No acknowledgement that the tools sat completely idle for several days. The platforms do not know your team is on holiday. They do not know your staff spent three days in client meetings and generated nothing. They charge the same whether your team produces at full capacity or produces nothing at all.

That was the moment the logic broke for me. I had been treating AI subscriptions the same way I think about office utilities — a fixed cost of operating. But office utilities make sense as a fixed cost because the space is there every day whether you use it or not. AI subscriptions are not infrastructure. They are production capacity. And production capacity should cost money when it produces, not when it exists.

The subscription math — what you are actually paying

When we started rolling out AI tools company-wide, the thinking was straightforward: give everyone access to the best tools, let them build the habit, and the productivity gains will follow. That logic is not wrong. But the cost arithmetic of how we did it was.

Per staff — monthly subscription cost
Claude Pro: USD 25 / month
ChatGPT Plus: USD 25 / month
Gemini Advanced: USD 15 / month
Total per staff: USD 65 / month

USD 65 per person per month. Multiply by headcount and that is a real number — one that grows the moment you hire someone new, and stays fixed the moment productivity dips, projects slow, or your team takes a long weekend.

The uncomfortable question is: what are you actually getting for USD 65? You are getting access to the tools — not generation. Not output. Not results. Access. The subscription pricing model is built on the assumption that you will use the tools consistently enough to justify the flat rate. It is priced for full utilisation.

The subscription model assumes your team is generating with AI tools during most of their working hours, most working days, all month. If they are not, you are subsidising the platform — not the other way around.

The human problem that subscriptions do not account for

AI models are improving every quarter. The capability bar keeps moving — faster generation, sharper reasoning, longer context, better output quality. The models do not slow down. They do not take breaks. They do not have off days.

Humans do.

One of my staff said it plainly: "AI is getting better, but human not. Human still need to eat and sleep."

That sentence broke the subscription logic for me. Subscriptions are priced against AI capability — essentially, what the tool can do at its maximum. But your team does not operate at the tool's maximum. They operate at human capacity, which includes lunch, sleep, leave, meetings, strategy sessions, client calls, administrative work, and days where the energy simply is not there for high-volume generation.

Realistically, a knowledge worker who uses AI tools as part of their workflow — not as their entire job — might be actively generating for two to three hours per working day. On a good day. Across a 22-day working month, that is roughly 44 to 66 hours of active generation. The subscription charges for 24/7 availability across all 30 days of the month.

What utilisation actually looks like
Potential daily availability: 24 hours
Realistic active generation / day: 2–3 hours
Working days / month: ~22 days
Days lost to holidays, leave, meetings: 3–5 days (minimum)
Estimated utilisation of subscription capacity: ~20–30%

At 25% utilisation, you are effectively paying four times the list price for every hour the tools are actually in use. The subscription bill does not shrink with your utilisation, and the platform has no incentive to make it shrink.
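The utilisation estimate can be reproduced with a few lines of arithmetic. This is a rough sketch, not the author's own calculation: it assumes an 8-hour working day as the measure of capacity (the article does not state what the percentage is measured against), and the function name is made up for illustration.

```python
def utilisation(active_hours_per_day, working_days, days_lost, hours_per_workday=8.0):
    """Share of nominal working-hour capacity spent actively generating.

    hours_per_workday=8 is an assumption, not a figure from the article.
    """
    productive_days = working_days - days_lost
    active_hours = active_hours_per_day * productive_days
    capacity_hours = hours_per_workday * working_days
    return active_hours / capacity_hours


# Pessimistic case: 2 hours a day, 5 days lost. Optimistic: 3 hours, 3 days lost.
low = utilisation(active_hours_per_day=2, working_days=22, days_lost=5)
high = utilisation(active_hours_per_day=3, working_days=22, days_lost=3)

# A flat fee spread over only the used share of capacity.
multiplier_at_25 = 1 / 0.25

print(f"utilisation range: {low:.0%} to {high:.0%}")   # 19% to 32%
print(f"price multiplier at 25% utilisation: {multiplier_at_25:.0f}x")  # 4x
```

Under those assumptions the range lands close to the 20–30% in the table. Measured against the full 24-hour, 30-day month instead, the figure would be far lower.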

API: you only pay for what actually happens

The most common pushback I hear when I tell people we moved to API: "But API pricing is higher per token. Subscriptions are better value."

This is technically correct at 100% utilisation. At 20–30% utilisation, it is not. Not even close.

API pricing is consumption-based. The meter runs on tokens generated — input and output. When your staff is in a client meeting, the meter is silent. When the team takes a long weekend, the meter is silent. When a project is paused for two weeks between phases, the meter is silent. You pay for what happened, not for what could have happened.

Subscription pricing charges for the month regardless of what occurred in it. Every idle day is still a day you paid for. Every holiday is billed at the same rate as a full production day.

The real comparison
The question is not "which option has a lower per-token cost?" The question is: what is my team's actual generation volume, and what does that volume cost under each model? When I ran that calculation against our actual usage patterns, accounting for holidays, project cycles, and realistic generation hours, the API came out cheaper. The per-token rate is higher. The monthly bill is lower, because the bill now scales with what we actually generate rather than with the calendar.
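The comparison can be sketched as a small calculation. The USD 65 figure comes from the article; the per-token rates and the token volumes below are placeholders chosen for illustration, not real vendor prices or the team's real usage. Substitute the numbers from your own invoices.

```python
SUBSCRIPTION_USD_PER_SEAT = 65.0  # from the article: three subscriptions per staff

# Hypothetical blended API rates in USD per million tokens.
# These are placeholders, NOT real vendor prices.
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0


def api_cost_usd(input_tokens, output_tokens):
    """Monthly API cost for one person under the placeholder rates."""
    return ((input_tokens / 1e6) * INPUT_USD_PER_MTOK
            + (output_tokens / 1e6) * OUTPUT_USD_PER_MTOK)


def cheaper_option(input_tokens, output_tokens):
    """Return (billing model, monthly cost) for whichever costs less."""
    api = api_cost_usd(input_tokens, output_tokens)
    if api < SUBSCRIPTION_USD_PER_SEAT:
        return ("api", api)
    return ("subscription", SUBSCRIPTION_USD_PER_SEAT)


# A quiet month (holidays, paused projects) versus a heavy month.
quiet = cheaper_option(input_tokens=2_000_000, output_tokens=500_000)
heavy = cheaper_option(input_tokens=20_000_000, output_tokens=5_000_000)

print(quiet)  # ('api', 13.5)
print(heavy)  # ('subscription', 65.0)
```

The heavy-month result matters as much as the quiet one: at sustained high volume the flat rate can still win, which is why the calculation has to be run on real usage rather than assumed.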

Why we will never commit to an annual AI plan

Monthly subscriptions are expensive when underutilised. Annual plans are worse — they are expensive and they lock you in.

We learned this the hard way. When Seedance 2.0 launched and impressed the world with its video generation quality, we committed to annual subscriptions. Advance payment. Locked in. It felt like the smart move — annual pricing is almost always cheaper per month than monthly billing, and we were confident the tools would carry us through the year.

Then Alibaba Cloud introduced Happy Horse. Then Google introduced Omni. Both were significantly more powerful than what we had committed to. Better quality, more capability, better value for the generation tasks we actually needed. We were sitting on annual Seedance subscriptions with advance payment made, watching better options arrive in the market — and unable to move without writing off what we had already paid.

We were not stuck because we made a bad decision at the time. We were stuck because AI advanced faster than our contract allowed us to move. That is the risk no one talks about when they pitch you the annual plan discount.

The pace of AI development right now is measured in months and quarters, not years. A model that leads its category today has, at best, a six-month window before something meaningfully better arrives. In video generation, image generation, language models — the gap between "best available" and "what just launched" closes faster than any annual contract can accommodate.

This is not pessimism about any specific model or platform. It is an observation about the rate of progress in the field. When the entire industry is shipping major capability jumps every quarter, a twelve-month financial commitment is a twelve-month bet that nothing better will come along. That bet has lost every time we have made it.

The lesson we drew: in a market advancing this fast, flexibility is worth more than the annual discount. Monthly billing — or better, consumption-based API pricing — keeps you free to move when the next meaningful jump arrives. And it will arrive. The question is only when.

We now have a hard internal rule: no annual prepayment for any AI tool, regardless of the discount offered. If the tool is worth using, we will pay the monthly rate. If something better arrives, we will switch. The cost of flexibility is a slightly higher per-month rate. The cost of being locked in is using a tool you know has been surpassed — and paying for it anyway.

The hidden cost that the billing dashboard does not show

Cost was the trigger. But there was a second problem I had been quietly absorbing without naming it.

Every AI subscription is a separate memory. Claude does not know what context your staff built in Gemini. ChatGPT does not know what brand guidelines you have been reinforcing in Claude. When a staff member switches tools — because one is better at long-form writing, another at structured research, another at code — they rebuild context from scratch every single time.

For a team trying to produce consistent output — consistent voice, consistent quality, consistent process — three disconnected subscriptions create three disconnected workflows. The tools are individually capable. Together, they create fragmentation. And fragmentation creates rework, inconsistency, and a ceiling on how good your team's AI-assisted output can ever be.

I was paying for three tools that did not talk to each other. That is not a workflow. That is three separate subscriptions hoping the human in the middle handles the integration manually.

What we did about it

We moved the team to API and consolidated access into a single internal interface. One place to reach all the models the team needs. One system where context is shared and consistent. One workflow that does not require staff to manage separate accounts, remember which tool is best for which task, or rebuild the same brand context repeatedly in three different platforms.
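The article does not describe how the internal interface is built. One possible shape, as a minimal sketch only: a single object that stores the shared context once, prepends it to every request, and records usage per call. Every name here (`UnifiedInterface`, `register`, `generate`) is hypothetical, and the backends are stubs standing in for real vendor API clients.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# A backend is any function that takes a full prompt and returns text.
# A real deployment would wrap each vendor's API client; these are stubs.
Backend = Callable[[str], str]


@dataclass
class UnifiedInterface:
    shared_context: str  # e.g. brand guidelines, written once
    backends: Dict[str, Backend] = field(default_factory=dict)
    usage_log: List[dict] = field(default_factory=list)

    def register(self, name: str, backend: Backend) -> None:
        self.backends[name] = backend

    def generate(self, model: str, task: str) -> str:
        # The same context is prepended whichever model is chosen.
        prompt = f"{self.shared_context}\n\n{task}"
        output = self.backends[model](prompt)
        # Character counts stand in for token counts in this sketch.
        self.usage_log.append(
            {"model": model, "prompt_chars": len(prompt), "output_chars": len(output)}
        )
        return output


hub = UnifiedInterface(shared_context="Brand voice: plain, direct, no jargon.")
hub.register("writer", lambda p: f"[writer draft based on {len(p)} chars]")
hub.register("researcher", lambda p: f"[research notes based on {len(p)} chars]")

print(hub.generate("writer", "Draft a product announcement."))
print(hub.generate("researcher", "Summarise competitor pricing."))
print(len(hub.usage_log))  # 2
```

Keeping the context and the usage log in one place is the design point: staff do not rebuild the brand brief per tool, and the log gives the per-call data needed to check the bill against actual generation.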

The immediate result: the monthly AI cost dropped, not because we gave the team less access, but because we stopped paying for access that was not being used. The team retained access to Claude, GPT, Gemini, and other models. The bill now reflects what they actually generate — not what they theoretically could generate if they worked every hour of every day.

But the cost saving was just the starting point. As we built the internal tool, the objective expanded. We realised the real opportunity was not just unified access to AI models. It was automation — a pipeline that could take a brief, generate a script, and continue into image and video production without requiring a human to manually carry the work from one tool to the next. That is what we are now building. I will write about that separately.

The lesson from the subscription-to-API move is simpler than the technical detail suggests. Subscriptions are a bet that your team will use the tools consistently enough to justify the flat rate. Most teams do not. Most teams have seasons, holidays, project cycles, and human limitations that create large stretches of idle time that the subscription bill does not acknowledge.

If you are running AI subscriptions for a team of any meaningful size, run the actual numbers. Not the headline price. The effective cost per prompt at your team's actual generation volume, accounting for the days and weeks when the tools sit unused. You may find you have been making a comfortable assumption that the math does not support.
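As a rough illustration of that check, the effective cost per prompt under a flat subscription is simply the monthly fee divided by the prompts actually sent. The prompt counts below are invented for illustration; only the USD 65 fee comes from the article.

```python
MONTHLY_FEE_USD = 65.0  # per-staff subscription total from the article


def effective_cost_per_prompt(monthly_fee_usd, prompts_sent):
    """Flat fee divided by actual usage; infinite if nothing was sent."""
    if prompts_sent == 0:
        return float("inf")
    return monthly_fee_usd / prompts_sent


# Hypothetical monthly prompt counts, from an idle month to a heavy one.
for prompts in (0, 50, 300, 1500):
    cost = effective_cost_per_prompt(MONTHLY_FEE_USD, prompts)
    print(f"{prompts:>5} prompts -> USD {cost:.2f} per prompt")
```

The headline price stays at USD 65 in every row. What moves is the price of each piece of work the team actually produced.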

AI capability is growing faster than human capacity to use it. Paying for maximum AI availability when your team operates at human availability is a structural mismatch. The subscription model makes that mismatch invisible. The API bill does not.
Chin Qi Yong
CEO, IMA AI
Chin Qi Yong is the CEO of IMA AI. IMA AI builds AI-powered infrastructure for commerce and content operations in Malaysia. Currently building Ultra Studio — an internal automation platform for script-to-content production.

Published by IMA AI — June 2026.