Nyyon · Blog
The Gateways-Tools-Workflows Model That Kills Token Waste
June 21, 2026
The gateways-tools-workflows model confines expensive AI reasoning to where it is required and runs plain code everywhere else, so you build once and reuse.
Most AI agent architectures waste money because they ask a reasoning model to do work that ordinary code already does at almost no cost. The fix is a three-layer model: AI gets gateways to communicate with services, tools to act through those gateways, and workflows to chain tools toward an outcome. Reasoning, the expensive part, stays confined to the points where a real decision happens. Everything else is code you write once and reuse. That single structural choice is the difference between a workflow that scales and one that quietly drains your token budget every time it runs.

The dominant pattern, and why it bleeds tokens
The default agent build treats the model as the whole machine. The AI decides what to call, formats the request, parses the response, decides what to do next, and reasons its way through steps that never changed and never will. Every run pays full price for that reasoning, even when the path is identical to the last hundred runs.

This is the same laziness as defaulting to the biggest frontier model for every task. You give a deterministic job (fetch a record, send an email, log a result) to a system designed for open-ended reasoning. It works. It also costs you a multiple of what it should, and it gets slower and less predictable as the workflow grows.
The trap is that it feels productive. The agent looks autonomous. But autonomy on a fixed path that does not need it is not intelligence; it is overhead. You are paying for thinking in places where there is nothing left to think about.
The gateways-tools-workflows model, defined
The model has three layers, and the discipline is knowing which layer owns which job.
A gateway is a code connection to an external service. It is plain code, not AI. It knows how to talk to your email system, your dialler, your CRM, your data store. It does not reason. It connects.
A tool is a code function that does one thing through a gateway. Fetch a contact. Send a message. Update a status. A tool is not the AI writing a request and calling an API blind. It is a defined capability the AI already has inside the confines of a workflow.
A workflow is the chain that uses tools and gateways to reach an outcome. This is where reasoning lives, and only here. The model decides which tool to call next when there is a genuine decision to make. Where the next step is fixed, the workflow runs as code with no model in the loop.
The principle underneath all of it: reasoning is very expensive, so you confine it to where reasoning is actually required. Everywhere else, you write code. You build once, and you reuse. That is what kills the token waste.
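To make the three layers concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in rather than the system described in this post: `CrmGateway` is an in-memory substitute for a real CRM connection, `fetch_contact` and `update_status` are illustrative tools, and `decide` is a placeholder for the one model call the workflow needs. The revenue threshold in the stub is invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CrmGateway:
    """Gateway: plain code that talks to one service (in-memory stand-in here)."""
    records: dict = field(default_factory=dict)

    def get(self, contact_id: str) -> dict:
        return dict(self.records[contact_id])

    def put(self, contact_id: str, record: dict) -> None:
        self.records[contact_id] = dict(record)


def fetch_contact(crm: CrmGateway, contact_id: str) -> dict:
    """Tool: one action through the gateway."""
    return crm.get(contact_id)


def update_status(crm: CrmGateway, contact_id: str, status: str) -> dict:
    """Tool: one action through the gateway."""
    record = crm.get(contact_id)
    record["status"] = status
    crm.put(contact_id, record)
    return record


ALLOWED_STATUSES = {"qualified", "rejected"}


def qualify_lead(crm: CrmGateway, contact_id: str,
                 decide: Callable[[dict], str]) -> dict:
    """Workflow: fixed steps run as code; `decide` is the only reasoning call."""
    contact = fetch_contact(crm, contact_id)       # fixed step, no model
    status = decide(contact)                       # the single decision point
    if status not in ALLOWED_STATUSES:             # never trust free-form output
        raise ValueError(f"unexpected decision: {status!r}")
    return update_status(crm, contact_id, status)  # fixed step, no model


def stub_decide(contact: dict) -> str:
    """Hypothetical stand-in for a model call, so the sketch runs offline."""
    return "qualified" if contact["monthly_revenue"] >= 10_000 else "rejected"


crm = CrmGateway({"c1": {"name": "Acme", "monthly_revenue": 25_000}})
result = qualify_lead(crm, "c1", stub_decide)
```

Passing `decide` in as a parameter keeps the reasoning boundary visible: the workflow has exactly one place where a model could be involved, and its output is checked against a fixed set before any tool acts on it.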
How it worked in practice
I built a command centre for a business funding operation that had to orchestrate email outreach, an AI SDR, and dialler operations at the same time. A lot of the logic that ran that system would have been impossible to implement if the AI had to reason its way through every step.
Looked at through gateways, tools, and workflows, it became possible. The gateways connected to the email platform, the dialler, and the data layer. The tools were the discrete actions on top of them. The workflows decided the sequence: who gets a call, who gets an email, what happens when someone replies, what the SDR does next.
The model only reasoned at the decision points that mattered. The plumbing (the sending, the fetching, the logging) was code. Built once, reused on every run, at no marginal reasoning cost. That is the difference between a workflow you can afford to run continuously and one you can only afford to demo.
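A rough sketch of what "reasoning only at the decision points" can look like for an outreach sequence. This is not the command centre's actual logic: the lead fields, the action names, and the rules are assumptions made up for illustration, and `classify_reply` stands in for whatever model call would interpret a reply.

```python
from typing import Callable, Optional


def plan_outreach(leads: list[dict],
                  classify_reply: Callable[[str], str]) -> tuple[list, int]:
    """Return the planned (lead_id, action) pairs and how many model calls were made."""
    actions = []
    model_calls = 0
    for lead in leads:
        reply: Optional[str] = lead.get("reply")
        if reply is None:
            # Fixed rule, no model: call if we have a number, otherwise email.
            action = "call" if lead.get("phone") else "email_intro"
        else:
            # Genuine decision: a free-text reply needs interpreting.
            model_calls += 1
            intent = classify_reply(reply)
            action = "call" if intent == "interested" else "email_follow_up"
        actions.append((lead["id"], action))
    return actions, model_calls


def stub_classifier(reply: str) -> str:
    """Hypothetical stand-in for a model call, so the sketch runs offline."""
    return "interested" if "yes" in reply.lower() else "not_now"


leads = [
    {"id": "a", "phone": "555-0100"},
    {"id": "b"},
    {"id": "c", "phone": "555-0101", "reply": "Yes, call me tomorrow"},
]
actions, model_calls = plan_outreach(leads, stub_classifier)
```

In this example three leads are processed and the classifier is consulted once, for the only lead whose next step depends on reading free text.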
The same logic you already use for model routing
If you route hard tasks to big models and easy tasks to small models, you already understand this. Gateways-tools-workflows is that logic moved one level down, inside the architecture of a single workflow.
You don't send a simple classification to a frontier model. You don't make a senior executive write production code. And you don't make a reasoning model handle a database fetch. In each case you match the cost of the resource to the difficulty of the decision. Meaningful decisions go to the expensive layer. Execution and small decisions go to the cheap one.
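The matching rule can be written down as a small routing function. This is a sketch under assumptions: the `Step` fields (`needs_judgement`, `open_ended`) and the three executor tiers are hypothetical labels chosen for the example, not a standard taxonomy.

```python
from dataclasses import dataclass
from enum import Enum


class Executor(Enum):
    CODE = "code"
    SMALL_MODEL = "small_model"
    LARGE_MODEL = "large_model"


@dataclass(frozen=True)
class Step:
    name: str
    needs_judgement: bool     # is there a real decision to make?
    open_ended: bool = False  # does the decision need open-ended reasoning?


def route(step: Step) -> Executor:
    """Match the cost of the resource to the difficulty of the decision."""
    if not step.needs_judgement:
        return Executor.CODE
    return Executor.LARGE_MODEL if step.open_ended else Executor.SMALL_MODEL


steps = [
    Step("fetch_record", needs_judgement=False),
    Step("classify_reply", needs_judgement=True),
    Step("draft_custom_offer", needs_judgement=True, open_ended=True),
]
plan = {step.name: route(step).value for step in steps}
```

The useful part is less the function than the forced labelling: each step has to declare whether it involves a decision at all before anything is wired to a model.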

The waste in most agent builds is exactly the failure to make that separation. They send everything to the most capable layer because it is easier to wire one big model than to design where reasoning belongs. Easier to build, expensive to run, and the bill compounds with every execution.
What changes when you adopt it
The first thing that changes is your token spend. When reasoning is confined to real decision points, you stop paying for thinking on the rails. A workflow that previously ran every step through the model now runs most steps as code, and the model cost drops to what the handful of genuine decisions per run requires.
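A back-of-envelope version of that arithmetic, with loudly hypothetical numbers: the step counts and the tokens-per-step figure below are invented for illustration and are not measurements from any real system. The sketch also assumes every model step costs roughly the same number of tokens, which real workflows will not.

```python
def model_tokens_per_run(model_steps: int, tokens_per_step: int) -> int:
    """Tokens spent on model calls in one run; steps run as code cost zero tokens."""
    return model_steps * tokens_per_step


# Illustrative assumptions only.
TOTAL_STEPS = 12         # steps in the workflow
DECISION_STEPS = 2       # steps that involve a genuine decision
TOKENS_PER_STEP = 1_500  # prompt plus completion tokens per model call

all_model = model_tokens_per_run(TOTAL_STEPS, TOKENS_PER_STEP)
confined = model_tokens_per_run(DECISION_STEPS, TOKENS_PER_STEP)
saved_fraction = 1 - confined / all_model

print(f"every step through the model: {all_model} tokens per run")
print(f"decision points only:         {confined} tokens per run")
print(f"reduction:                    {saved_fraction:.0%}")
```

Under these assumptions the saving scales with the share of steps that are not decisions, so the more plumbing a workflow contains, the more the separation is worth.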

The second thing that changes is reliability. Code is deterministic. A gateway that fetches a record does it the same way every time. When you push that work out of the model, you remove a class of failures where the AI improvises a request, misreads a response, or reasons its way into a path you did not intend.
The third thing is reuse. Once a gateway and its tools exist, every future workflow uses them without rebuilding. You build the email gateway once and every campaign, every SDR sequence, every lifecycle flow calls it. The marginal cost of the next workflow drops because the connective tissue already exists.
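Reuse is easiest to see with two workflows sharing one gateway. In this sketch `EmailGateway` is a hypothetical in-memory stand-in that records messages instead of sending them, and the template names and the day-zero rule are made up for the example.

```python
class EmailGateway:
    """Gateway: built once. This stand-in records messages instead of sending them."""

    def __init__(self) -> None:
        self.outbox: list[tuple[str, str]] = []

    def send(self, to: str, template: str) -> None:
        self.outbox.append((to, template))


def send_template(gateway: EmailGateway, to: str, template: str) -> None:
    """Tool: one action through the gateway."""
    gateway.send(to, template)


def campaign_workflow(gateway: EmailGateway, recipients: list[str]) -> None:
    """Workflow 1: a one-off campaign blast."""
    for recipient in recipients:
        send_template(gateway, recipient, "campaign_launch")


def lifecycle_workflow(gateway: EmailGateway, recipient: str,
                       days_since_signup: int) -> None:
    """Workflow 2: a lifecycle flow, reusing the same tool and gateway."""
    template = "welcome" if days_since_signup == 0 else "check_in"
    send_template(gateway, recipient, template)


email = EmailGateway()
campaign_workflow(email, ["a@example.com", "b@example.com"])
lifecycle_workflow(email, "c@example.com", days_since_signup=0)
```

Neither workflow knows how email is delivered. Swapping the stand-in for a real provider would change one class and leave both workflows untouched.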
The trade-off you take on honestly
This is more upfront engineering than pointing a model at a goal and letting it run. You have to write the gateways. You have to define the tools. You have to map where decisions actually live and draw the line between reasoning and code. That is real work, and a guardrail-free agent skips it entirely.
That skipped work is exactly why the guardrail-free agent is unhealthy. Let a model loose to reach a goal however it wants, and it might get there, but it can burn through a large share of your budget along the way. The gateways-tools-workflows model front-loads the thinking so the running is cheap. You trade a harder build for a workflow you can afford to run forever.
There is also a discipline cost. The temptation is always to let the AI handle one more step because it is faster to write. Every time you give in, you put reasoning back on a rail that did not need it, and the savings erode. Gateways-tools-workflows is a principle you have to hold, not a switch you flip once.
Why this is the spine, not a feature
The reason to build this way is that the work turns into reusable building blocks. A gateway is a unit. A tool is a unit. A workflow is an assembly of units. Once you have a library of them, new outcomes become combinations of things you already built rather than ground-up engineering.
That is the opposite of the current default, where each agent is a bespoke reasoning loop that pays full price on every run and shares nothing with the next one. Build the spine of gateways and tools, confine reasoning to the workflows that genuinely need it, and you stop renting intelligence for jobs that plain code does better and cheaper.