Agent Architecture
Understand how the agent loop works, how multi-agent systems decompose tasks, and why safety controls are non-negotiable in production.
The $40 loop that did nothing
An engineer shipped an agent without a step limit. The agent ran for 847 steps, burned $40 in API credits, and produced... nothing useful. It just kept searching, re-searching, and re-re-searching the same thing with slightly different queries. Nobody noticed for an hour because the logs looked busy.
One missing line of code. Forty dollars. Zero output.
This module teaches you how agents work, how to build them safely, and how to stop them from running away with your budget.
What is an agent?
You already know what an LLM does: you give it text, it gives you text back. A single turn. Ask → answer. Done.
An agent takes that further: it puts the LLM in a loop with tools. The LLM doesn't just answer — it thinks, takes an action (like searching the web or querying a database), observes the result, and then thinks again. Over and over until the task is done.
Think of the difference like this:
- Regular LLM = a student taking a closed-book exam. One shot. Whatever they know, they write down.
- Agent = a student doing a research project. They can Google things, check databases, run calculations, and revise their answer based on what they find.
The agent loop: Think → Act → Observe
Every agent follows the same three-step loop:
[Diagram: the agent Think → Act → Observe loop]
Think: The LLM looks at the task and everything it's observed so far, then decides what to do next. "I need to find the competitor's pricing page."
Act: The agent calls a tool — search the web, query a database, run code, send an email. The LLM doesn't execute anything directly; it asks the tool to do it.
Observe: The agent reads the tool's output. "The search returned 10 results about competitor pricing."
Then back to Think: "I got pricing for 3 competitors but I'm missing one. Let me search again." And the loop continues.
The critical question: When does it stop? Two ways:
- The LLM decides it has enough information and outputs a final answer.
- A safety limit kicks in (max_steps, timeout, cost cap).
Without option 2, you get the $40 loop from the top of this page.
There Are No Dumb Questions
"How is this different from just calling the API multiple times in my code?"
When YOU write the loop, you decide the logic: "Call the API, parse the result, then call it again with these parameters." With an agent, the LLM decides the logic. It chooses WHICH tool to call, WHAT to pass to it, and WHETHER to continue. You give up control — which is powerful but dangerous.
"Can the agent do anything? Like delete files or send emails?"
Only if you give it tools that do those things. The agent can only use the tools you define. Giving it a "delete_file" tool is like giving a toddler scissors — technically possible, but you'd better be watching closely.
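The answer above can be sketched as a tool registry: the agent can only call what is registered, and anything else simply doesn't exist from its point of view. This is an illustrative pattern, not any framework's API; the tool names and the `register_tool` decorator are made up for the example.

```python
import ast
import operator

# The agent's entire world of actions lives in this dict.
TOOLS = {}

def register_tool(name):
    """Decorator that adds a function to the agent's tool registry."""
    def wrapper(fn):
        TOOLS[name] = fn
        return fn
    return wrapper

@register_tool("search_web")
def search_web(query: str) -> str:
    # Stub; a real tool would call a search API.
    return f"results for: {query}"

@register_tool("calculate")
def calculate(expression: str) -> float:
    """Evaluate simple arithmetic safely (deliberately NOT eval())."""
    ops = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
    def ev(node):
        if isinstance(node, ast.BinOp):
            return ops[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

# Note what's missing: there is no delete_file here. If it isn't
# registered, the agent cannot call it, no matter what it "decides".
```

The scissors stay in the drawer unless you put them on the table.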
Trace the Agent Loop
Safety controls: the three non-negotiables
Without safety controls, an agent is a credit card with no spending limit held by someone who makes their own decisions. Here are the three guards you must always set:
1. Step budget (max_steps)
Cap the total number of Think-Act-Observe cycles. If the agent hasn't finished in 10 steps, something is wrong — kill it.
MAX_STEPS = 10

def run_agent(llm, tool, task):
    observations = []
    for step in range(MAX_STEPS):
        thought = llm.think(task, observations)
        if thought.is_done:
            return thought.answer
        result = tool.execute(thought.action)
        observations.append(result)
    # Graceful degradation: return a partial answer rather than crashing.
    # The agent didn't finish, but it gathered observations — surface those
    # instead of raising an exception that leaves the caller with nothing.
    return AgentResult(
        status="incomplete",
        partial_output=llm.summarise_partial(task, observations),
        message=f"Agent stopped after {MAX_STEPS} steps without reaching a conclusion.",
    )
Two schools of thought: raise an exception (fail loud, easier to debug) vs. return partial output (fail graceful, better user experience). In production, do both — raise internally so logs catch it, but return the best partial answer to the caller so the user isn't left staring at an error page.
2. Timeout
Even within a step budget, a single tool call can hang forever. Set a timeout on every tool call.
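A minimal sketch of that per-call timeout, assuming tools are plain Python functions. `run_with_timeout` is a hypothetical helper built on the standard library's `concurrent.futures`, not part of any agent framework.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def run_with_timeout(tool_fn, args, timeout_s=10.0):
    """Run one tool call, giving up after timeout_s seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(tool_fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Surface the timeout as an observation the agent can react to,
        # instead of hanging the whole loop.
        return {"error": f"tool call timed out after {timeout_s}s"}
    finally:
        # wait=False so we don't block on the stuck call.
        pool.shutdown(wait=False)
```

One caveat: Python can't forcibly kill a thread that has already started, so a truly hung call keeps running in the background even after the loop has moved on. For hard isolation, run tools in a subprocess instead.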
3. Permission boundaries
Define exactly what the agent CAN and CANNOT do:
| Safe tools | Dangerous tools |
|---|---|
| search_web (read-only) | delete_file (destructive) |
| query_db (read-only) | send_email (irreversible) |
| calculate (no side effects) | execute_code (could do anything) |
For dangerous tools, add a confirmation gate — the agent proposes the action, but a human approves it before execution.
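Here is one way to sketch that confirmation gate. `DANGEROUS_TOOLS`, `approve`, and `gated_execute` are illustrative names, and the `input()` prompt stands in for whatever real approval flow you use (a Slack button, a review queue, a CLI prompt).

```python
DANGEROUS_TOOLS = {"delete_file", "send_email", "execute_code"}

def approve(tool_name, args):
    """Stand-in for a real human-approval flow."""
    answer = input(f"Agent wants to call {tool_name}({args}). Allow? [y/N] ")
    return answer.strip().lower() == "y"

def gated_execute(tool_name, tool_fn, args, approver=approve):
    """Run safe tools directly; dangerous tools need a human's yes."""
    if tool_name in DANGEROUS_TOOLS and not approver(tool_name, args):
        # Denials become observations, so the agent can plan around them
        # instead of crashing.
        return {"error": f"human denied call to {tool_name}"}
    return tool_fn(*args)
```

Passing the approver in as a parameter keeps the gate testable: in tests you swap in a function that always approves or always denies, without touching the gate logic itself.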
There Are No Dumb Questions
"What about cost limits?"
Great idea. Track cumulative token spend and kill the agent if it exceeds a budget (e.g., $5). This catches the case where each individual step is fine but the total adds up.
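A rough sketch of such a cost cap. The per-token prices below are made-up placeholders, not any provider's real rates; check your provider's current pricing.

```python
class CostTracker:
    """Accumulate estimated spend and flag when a budget is exceeded."""

    def __init__(self, budget_usd=5.00,
                 price_in_per_1k=0.003, price_out_per_1k=0.015):
        # Placeholder prices per 1K tokens; substitute real ones.
        self.budget = budget_usd
        self.spent = 0.0
        self.p_in = price_in_per_1k
        self.p_out = price_out_per_1k

    def record(self, tokens_in, tokens_out):
        """Call after every LLM step with that step's token counts."""
        self.spent += (tokens_in / 1000) * self.p_in
        self.spent += (tokens_out / 1000) * self.p_out

    def over_budget(self):
        return self.spent >= self.budget
```

Check `tracker.over_budget()` at the top of every loop iteration, exactly like the max_steps check, and bail out with a partial answer when it trips.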
Multi-agent systems: divide and conquer
Some tasks are too big for one agent. A multi-agent system splits the work across multiple specialised agents, coordinated by an orchestrator.
Think of it like a restaurant kitchen:
- The head chef (orchestrator) reads the order and assigns tasks
- The prep cook chops vegetables (one narrow job)
- The grill cook handles the steak (one narrow job)
- The pastry chef makes dessert (one narrow job)
The head chef doesn't chop vegetables — they coordinate. Each cook does one thing well.
Why this works better than one big agent:
- Testable: Each agent has one job — test it independently
- Replaceable: A failing agent swaps out without touching the others
- Debuggable: When something goes wrong, you know which agent broke
The dependency trap: In a typical pipeline, a Writer agent can't start until both a Research agent and a Data agent finish. One slow agent delays everything downstream. Always set timeouts on sub-agents and have fallback outputs so one delay doesn't freeze the pipeline.
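A minimal sketch of that idea, with stub functions standing in for the Research, Data, and Writer agents; `run_worker` and `orchestrate` are illustrative helpers, not a framework API.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

def run_worker(fn, arg, timeout_s, fallback):
    """Run one worker agent with a timeout and a fallback output."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, arg)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # A slow worker degrades the output; it doesn't freeze the pipeline.
        return fallback
    finally:
        pool.shutdown(wait=False)

def orchestrate(task, research_agent, data_agent, writer_agent):
    # Research and Data are independent of each other, so each gets its
    # own timeout and fallback.
    research = run_worker(research_agent, task, 30.0, "no research available")
    data = run_worker(data_agent, task, 30.0, "no data available")
    # Writer depends on both, so it runs only after they resolve.
    return writer_agent(task, research, data)
```

Note that the orchestrator itself does no real work: it passes the task down, collects results (or fallbacks), and hands them to the dependent worker.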
Design a Multi-Agent System
Tool call hallucination: the sneaky failure mode
Sometimes the agent invents a tool that doesn't exist ("I'll call analyze_sentiment()" — but you never defined that tool). Or it calls a real tool with fake arguments ("I'll search for file /data/report.csv" — that file doesn't exist).
This is tool call hallucination. The model is predicting what tool call is most likely, and sometimes the prediction is wrong.
The fix: Before executing ANY tool call, validate:
- Does this tool name exist in the registered tool list?
- Do the arguments match the tool's schema?
- Do the arguments refer to real resources (existing files, valid IDs)?
This validation happens at the system level, not in the prompt. A schema validator at the tool boundary catches hallucinated calls before they cause side effects.
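A sketch of what that boundary check might look like. The schema format here (a dict mapping required argument names to types) is a deliberate simplification; production systems typically validate against JSON Schema.

```python
# Each registered tool declares its required arguments and their types.
TOOL_SCHEMAS = {
    "search_web": {"query": str},
    "query_db":   {"sql": str},
}

def validate_tool_call(name, args):
    """Return (ok, reason). Reject hallucinated tools and bad arguments."""
    schema = TOOL_SCHEMAS.get(name)
    if schema is None:
        # The model invented a tool that was never registered.
        return False, f"unknown tool: {name}"
    missing = set(schema) - set(args)
    if missing:
        return False, f"missing arguments: {sorted(missing)}"
    unexpected = set(args) - set(schema)
    if unexpected:
        return False, f"unexpected arguments: {sorted(unexpected)}"
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            return False, f"{key} should be {expected_type.__name__}"
    return True, "ok"
```

A rejected call becomes an observation fed back to the model ("unknown tool: analyze_sentiment"), which usually prompts it to pick a real tool on the next Think step. Checking that arguments refer to real resources (existing files, valid IDs) is a separate lookup against your own systems, done after this structural check passes.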
Back to the $40 loop. After the incident, the team added three lines of code: a max_steps=20 limit, a 10-second per-tool timeout, and a logging alert that fires when any agent exceeds 15 steps. The agent that went on for 847 steps now fails fast at step 20 with an error log — and the engineer gets a Slack notification before $2 of credits are spent. Same agent. Three guardrails. Completely different failure mode.
Key takeaways
- An agent = LLM in a loop with tools. Think → Act → Observe → repeat. Powerful but dangerous without limits.
- Three non-negotiable safety controls: max_steps (step budget), timeout (per tool call), permission boundaries (what can the agent touch?).
- Multi-agent > one big agent for complex tasks. Each agent does one job. The orchestrator coordinates but never does real work.
- Tool call hallucination is real. Validate every tool call against the schema before execution.
Knowledge Check
1. An agent has been running for 45 steps. It calls search_web, gets results, then immediately calls search_web again with a slightly different query. This has happened 20 times. What is the most likely root cause?
2. An agent is given access to a file system tool. Which combination represents the minimum set of guardrails before deploying it to users?
3. In an orchestrator-worker multi-agent system, what must the orchestrator pass to a worker, and what should the worker return?
4. What is tool call hallucination, how does it typically manifest, and what system-level check catches it before the tool executes?