If you have used AI coding tools like Cursor, Windsurf, or Claude Code, you have probably wondered: how do they actually work? How does an LLM edit files, run commands, and iterate on its own output across multiple turns?
The answer is surprisingly straightforward. At their core, these tools are multi-turn AI agents — and the fundamental architecture can be built in under 400 lines of code. Understanding this pattern is essential for anyone building AI-powered automation, including the kind of enterprise modernization work we do at CloudHedge.
What Is an Agent?
An AI agent is not a chatbot. A chatbot takes your input, generates a response, and stops. An agent operates in a continuous loop — observing its environment, deciding what to do, executing actions, and repeating until the task is complete.
The agent loop in one sentence: observe the current state, decide what tool to use, execute that tool, feed the result back to the model, and repeat until the model says it is done.
This is sometimes called the observe-decide-execute pattern, and it is the same fundamental loop that powers everything from robotic process automation to autonomous vehicles. In the LLM context, the "observation" is the conversation history plus tool results, the "decision" is the model's next response, and the "execution" is running whatever tool the model selected.
The Architecture
A multi-turn agent has exactly four components:
- A system prompt that tells the model who it is and what tools are available
- A conversation history (array of messages) that grows each turn
- A set of tools the model can invoke (functions with defined schemas)
- A while loop that keeps calling the model until it produces a final response without tool calls
That is the entire architecture. No frameworks, no orchestration layers, no vector databases. Just a loop, a prompt, and tools.
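To make the first three components concrete, here is a minimal sketch in Python. The names (`SYSTEM_PROMPT`, the `tools` dict) are illustrative choices, not a fixed API; the fourth component, the loop, is shown in the next section.

```python
# Illustrative sketch of the first three components of an agent.
SYSTEM_PROMPT = (
    "You are a coding agent. You can read files, write files, "
    "and run shell commands via the tools provided."
)

# Conversation history: grows by one or more messages every turn.
messages = [{"role": "system", "content": SYSTEM_PROMPT}]

# Tool registry: the model refers to tools by name; we dispatch to functions.
tools = {
    "read_file": lambda path: open(path).read(),
    "write_file": lambda path, content: open(path, "w").write(content),
}
```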
The Agent Loop
Here is the core loop in pseudocode:
```
while True:
    response = call_llm(messages)
    if response.has_tool_calls:
        messages.append(response.message)  # record the assistant's tool calls
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            messages.append(tool_result(result))
    else:
        # Model gave a final text response — we're done
        print(response.text)
        break
```
Every "multi-turn" interaction is just this loop running multiple iterations. The model calls a tool, gets the result appended to the conversation, and then decides whether to call another tool or deliver a final answer. The model itself controls the flow.
The Three Core Tools
A code-editing agent needs surprisingly few tools to be effective. Three are sufficient to handle the vast majority of tasks:
1. Read File
The read_file tool takes a file path and returns its contents. This is how the agent inspects existing code, configuration files, or documentation. Without the ability to read, the agent is working blind.
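A minimal sketch of this tool in Python might look like the following; note that errors are returned as text rather than raised, so the model can see what went wrong and adjust:

```python
from pathlib import Path

def read_file(path: str) -> str:
    """Return a file's contents, or an error message the model can act on."""
    try:
        return Path(path).read_text()
    except OSError as e:
        # Return errors as text so the agent can recover (e.g. try another path)
        return f"ERROR: {e}"
```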
2. Write File
The write_file tool takes a file path and content, then writes (or overwrites) the file. This is the agent's primary mechanism for making changes. Some implementations use a more granular "edit" tool that applies diffs, but a simple write is sufficient for a working agent.
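A simple implementation, again returning errors as text the model can react to (creating parent directories is a convenience choice, not a requirement):

```python
from pathlib import Path

def write_file(path: str, content: str) -> str:
    """Write (or overwrite) a file, creating parent directories as needed."""
    try:
        p = Path(path)
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_text(content)
        return f"Wrote {len(content)} characters to {path}"
    except OSError as e:
        return f"ERROR: {e}"
```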
3. Execute Command
The run_command tool executes a shell command and returns its output (stdout and stderr). This gives the agent the ability to run tests, install dependencies, check git status, compile code, and verify its own work. This is the most powerful tool because it connects the agent to the entire operating system.
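One way to sketch this tool with Python's `subprocess` module, including the timeout discussed in the guardrails section below:

```python
import subprocess

def run_command(command: str, timeout: int = 30) -> str:
    """Run a shell command and return its exit code, stdout, and stderr as text."""
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
        return (
            f"exit code: {proc.returncode}\n"
            f"stdout: {proc.stdout}"
            f"stderr: {proc.stderr}"
        )
    except subprocess.TimeoutExpired:
        # Report timeouts as text so the agent can adapt instead of crashing
        return f"ERROR: command timed out after {timeout}s"
```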
Why three tools are enough: Read gives the agent eyes. Write gives it hands. Execute gives it the ability to verify and interact with the world. Together, they form a complete feedback loop.
System Prompt Design
The system prompt is where you define the agent's behavior, capabilities, and constraints. A well-designed system prompt includes:
- Identity and role: What the agent is and what it specializes in
- Available tools: JSON schemas describing each tool's parameters
- Guidelines: Rules about when and how to use each tool
- Constraints: What the agent should never do (e.g., delete production databases)
- Output format: How the agent should structure its final responses
The quality of the system prompt directly determines the quality of the agent. A vague prompt produces a vague agent. A precise prompt that explicitly handles edge cases produces a reliable one.
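As an example of the tool schemas mentioned above, here is what a schema for `read_file` might look like. The exact wrapper keys vary by provider, so treat this shape as illustrative:

```python
# Hypothetical JSON schema describing the read_file tool to the model.
READ_FILE_SCHEMA = {
    "name": "read_file",
    "description": "Read the contents of a file at the given relative path.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Relative path to the file"}
        },
        "required": ["path"],
    },
}
```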
Guardrails
Running arbitrary code is inherently dangerous. Production agents need guardrails:
- Command allowlists: Only permit specific commands or command patterns
- File path restrictions: Limit which directories the agent can read from or write to
- Confirmation prompts: Require human approval for destructive operations
- Timeout limits: Kill commands that run longer than expected
- Sandboxing: Run the agent in a container or VM with limited permissions
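The first two guardrails can be sketched in a few lines of Python. The allowlist contents and the `/workspace` root are example values; a real deployment would pull these from policy configuration (note `Path.is_relative_to` requires Python 3.9+):

```python
import shlex
from pathlib import Path

ALLOWED_COMMANDS = {"ls", "cat", "git", "pytest", "npm"}  # example allowlist
WORKSPACE = Path("/workspace")  # example sandbox root

def command_allowed(command: str) -> bool:
    """Allow only commands whose executable is on the allowlist."""
    parts = shlex.split(command)
    return bool(parts) and parts[0] in ALLOWED_COMMANDS

def path_allowed(path: str) -> bool:
    """Allow only paths that resolve inside the workspace (blocks ../ escapes)."""
    return (WORKSPACE / path).resolve().is_relative_to(WORKSPACE.resolve())
```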
At CloudHedge, CHAI's agent architecture applies these same principles at enterprise scale. When CHAI Flow decomposes a monolith into microservices, every transformation step is validated, tested, and reversible. The agents operate within strict guardrails defined by the organization's policies.
Conversation Memory and Context
The conversation history is the agent's working memory. Every tool call and result is appended to the message array, giving the model full context of what it has done and what happened. This is what makes agents "multi-turn" — the model can reference previous steps, learn from errors, and build on earlier work.
However, conversation history grows with each turn, and LLMs have finite context windows. Production agents need strategies to manage this:
- Summarization: Periodically summarize older messages to compress history
- Sliding window: Keep only the most recent N messages
- Selective retention: Keep tool results that are still relevant, drop those that are not
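The sliding-window strategy is the simplest of the three and can be sketched in a few lines; the one subtlety is that the system prompt must always survive the trim:

```python
def trim_history(messages: list[dict], max_recent: int = 20) -> list[dict]:
    """Sliding window: always keep the system prompt, plus the N newest messages."""
    system, rest = messages[:1], messages[1:]
    return system + rest[-max_recent:]
```

Summarization and selective retention build on the same idea but replace the dropped span with a compressed stand-in rather than discarding it outright.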
Chatbot vs. Agent: A Comparison
| Dimension | Chatbot | Agent |
|---|---|---|
| Interaction | Single request-response | Multi-turn loop |
| Tools | None (text only) | Read, Write, Execute, and more |
| Flow control | User drives every step | Model drives autonomously |
| Error handling | User must retry manually | Agent retries and self-corrects |
| Verification | None | Runs tests, checks output |
| Context | Single message | Full conversation + tool results |
| Complexity | Simple API call | Loop + tools + state management |
Multi-Agent Systems
Once you have one agent working, the natural next step is orchestrating multiple agents. This is where things get interesting for enterprise use cases. A director agent can break a large task into subtasks and delegate each to a specialized worker agent.
This is exactly how CHAI works at scale. CHAI Universe acts as the discovery agent, mapping the entire application landscape. CHAI DART acts as the assessment agent, analyzing each application's architecture and dependencies. CHAI Flow acts as the execution agent, performing the actual modernization, containerization, and deployment. Each agent is specialized, and they coordinate through a shared understanding of the application portfolio.
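The director/worker pattern itself is a small extension of the single-agent loop. In this hypothetical sketch, `run_agent` is stubbed out; a real worker would run the full observe-decide-execute loop shown earlier:

```python
def run_agent(role: str, task: str) -> str:
    # Placeholder: a real worker would run the full agent loop for this task
    return f"[{role}] completed: {task}"

def director(goal: str, subtasks: list[str]) -> list[str]:
    """Delegate each subtask to a worker agent and collect the results."""
    # A real director would derive the subtasks from the goal via the LLM;
    # here they are passed in to keep the sketch self-contained.
    return [run_agent("worker", task) for task in subtasks]
```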
Conclusion
The core architecture of a multi-turn AI agent is remarkably simple: a while loop that calls an LLM, executes tools, and repeats. The complexity comes not from the loop itself but from the system prompt design, the tool implementations, the guardrails, and the orchestration of multiple agents for larger tasks.
At CloudHedge, we have taken this pattern and applied it to one of the hardest problems in enterprise software: legacy application modernization. CHAI's agentic architecture — with specialized agents for discovery, assessment, and transformation — handles the kind of complex, multi-step work that previously required armies of consultants and years of effort.
Understanding how agents work is the first step toward building with them. The second step is putting them to work on problems that matter.