
Single-Agent Systems
Build production-ready AI agents with simple loops, good tools, and minimal complexity.
The Best Agents Are Simple
After working with dozens of teams building LLM agents in production, a clear pattern has emerged: the best agents are simple loops, not complex frameworks.
Most successful single-agent systems:
• Avoid heavy abstractions
• Use direct LLM calls
• Rely on well-designed tools
• Add complexity only when it demonstrably improves outcomes
What a Single Agent Really Is
A single agent is not:
- Multiple models talking to each other
- A massive orchestration framework
- A fully autonomous black box
A single agent is simply:
An LLM that can decide what to do next, call tools, observe results, and continue — in a loop.
Workflows vs Agents
| Type | Control Flow | Characteristics |
|---|---|---|
| Workflows | Predefined code paths | Predictable, fast, cheap |
| Agents | Model-directed | Flexible, adaptive, more expensive |
A single agent sits at the minimal end of this spectrum: one model, one loop, optional tools.
Start Simpler Than You Think
Before building an agent, always ask:
• Can this be done with one LLM call + retrieval?
• Can prompt chaining solve it?
• Do I really need autonomy?
Agents trade latency + cost for flexibility. Use them only when:
- The number of steps is unknown
- The path cannot be hard-coded
- The model must react to environment feedback
The Core Building Block: Augmented LLM
Every effective agent starts with an augmented LLM:
• LLM — The reasoning engine
• Tools — Search, code execution, database queries, file operations
• Memory — Short summaries, not raw conversation logs (see the sketch after this list)
• Environment feedback — Tool results, errors, state changes
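On the memory point: one concrete approach is to periodically compress the transcript into a short summary and continue from that instead of the raw log. A minimal sketch, assuming dict-style messages and a summarization prompt and trigger of your own choosing:

```python
from openai import OpenAI

client = OpenAI()

def compact_memory(messages: list[dict]) -> list[dict]:
    # Compress everything after the system prompt into a short summary,
    # then continue the next round from that summary instead of the raw log.
    transcript = "\n".join(f"{m['role']}: {m.get('content') or ''}" for m in messages[1:])
    summary = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Summarize the key facts, decisions, and open questions in 3-5 bullets."},
            {"role": "user", "content": transcript},
        ],
    ).choices[0].message.content
    return [messages[0], {"role": "user", "content": f"Summary of progress so far:\n{summary}"}]
```

When and how often to compact is up to you; a simple message-count or token threshold is usually enough.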
You do not need a framework to build this. Tools connect your agent to external capabilities:
MCP Servers: Learn how to connect AI agents to external tools and data using the Model Context Protocol.
The Minimal Agent Loop
This is the pattern most production agents reduce to:
1. Call the LLM with:
   - Current messages
   - Tool definitions
   - Short state summary
2. If the model requests a tool:
   - Execute it in code
   - Append the result
3. If the model outputs a final answer:
   - Stop
4. Enforce stop conditions:
   - Max steps reached
   - No progress detected
   - Budget exceeded
That's it. No magic.
When to Use What
| Pattern | Use When |
|---|---|
| Prompt chaining | Fixed steps, higher accuracy needed |
| Routing | Different inputs need different handling |
| Parallelization | Speed matters or need multiple perspectives |
| Evaluator-optimizer | Iterative refinement improves quality |
| Single agent | Steps unknown, tool use required |
A single agent often performs these workflow patterns internally, but it decides when to apply them.
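For contrast, here is what the prompt-chaining row looks like in code: the steps are fixed in your program, not chosen by the model. A minimal sketch with an illustrative outline-then-write chain:

```python
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    # One plain LLM call; the control flow lives in Python, not in the model.
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Fixed, predefined steps: outline first, then write from the outline.
outline = ask("Outline a short report on the risks of self-hosting LLMs.")
report = ask(f"Write the report following this outline:\n{outline}")
print(report)
```

An agent would instead be handed the goal plus tools and would decide whether an outline step is needed at all.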
Tool Design Matters More Than Prompts
Anthropic's biggest production insight:
More time should be spent designing tools than writing prompts.
Tool Design Rules
• Prefer simple formats — Avoid diffs, heavy JSON escaping
• Match natural formats — Use what models have seen in training
• Require absolute paths — Never rely on relative paths
• Include examples — Show expected inputs and outputs
• Set boundaries — Define what the tool can and cannot do
Think of tools as APIs for a junior engineer — clarity beats cleverness.
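To make these rules concrete, here is a sketch of a hypothetical read_file tool definition that states its boundaries, requires absolute paths, and includes example values:

```python
# Hypothetical tool definition illustrating the rules above; adapt names
# and limits to your own system.
READ_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": (
            "Read a UTF-8 text file and return its contents. "
            "Read-only: this tool never modifies or deletes files. "
            "The path must be absolute, e.g. /home/user/project/notes.txt. "
            "Fails with an error message if the file does not exist or exceeds 1 MB."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "path": {
                    "type": "string",
                    "description": "Absolute path to the file, e.g. /var/data/report.csv",
                }
            },
            "required": ["path"],
        },
    },
}
```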
Minimal Single-Agent Example
This is intentionally barebones — no framework, no magic.
Python
from openai import OpenAI
import json

client = OpenAI()

TOOLS = [{
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string"}
            },
            "required": ["query"]
        }
    }
}]

messages = [
    {"role": "system", "content": "You are a helpful research agent."},
    {"role": "user", "content": "Find the main risks of self-hosting LLMs."}
]

MAX_STEPS = 5

for step in range(MAX_STEPS):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=TOOLS,
        tool_choice="auto"
    )
    msg = response.choices[0].message

    # Tool call
    if msg.tool_calls:
        tool_call = msg.tool_calls[0]
        if tool_call.function.name == "search":
            # run_search is your own search helper (implementation not shown);
            # tool arguments arrive as a JSON string.
            args = json.loads(tool_call.function.arguments)
            result = run_search(args["query"])
            messages.append(msg)
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
        continue

    # Final answer
    messages.append(msg)
    print(msg.content)
    break

TypeScript
import OpenAI from 'openai'

const client = new OpenAI()

const tools: OpenAI.ChatCompletionTool[] = [{
  type: 'function',
  function: {
    name: 'search',
    description: 'Search the web for information',
    parameters: {
      type: 'object',
      properties: {
        query: { type: 'string' }
      },
      required: ['query']
    }
  }
}]

const messages: OpenAI.ChatCompletionMessageParam[] = [
  { role: 'system', content: 'You are a helpful research agent.' },
  { role: 'user', content: 'Find the main risks of self-hosting LLMs.' }
]

const MAX_STEPS = 5

for (let step = 0; step < MAX_STEPS; step++) {
  const response = await client.chat.completions.create({
    model: 'gpt-4o',
    messages,
    tools,
    tool_choice: 'auto'
  })
  const msg = response.choices[0].message

  // Tool call
  if (msg.tool_calls?.length) {
    const toolCall = msg.tool_calls[0]
    if (toolCall.function.name === 'search') {
      // runSearch is your own search helper (implementation not shown);
      // tool arguments arrive as a JSON string.
      const { query } = JSON.parse(toolCall.function.arguments)
      const result = await runSearch(query)
      messages.push(msg)
      messages.push({
        role: 'tool',
        tool_call_id: toolCall.id,
        content: result
      })
    }
    continue
  }

  // Final answer
  messages.push(msg)
  console.log(msg.content)
  break
}

This is a complete single agent. No frameworks. No abstractions. Fully debuggable.
Adding Reliability
1. Planning Step (Optional)
Ask the model to briefly outline its approach before acting:
messages = [
    {"role": "system", "content": """You are a research agent.
Before taking action, briefly outline your plan in 1-2 sentences.
Then execute step by step."""},
    {"role": "user", "content": user_query}
]

2. Verification Step
After the final answer, run a second LLM pass:
verification = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a quality checker."},
        {"role": "user", "content": f"""
Original request: {user_query}
Agent response: {final_answer}
Does this fully satisfy the request?
If not, explain what's missing."""}
    ]
)

3. Guardrails
• Human approval for destructive tools (delete, send, publish)
• Hard limits on steps and token budget
• Sandbox execution for code and file operations
• Rate limiting to prevent runaway loops
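A minimal sketch of how the step limit, token budget, no-progress check, and approval gate above could be wired into the loop from the earlier example (it reuses client, messages, TOOLS, and run_search from that example); the budget value, the destructive tool names, and the approved() helper are illustrative assumptions:

```python
import json

# Illustrative limits and tool names; tune them for your own workload.
MAX_STEPS = 5
TOKEN_BUDGET = 20_000
DESTRUCTIVE_TOOLS = {"delete_file", "send_email"}

def approved(tool_name: str) -> bool:
    # Human-in-the-loop gate: only destructive tools need explicit approval.
    if tool_name not in DESTRUCTIVE_TOOLS:
        return True
    return input(f"Allow {tool_name}? [y/N] ").strip().lower() == "y"

tokens_used = 0
previous_call = None

for step in range(MAX_STEPS):
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=TOOLS, tool_choice="auto"
    )

    # Budget limit: the API reports usage on every response.
    tokens_used += response.usage.total_tokens
    if tokens_used > TOKEN_BUDGET:
        print("Token budget exceeded, stopping.")
        break

    msg = response.choices[0].message
    if not msg.tool_calls:
        print(msg.content)  # final answer
        break

    call = msg.tool_calls[0]

    # No-progress check: stop if the model repeats the exact same tool call.
    if (call.function.name, call.function.arguments) == previous_call:
        print("No progress detected, stopping.")
        break
    previous_call = (call.function.name, call.function.arguments)

    if not approved(call.function.name):
        break

    # Execute the tool and append the result, as in the minimal example above.
    args = json.loads(call.function.arguments)
    messages.append(msg)
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,
        "content": run_search(args["query"]),
    })
```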
When Single Agents Shine
| Fit | Use Cases |
|---|---|
| Strong ✓ | Research tasks, coding agents, customer support with tools, internal automation |
| Weak ✗ | Strict compliance workflows, ultra-low latency systems, tasks with known fixed steps |
Common Mistakes
• Over-engineering — Starting with frameworks before proving the simple version works
• Ignoring costs — Agents can burn tokens fast; always track usage
• Raw conversation memory — Summarize, don't append everything
• Vague tools — Unclear tool descriptions lead to misuse
• No stop conditions — Always have max steps and timeout
The Real Takeaway
The best agents aren't clever. They're boringly well-designed loops.
Start with:
- One model
- One loop
- A few excellent tools
Only add orchestration, workers, or frameworks after you can prove the simple version isn't enough.
Summary
- Start simple — One LLM, one loop, direct tool calls
- Design tools carefully — This matters more than prompts
- Add stop conditions — Max steps, budget limits, progress checks
- Verify outputs — Optional second pass for quality
- Scale complexity only when proven necessary
Happy building! 🚀