DiamantAI Blog

AI Agent Architecture Cheatsheet

Visual reference guide to the five major agent patterns. Understand when to use each and how they work under the hood.

Not every AI task needs an agent, and not every agent needs the same architecture. Choosing the right pattern determines whether your system is reliable and efficient or slow and unpredictable. This cheatsheet covers the five foundational agent architectures, with text diagrams showing the control flow, key components, and honest trade-offs for each. Use this as a quick reference when designing your next AI system.

ReAct Loop (Foundational)

The ReAct (Reasoning + Acting) pattern interleaves thinking and acting. The LLM reasons about what to do, takes an action (calls a tool), observes the result, then reasons again. This cycle repeats until the task is complete.

+--------------+
|  User Query  |
+------+-------+
       |
       v
+-----------------------------+
|        LLM REASONING        |<----+
|  "I need to search for..."  |     |
+--------------+--------------+     |
               |                    |
               v                    |
+-----------------------------+     |
|     ACTION (Tool Call)      |     |
|  search("query"), calc(42)  |     |
+--------------+--------------+     |
               |                    |
               v                    |
+-----------------------------+     |
|         OBSERVATION         |     |
|   Tool returns result...    |     |
+--------------+--------------+     |
               |                    |
            Done? ------ NO --------+
               |
              YES
               |
               v
+-----------------------------+
|        Final Answer         |
+-----------------------------+
When to use: General-purpose tasks that require 1-5 tool calls, research tasks, question-answering with tool access, any task where the number of steps is not known in advance.
Key components:

  • LLM - Reasoning engine
  • Tools - APIs, search, code
  • Scratchpad - Running history
  • Stop Condition - Max iterations

Advantages

  • Simple to implement
  • Flexible -- handles diverse tasks
  • Transparent reasoning trace
  • Recovers from tool errors naturally

Limitations

  • Can get stuck in loops
  • No upfront planning
  • Greedy -- may miss better strategies
  • Cost grows with iterations
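The reason-act-observe cycle described above fits in a few lines of Python. This is a minimal sketch: `demo_llm` is a hypothetical stand-in for a real model call, and the names (`react_loop`, the `("thought", ...)` scratchpad tuples) are illustrative, not from any library.

```python
# Minimal ReAct loop skeleton. `llm_step` is a stand-in for a real LLM call:
# given the scratchpad, it returns a thought plus either a tool invocation
# or a "finish" action carrying the final answer.

def react_loop(query, tools, llm_step, max_iters=5):
    scratchpad = [("question", query)]           # running history
    for _ in range(max_iters):                   # stop condition
        thought, action, arg = llm_step(scratchpad)
        scratchpad.append(("thought", thought))
        if action == "finish":                   # model decided it is done
            return arg
        observation = tools[action](arg)         # ACTION -> OBSERVATION
        scratchpad.append(("observation", observation))
    return "Stopped: max iterations reached"

# Toy demo: one search step, then finish.
def demo_llm(scratchpad):
    if not any(kind == "observation" for kind, _ in scratchpad):
        return "I should search first", "search", "agent patterns"
    return "I have enough information", "finish", "ReAct interleaves reasoning and acting"

tools = {"search": lambda q: f"results for {q!r}"}
answer = react_loop("What is ReAct?", tools, demo_llm)
```

Note how the max-iteration guard implements the stop condition from the component list: it is what keeps a looping model from running forever.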

Plan-and-Execute (Structured)

Separates planning from execution. A planner LLM creates a high-level plan of steps, then an executor LLM (often cheaper) carries out each step. After each step, the planner can revise the remaining plan based on results.

+--------------+
|  User Query  |
+------+-------+
       |
       v
+------------------+      +---------------------------+
|     PLANNER      |      | Plan:                     |
|   (Strong LLM)   +----->|  1. Search for X          |
+--------+---------+      |  2. Extract data from Y   |
         |                |  3. Compare results       |
         |                |  4. Generate report       |
         |                +---------------------------+
         v
+------------------+
|     EXECUTOR     |      Step 1 --> Result 1
|  (Can be cheaper |      Step 2 --> Result 2
|       LLM)       |      Step 3 --> Result 3
+--------+---------+
         |
         |   (After each step, the planner can REPLAN)
         v
+------------------+
|    REPLANNER     |      "Step 3 failed, adjusting..."
|  (Revise plan)   |      New Step 3 --> ...
+--------+---------+
         |
         v
+------------------+
|   Final Output   |
+------------------+
When to use: Complex tasks with 5+ steps, research reports, data pipelines, any workflow where you want oversight and can define success criteria per step.
Key components:

  • Planner - Creates step list
  • Executor - Runs each step
  • Replanner - Adapts on failure
  • State Tracker - Step results

Advantages

  • Better for complex, multi-step tasks
  • Executor can use cheaper model
  • Plan provides a progress indicator
  • Replanning handles failures gracefully

Limitations

  • More complex to implement
  • Initial plan may be wrong
  • Higher latency (plan + execute)
  • Replanning adds cost
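The plan, execute, replan control flow can be sketched directly. This is a hedged sketch: `demo_plan`, `demo_execute`, and `demo_replan` are hypothetical stand-ins for the LLM calls a real system would make, kept as plain functions so the loop itself is visible.

```python
# Minimal plan-and-execute skeleton. The planner produces a step list, the
# executor runs steps one at a time, and a failed step (None) triggers the
# replanner, which revises the remaining plan.

def plan_and_execute(query, plan, execute_step, replan):
    steps = plan(query)                          # PLANNER: high-level step list
    results = []                                 # STATE TRACKER
    while steps:
        step = steps.pop(0)
        outcome = execute_step(step, results)    # EXECUTOR
        if outcome is None:                      # step failed
            steps = replan(step, steps)          # REPLANNER revises the rest
        else:
            results.append((step, outcome))
    return results

# Toy demo: the second step fails once and gets replaced.
def demo_plan(query):
    return ["search", "broken step", "report"]

def demo_execute(step, results):
    return None if step == "broken step" else f"done: {step}"

def demo_replan(failed, remaining):
    return ["fixed step"] + remaining

results = plan_and_execute("write a report", demo_plan, demo_execute, demo_replan)
```

A production version would also cap the number of replans, since a replanner that keeps emitting failing steps would otherwise loop indefinitely.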

Multi-Agent Orchestration (Advanced)

Multiple specialized agents collaborate on a task. An orchestrator routes sub-tasks to the right agent. Each agent has its own system prompt, tools, and expertise. Agents may communicate with each other directly or through shared memory.

                   +--------------+
                   |  User Query  |
                   +------+-------+
                          |
                          v
+--------------------------------------------------+
|                   ORCHESTRATOR                   |
|              Routes tasks to agents              |
+-------+-----------------+----------------+-------+
        |                 |                |
        v                 v                v
+----------------+  +------------+  +--------------+
|   RESEARCHER   |  |   CODER    |  |   REVIEWER   |
|  - web search  |  |  - execute |  |  - analyze   |
|  - read docs   |  |  - debug   |  |  - validate  |
|  - summarize   |  |  - test    |  |  - critique  |
+-------+--------+  +-----+------+  +------+-------+
        |                 |                |
        +--------+--------+-----------+----+
                 |                    |
                 v                    v
         +----------------+  +------------------+
         |  Shared Memory |  |   Final Output   |
         +----------------+  +------------------+
When to use: Tasks requiring diverse expertise (research + code + analysis), software development workflows, content production pipelines, any scenario where specialization outperforms generalization.
Key components:

  • Orchestrator - Task routing
  • Specialist Agents - Domain experts
  • Shared Memory - Knowledge store
  • Message Bus - Inter-agent comms

Advantages

  • Specialization improves quality
  • Parallel execution possible
  • Each agent has focused context
  • Scales to complex workflows

Limitations

  • High implementation complexity
  • Coordination overhead
  • Debugging across agents is hard
  • Expensive (multiple LLM calls)
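The routing loop at the heart of this pattern can be sketched in miniature. Everything here is a stand-in: each "agent" is a plain function rather than an LLM, and the keyword router is a toy version of what would be an orchestrator model deciding who handles each sub-task.

```python
# Minimal orchestrator skeleton: route each sub-task to a specialist agent
# and let agents share results through a common memory dict.

def run_team(subtasks, agents, route):
    shared_memory = {}                           # knowledge store
    for task in subtasks:
        name = route(task)                       # ORCHESTRATOR picks an agent
        shared_memory[task] = agents[name](task, shared_memory)
    return shared_memory

# Toy specialists: each gets the task plus read access to shared memory.
agents = {
    "researcher": lambda task, mem: f"notes on {task}",
    "coder":      lambda task, mem: f"code for {task}",
    "reviewer":   lambda task, mem: f"review of {len(mem)} earlier results",
}

# Toy keyword router; a real orchestrator would be an LLM call.
def route(task):
    if "research" in task:
        return "researcher"
    if "implement" in task:
        return "coder"
    return "reviewer"

memory = run_team(["research topic", "implement parser", "check work"], agents, route)
```

The reviewer seeing "2 earlier results" illustrates the shared-memory idea: later agents build on what earlier agents produced rather than starting from scratch.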

Tool Use / Function Calling (Essential)

The model selects and calls structured functions based on the user's request. Unlike ReAct, this can be a single-turn pattern: the model decides which tools to call (possibly in parallel), calls them, and synthesizes the results. Most modern APIs support this natively.

+--------------+       +---------------------------+
|  User Query  |       | Available Tools:          |
+------+-------+       |  - search(query)          |
       |               |  - get_weather(city)      |
       v               |  - calculate(expression)  |
+------------------+   |  - send_email(to, body)   |
|   LLM decides    |<--+---------------------------+
|  which tool(s)   |
+---+---------+----+
    |         |
    v         v      (parallel tool calls)
+--------+ +----------+
| Tool A | |  Tool B  |
+---+----+ +--+-------+
    |         |
    v         v
+------------------+
| LLM synthesizes  |
|  tool results    |
+--------+---------+
         |
         v
+------------------+
|   Final Answer   |
+------------------+
When to use: API integration, data fetching, calculations, any task where the model needs to interact with external systems. This is a building block inside other patterns (ReAct uses tool use).
Key components:

  • Tool Definitions - JSON schemas
  • Tool Router - Parses LLM output
  • Execution Engine - Runs tools
  • Result Formatter - Returns to LLM

Advantages

  • Native API support (OpenAI, Claude)
  • Structured, typed inputs/outputs
  • Parallel execution support
  • Simple to add new tools

Limitations

  • Model may call wrong tool
  • Tool descriptions consume tokens
  • No built-in retry/recovery
  • Security risks (tool access control)
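The router-plus-execution-engine half of this pattern is easy to show concretely. In this sketch the tool registry loosely mimics the JSON-schema style that LLM APIs use for tool definitions, and `model_output` is a hard-coded stand-in for the structured tool call a real model would emit; the schema layout and names here are illustrative, not any vendor's exact format.

```python
import json

# Minimal function-calling skeleton: a tool registry with descriptions and
# parameter schemas, plus a dispatcher that parses the model's structured
# output and runs the chosen tool.

TOOLS = {
    "calculate": {
        "description": "Evaluate a basic arithmetic expression",
        "parameters": {"expression": {"type": "string"}},
        "fn": lambda expression: str(eval(expression, {"__builtins__": {}})),
    },
    "get_weather": {
        "description": "Look up weather for a city",
        "parameters": {"city": {"type": "string"}},
        "fn": lambda city: f"Sunny in {city}",
    },
}

def dispatch(tool_call_json):
    """Parse the model's tool call and run it."""
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]                   # tool router
    return tool["fn"](**call["arguments"])       # execution engine

# Stand-in for the structured output a real model would return.
model_output = '{"name": "calculate", "arguments": {"expression": "6 * 7"}}'
result = dispatch(model_output)
```

In production the dispatcher is also where access control belongs: validating arguments against the schema and refusing tools the current user should not reach addresses the security limitation noted above.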

Memory Systems (Cross-cutting)

Memory is not an agent pattern itself but a critical capability that enhances every other pattern. Without memory, agents cannot learn from past interactions, maintain context across sessions, or build up knowledge over time. Memory systems typically layer three types: a short-term conversation buffer, working memory for the current task, and long-term persistent knowledge.

+---------------------------------------------------------------+
|                      MEMORY ARCHITECTURE                      |
+---------------------------------------------------------------+
|                                                               |
|  SHORT-TERM (Conversation Buffer)                             |
|  +----------------------------------------------------------+ |
|  | User: "Find me restaurants"                              | |
|  | Agent: "What cuisine? What area?"                        | |
|  | User: "Italian, downtown"                                | |
|  | [Last N messages, sliding window, always in context]     | |
|  +----------------------------------------------------------+ |
|                                                               |
|  WORKING MEMORY (Current Task State)                          |
|  +----------------------------------------------------------+ |
|  | Current goal: Find Italian restaurants downtown          | |
|  | Steps completed: [searched Yelp, filtered by rating]     | |
|  | Intermediate results: [3 candidates found]               | |
|  | [Persists for duration of task, then archived]           | |
|  +----------------------------------------------------------+ |
|                                                               |
|  LONG-TERM (Persistent Knowledge)                             |
|  +----------------------------------------------------------+ |
|  | User preferences: prefers quiet restaurants, no seafood  | |
|  | Past interactions: visited Luigi's (liked it)            | |
|  | Learned patterns: user usually books for 2 people        | |
|  | [Vector DB or structured store, retrieved by relevance]  | |
|  +----------------------------------------------------------+ |
|                                                               |
+---------------------------------------------------------------+
When to use: Any agent that interacts with users over multiple sessions, personal assistants, customer support bots, research agents that build knowledge over time, and agents that need to learn from mistakes.
Key components:

  • Buffer - Recent messages
  • Vector Store - Semantic search
  • Entity Store - Structured facts
  • Summarizer - Compresses history

Advantages

  • Enables personalization
  • Agents learn and improve
  • Maintains context across sessions
  • Reduces repeated work

Limitations

  • Storage and retrieval costs
  • Stale or conflicting memories
  • Privacy and data retention concerns
  • Memory retrieval can be noisy
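The three layers can be sketched as one small class. This is a toy under stated assumptions: the class and method names are invented for illustration, and `recall` ranks facts by naive word overlap as a stand-in for the vector-similarity search a real long-term store would use.

```python
from collections import deque

# Minimal three-layer memory sketch: a sliding short-term buffer, a
# working-memory dict for the current task, and a long-term fact list
# queried by keyword overlap (stand-in for semantic search).

class AgentMemory:
    def __init__(self, buffer_size=4):
        self.short_term = deque(maxlen=buffer_size)  # last N messages
        self.working = {}                            # current task state
        self.long_term = []                          # persistent facts

    def add_message(self, role, text):
        self.short_term.append((role, text))

    def remember(self, fact):
        self.long_term.append(fact)

    def recall(self, query, k=2):
        """Return the k stored facts sharing the most words with the query."""
        words = set(query.lower().split())
        ranked = sorted(self.long_term,
                        key=lambda f: len(words & set(f.lower().split())),
                        reverse=True)
        return ranked[:k]

mem = AgentMemory(buffer_size=2)
mem.add_message("user", "Find me restaurants")
mem.add_message("agent", "What cuisine?")
mem.add_message("user", "Italian, downtown")    # oldest message slides out
mem.remember("prefers quiet restaurants")
mem.remember("usually books for 2 people")
hits = mem.recall("quiet restaurants downtown", k=1)
```

The `maxlen` deque gives the sliding-window behavior of the short-term buffer for free: appending a third message silently evicts the first, which is exactly how conversation buffers bound context size.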
Pattern            Complexity   Best For                      Typical Steps   Cost
ReAct              Low          General-purpose tasks         1-5             Low-Medium
Plan-and-Execute   Medium       Complex, multi-step tasks     5-20            Medium
Multi-Agent        High         Diverse expertise needed      10-50+          High
Tool Use           Low          API integration               1-3             Low
Memory             Medium       Persistent, learning agents   N/A (add-on)    Low-Medium


Master AI Agent Architecture

Get in-depth agent tutorials, architecture deep-dives, and production deployment guides delivered weekly. Join 25,000+ AI practitioners building the future.

Subscribe to the Newsletter