DiamantAI Blog

AI Agent Architecture Cheatsheet

Visual reference guide to the five major agent patterns. Understand when to use each and how they work under the hood.

Not every AI task needs an agent, and not every agent needs the same architecture. Choosing the right pattern determines whether your system is reliable and efficient or slow and unpredictable. This cheatsheet covers the five foundational agent architectures, with text diagrams showing the control flow, key components, and honest trade-offs for each. Use this as a quick reference when designing your next AI system.

ReAct Loop (Foundational)

The ReAct (Reasoning + Acting) pattern interleaves thinking and acting. The LLM reasons about what to do, takes an action (calls a tool), observes the result, then reasons again. This cycle repeats until the task is complete.

+--------------+
|  User Query  |
+------+-------+
       |
       v
+-----------------------------+
|        LLM REASONING        |<----+
|  "I need to search for..."  |     |
+--------------+--------------+     |
               |                    |
               v                    |
+-----------------------------+     |
|     ACTION (Tool Call)      |     |
|  search("query"), calc(42)  |     |
+--------------+--------------+     |
               |                    |
               v                    |
+-----------------------------+     |
|         OBSERVATION         |     |
|   Tool returns result...    |     |
+--------------+--------------+     |
               |                    |
            Done? ------ NO --------+
               |
              YES
               |
               v
+-----------------------------+
|        Final Answer         |
+-----------------------------+
When to use: General-purpose tasks that require 1-5 tool calls, research tasks, question-answering with tool access, any task where the number of steps is not known in advance.
Key components:

  • LLM - Reasoning engine
  • Tools - APIs, search, code
  • Scratchpad - Running history
  • Stop Condition - Max iterations

Advantages

  • Simple to implement
  • Flexible -- handles diverse tasks
  • Transparent reasoning trace
  • Recovers from tool errors naturally

Limitations

  • Can get stuck in loops
  • No upfront planning
  • Greedy -- may miss better strategies
  • Cost grows with iterations
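The reason-act-observe cycle described above fits in a few lines of Python. This is a minimal sketch: `demo_llm` is a hypothetical stand-in for a real model call, and the names (`react_loop`, the `("thought", ...)` scratchpad tuples) are illustrative, not from any library.

```python
# Minimal ReAct loop skeleton. `llm_step` is a stand-in for a real LLM call:
# given the scratchpad, it returns a thought plus either a tool invocation
# or a "finish" action carrying the final answer.

def react_loop(query, tools, llm_step, max_iters=5):
    scratchpad = [("question", query)]           # running history
    for _ in range(max_iters):                   # stop condition
        thought, action, arg = llm_step(scratchpad)
        scratchpad.append(("thought", thought))
        if action == "finish":                   # model decided it is done
            return arg
        observation = tools[action](arg)         # ACTION -> OBSERVATION
        scratchpad.append(("observation", observation))
    return "Stopped: max iterations reached"

# Toy demo: one search step, then finish.
def demo_llm(scratchpad):
    if not any(kind == "observation" for kind, _ in scratchpad):
        return "I should search first", "search", "agent patterns"
    return "I have enough information", "finish", "ReAct interleaves reasoning and acting"

tools = {"search": lambda q: f"results for {q!r}"}
answer = react_loop("What is ReAct?", tools, demo_llm)
```

Note how the max-iteration guard implements the stop condition from the component list: it is what keeps a looping model from running forever.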

Plan-and-Execute (Structured)

Separates planning from execution. A planner LLM creates a high-level plan of steps, then an executor LLM (often cheaper) carries out each step. After each step, the planner can revise the remaining plan based on results.

+--------------+
|  User Query  |
+------+-------+
       |
       v
+------------------+      +---------------------------+
|     PLANNER      |      | Plan:                     |
|   (Strong LLM)   +----->|  1. Search for X          |
+--------+---------+      |  2. Extract data from Y   |
         |                |  3. Compare results       |
         |                |  4. Generate report       |
         |                +---------------------------+
         v
+------------------+
|     EXECUTOR     |      Step 1 --> Result 1
|  (Can be cheaper |      Step 2 --> Result 2
|       LLM)       |      Step 3 --> Result 3
+--------+---------+
         |
         |   (After each step, the planner can REPLAN)
         v
+------------------+
|    REPLANNER     |      "Step 3 failed, adjusting..."
|  (Revise plan)   |      New Step 3 --> ...
+--------+---------+
         |
         v
+------------------+
|   Final Output   |
+------------------+
When to use: Complex tasks with 5+ steps, research reports, data pipelines, any workflow where you want oversight and can define success criteria per step.
Key components:

  • Planner - Creates step list
  • Executor - Runs each step
  • Replanner - Adapts on failure
  • State Tracker - Step results

Advantages

  • Better for complex, multi-step tasks
  • Executor can use cheaper model
  • Plan provides a progress indicator
  • Replanning handles failures gracefully

Limitations

  • More complex to implement
  • Initial plan may be wrong
  • Higher latency (plan + execute)
  • Replanning adds cost
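The plan, execute, replan control flow can be sketched directly. This is a hedged sketch: `demo_plan`, `demo_execute`, and `demo_replan` are hypothetical stand-ins for the LLM calls a real system would make, kept as plain functions so the loop itself is visible.

```python
# Minimal plan-and-execute skeleton. The planner produces a step list, the
# executor runs steps one at a time, and a failed step (None) triggers the
# replanner, which revises the remaining plan.

def plan_and_execute(query, plan, execute_step, replan):
    steps = plan(query)                          # PLANNER: high-level step list
    results = []                                 # STATE TRACKER
    while steps:
        step = steps.pop(0)
        outcome = execute_step(step, results)    # EXECUTOR
        if outcome is None:                      # step failed
            steps = replan(step, steps)          # REPLANNER revises the rest
        else:
            results.append((step, outcome))
    return results

# Toy demo: the second step fails once and gets replaced.
def demo_plan(query):
    return ["search", "broken step", "report"]

def demo_execute(step, results):
    return None if step == "broken step" else f"done: {step}"

def demo_replan(failed, remaining):
    return ["fixed step"] + remaining

results = plan_and_execute("write a report", demo_plan, demo_execute, demo_replan)
```

A production version would also cap the number of replans, since a replanner that keeps emitting failing steps would otherwise loop indefinitely.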

Multi-Agent Orchestration (Advanced)

Multiple specialized agents collaborate on a task. An orchestrator routes sub-tasks to the right agent. Each agent has its own system prompt, tools, and expertise. Agents may communicate with each other directly or through shared memory.

                   +--------------+
                   |  User Query  |
                   +------+-------+
                          |
                          v
+--------------------------------------------------+
|                   ORCHESTRATOR                   |
|              Routes tasks to agents              |
+-------+-----------------+----------------+-------+
        |                 |                |
        v                 v                v
+----------------+  +------------+  +--------------+
|   RESEARCHER   |  |   CODER    |  |   REVIEWER   |
|  - web search  |  |  - execute |  |  - analyze   |
|  - read docs   |  |  - debug   |  |  - validate  |
|  - summarize   |  |  - test    |  |  - critique  |
+-------+--------+  +-----+------+  +------+-------+
        |                 |                |
        +--------+--------+-----------+----+
                 |                    |
                 v                    v
         +----------------+  +------------------+
         |  Shared Memory |  |   Final Output   |
         +----------------+  +------------------+
When to use: Tasks requiring diverse expertise (research + code + analysis), software development workflows, content production pipelines, any scenario where specialization outperforms generalization.
Key components:

  • Orchestrator - Task routing
  • Specialist Agents - Domain experts
  • Shared Memory - Knowledge store
  • Message Bus - Inter-agent comms

Advantages

  • Specialization improves quality
  • Parallel execution possible
  • Each agent has focused context
  • Scales to complex workflows

Limitations

  • High implementation complexity
  • Coordination overhead
  • Debugging across agents is hard
  • Expensive (multiple LLM calls)
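The routing loop at the heart of this pattern can be sketched in miniature. Everything here is a stand-in: each "agent" is a plain function rather than an LLM, and the keyword router is a toy version of what would be an orchestrator model deciding who handles each sub-task.

```python
# Minimal orchestrator skeleton: route each sub-task to a specialist agent
# and let agents share results through a common memory dict.

def run_team(subtasks, agents, route):
    shared_memory = {}                           # knowledge store
    for task in subtasks:
        name = route(task)                       # ORCHESTRATOR picks an agent
        shared_memory[task] = agents[name](task, shared_memory)
    return shared_memory

# Toy specialists: each gets the task plus read access to shared memory.
agents = {
    "researcher": lambda task, mem: f"notes on {task}",
    "coder":      lambda task, mem: f"code for {task}",
    "reviewer":   lambda task, mem: f"review of {len(mem)} earlier results",
}

# Toy keyword router; a real orchestrator would be an LLM call.
def route(task):
    if "research" in task:
        return "researcher"
    if "implement" in task:
        return "coder"
    return "reviewer"

memory = run_team(["research topic", "implement parser", "check work"], agents, route)
```

The reviewer seeing "2 earlier results" illustrates the shared-memory idea: later agents build on what earlier agents produced rather than starting from scratch.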

Tool Use / Function Calling (Essential)

The model selects and calls structured functions based on the user's request. Unlike ReAct, this can be a single-turn pattern: the model decides which tools to call (possibly in parallel), calls them, and synthesizes the results. Most modern APIs support this natively.

+--------------+       +---------------------------+
|  User Query  |       | Available Tools:          |
+------+-------+       |  - search(query)          |
       |               |  - get_weather(city)      |
       v               |  - calculate(expression)  |
+------------------+   |  - send_email(to, body)   |
|   LLM decides    |<--+---------------------------+
|  which tool(s)   |
+---+---------+----+
    |         |
    v         v      (parallel tool calls)
+--------+ +----------+
| Tool A | |  Tool B  |
+---+----+ +--+-------+
    |         |
    v         v
+------------------+
| LLM synthesizes  |
|  tool results    |
+--------+---------+
         |
         v
+------------------+
|   Final Answer   |
+------------------+
When to use: API integration, data fetching, calculations, any task where the model needs to interact with external systems. This is a building block inside other patterns (ReAct uses tool use).
Key components:

  • Tool Definitions - JSON schemas
  • Tool Router - Parses LLM output
  • Execution Engine - Runs tools
  • Result Formatter - Returns to LLM

Advantages

  • Native API support (OpenAI, Claude)
  • Structured, typed inputs/outputs
  • Parallel execution support
  • Simple to add new tools

Limitations

  • Model may call wrong tool
  • Tool descriptions consume tokens
  • No built-in retry/recovery
  • Security risks (tool access control)
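The router-plus-execution-engine half of this pattern is easy to show concretely. In this sketch the tool registry loosely mimics the JSON-schema style that LLM APIs use for tool definitions, and `model_output` is a hard-coded stand-in for the structured tool call a real model would emit; the schema layout and names here are illustrative, not any vendor's exact format.

```python
import json

# Minimal function-calling skeleton: a tool registry with descriptions and
# parameter schemas, plus a dispatcher that parses the model's structured
# output and runs the chosen tool.

TOOLS = {
    "calculate": {
        "description": "Evaluate a basic arithmetic expression",
        "parameters": {"expression": {"type": "string"}},
        "fn": lambda expression: str(eval(expression, {"__builtins__": {}})),
    },
    "get_weather": {
        "description": "Look up weather for a city",
        "parameters": {"city": {"type": "string"}},
        "fn": lambda city: f"Sunny in {city}",
    },
}

def dispatch(tool_call_json):
    """Parse the model's tool call and run it."""
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]                   # tool router
    return tool["fn"](**call["arguments"])       # execution engine

# Stand-in for the structured output a real model would return.
model_output = '{"name": "calculate", "arguments": {"expression": "6 * 7"}}'
result = dispatch(model_output)
```

In production the dispatcher is also where access control belongs: validating arguments against the schema and refusing tools the current user should not reach addresses the security limitation noted above.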

Memory Systems (Cross-cutting)

Memory is not an agent pattern itself but a critical capability that enhances every other pattern. Without memory, agents cannot learn from past interactions, maintain context across sessions, or build up knowledge over time. Memory systems typically layer three types: a short-term conversation buffer, working memory for the current task, and long-term persistent knowledge.

+---------------------------------------------------------------+
|                      MEMORY ARCHITECTURE                      |
+---------------------------------------------------------------+
|                                                               |
|  SHORT-TERM (Conversation Buffer)                             |
|  +----------------------------------------------------------+ |
|  | User: "Find me restaurants"                              | |
|  | Agent: "What cuisine? What area?"                        | |
|  | User: "Italian, downtown"                                | |
|  | [Last N messages, sliding window, always in context]     | |
|  +----------------------------------------------------------+ |
|                                                               |
|  WORKING MEMORY (Current Task State)                          |
|  +----------------------------------------------------------+ |
|  | Current goal: Find Italian restaurants downtown          | |
|  | Steps completed: [searched Yelp, filtered by rating]     | |
|  | Intermediate results: [3 candidates found]               | |
|  | [Persists for duration of task, then archived]           | |
|  +----------------------------------------------------------+ |
|                                                               |
|  LONG-TERM (Persistent Knowledge)                             |
|  +----------------------------------------------------------+ |
|  | User preferences: prefers quiet restaurants, no seafood  | |
|  | Past interactions: visited Luigi's (liked it)            | |
|  | Learned patterns: user usually books for 2 people        | |
|  | [Vector DB or structured store, retrieved by relevance]  | |
|  +----------------------------------------------------------+ |
|                                                               |
+---------------------------------------------------------------+
When to use: Any agent that interacts with users over multiple sessions, personal assistants, customer support bots, research agents that build knowledge over time, and agents that need to learn from mistakes.
Key components:

  • Buffer - Recent messages
  • Vector Store - Semantic search
  • Entity Store - Structured facts
  • Summarizer - Compresses history

Advantages

  • Enables personalization
  • Agents learn and improve
  • Maintains context across sessions
  • Reduces repeated work

Limitations

  • Storage and retrieval costs
  • Stale or conflicting memories
  • Privacy and data retention concerns
  • Memory retrieval can be noisy
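The three layers can be sketched as one small class. This is a toy under stated assumptions: the class and method names are invented for illustration, and `recall` ranks facts by naive word overlap as a stand-in for the vector-similarity search a real long-term store would use.

```python
from collections import deque

# Minimal three-layer memory sketch: a sliding short-term buffer, a
# working-memory dict for the current task, and a long-term fact list
# queried by keyword overlap (stand-in for semantic search).

class AgentMemory:
    def __init__(self, buffer_size=4):
        self.short_term = deque(maxlen=buffer_size)  # last N messages
        self.working = {}                            # current task state
        self.long_term = []                          # persistent facts

    def add_message(self, role, text):
        self.short_term.append((role, text))

    def remember(self, fact):
        self.long_term.append(fact)

    def recall(self, query, k=2):
        """Return the k stored facts sharing the most words with the query."""
        words = set(query.lower().split())
        ranked = sorted(self.long_term,
                        key=lambda f: len(words & set(f.lower().split())),
                        reverse=True)
        return ranked[:k]

mem = AgentMemory(buffer_size=2)
mem.add_message("user", "Find me restaurants")
mem.add_message("agent", "What cuisine?")
mem.add_message("user", "Italian, downtown")    # oldest message slides out
mem.remember("prefers quiet restaurants")
mem.remember("usually books for 2 people")
hits = mem.recall("quiet restaurants downtown", k=1)
```

The `maxlen` deque gives the sliding-window behavior of the short-term buffer for free: appending a third message silently evicts the first, which is exactly how conversation buffers bound context size.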
Pattern            Complexity   Best For                      Typical Steps   Cost
ReAct              Low          General-purpose tasks         1-5             Low-Medium
Plan-and-Execute   Medium       Complex, multi-step tasks     5-20            Medium
Multi-Agent        High         Diverse expertise needed      10-50+          High
Tool Use           Low          API integration               1-3             Low
Memory             Medium       Persistent, learning agents   N/A (add-on)    Low-Medium


Master AI Agent Architecture

Get in-depth agent tutorials, architecture deep-dives, and production deployment guides delivered weekly. Join 25,000+ AI practitioners building the future.

Subscribe to the Newsletter