The Open-Source AI Agent Stack
The open-source AI agent stack usually includes a model, an agent runtime, a tool or plugin layer, reusable skills, memory, and evaluation tools. If you only want a ranked list of frameworks, start with the agents listing. If you are building a real workflow, use the stack map below.
Pick a starting stack by the workflow you need to ship.
These are practical starting points for readers who need a shortlist before reading the full stack map.
Best open-source coding agent stack
Start with OpenHands or SWE-agent, add repository skills, then evaluate changes with promptfoo or Langfuse traces.
Browse coding agents
Best browser automation stack
Start with browser-use or OpenClaw, keep human approval on risky actions, and test one website workflow before expanding scope.
Browse browser agents
Best local-first agent stack
Start with an open or local model, choose a local runtime, then add only the tools and memory layer your workflow actually needs.
Browse open models
Best MCP and tool integration stack
Start with MCP SDKs or FastMCP, define permissions clearly, then add observability before connecting production systems.
Browse MCP tools
Best evaluation-heavy stack
Start with promptfoo, Ragas, or Langfuse, create task fixtures, and compare failures before changing models or frameworks.
Browse eval tools
An agent stack is more than a model.
The model is only one layer. A useful agent system also needs a runtime that decides what to do, tools it can call, instructions it can reuse, memory it can control, and evaluation loops that catch failures before a workflow becomes expensive or risky.
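The layers described above can be sketched as a toy loop. This is a minimal illustration, not any framework's API: the `Agent` class, the `scripted_policy` function (a stand-in for a model call), and the `lookup` tool are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical tool signature: takes a string argument, returns a string result.
Tool = Callable[[str], str]


@dataclass
class Agent:
    """Toy runtime: a policy picks the next action, tools run it, memory records it."""
    policy: Callable[[str, list[str]], tuple[str, str]]  # stand-in for a model call
    tools: dict[str, Tool]
    memory: list[str] = field(default_factory=list)
    max_steps: int = 5  # hard stop so a confused agent cannot loop forever

    def run(self, task: str) -> str:
        for _ in range(self.max_steps):
            action, arg = self.policy(task, self.memory)
            if action == "finish":
                return arg
            if action not in self.tools:
                # Record the failure so the policy can see it on the next step.
                self.memory.append(f"error: unknown tool {action!r}")
                continue
            result = self.tools[action](arg)
            self.memory.append(f"{action}({arg!r}) -> {result}")
        return "stopped: step limit reached"


def scripted_policy(task: str, memory: list[str]) -> tuple[str, str]:
    """Stand-in for a model: look something up once, then finish with what was found."""
    if not memory:
        return ("lookup", task)
    return ("finish", memory[-1])


agent = Agent(policy=scripted_policy, tools={"lookup": lambda q: f"notes about {q}"})
answer = agent.run("release checklist")
```

The step limit is the simplest possible guard against expensive runaway loops. A real runtime would add approvals, tracing, and recovery on top of it.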
Start with the job, then choose the layer.
A common mistake is choosing a popular framework before identifying the workflow surface the agent has to act on.
| What you need | Start with | Example projects |
|---|---|---|
| Automate websites or back-office workflows | Browser and action agents | OpenClaw, browser-use, OpenHands |
| Edit repositories and resolve code tasks | Coding agents | OpenHands, SWE-agent, Aider, Cline |
| Build multi-agent workflows | Agent frameworks | AutoGen, CrewAI, LangGraph, OpenAI Agents SDK |
| Run privately or locally | Models plus local runtimes | Qwen, DeepSeek, Gemma, GPT4All |
| Persist user or project context | Memory systems | mem0, Letta, Graphiti, Cognee |
| Connect files, APIs, tools, and services | Plugins and MCP | MCP SDKs, FastMCP, MCP Inspector |
| Measure reliability before adoption | Evaluation tools | promptfoo, Ragas, Langfuse |
The six layers to compare before adopting an agent stack.
Each layer answers a different adoption question: what powers the agent, what controls it, what it can do, what it remembers, what it connects to, and how you know it works.
Models
Choose the model family that matches your agent workload: tool calling, coding, reasoning, multimodal work, local inference, or hosted speed.
- Open weights or hosted API
- Context length and latency
- Tool-calling and coding behavior
- License and deployment constraints
Qwen3.6
Qwen's open model line focused on stronger coding, agentic tasks, and real-world stability.
DeepSeek-R1
Open reasoning model family for developers testing long-form reasoning, coding, and local AI workflows.
GPT4All
Run large language models locally on consumer hardware with a desktop application and Python library.
Agent frameworks and runtimes
Pick the runtime that controls how work gets planned, executed, supervised, and recovered when something goes wrong.
- Action surface: browser, code, CLI, workflow
- Human approval model
- Logs, replay, and sandbox boundaries
- MCP, API, and local runtime support
OpenHands
Open-source AI software development agent for coding tasks, repositories, and developer workflows.
AutoGen
Multi-agent AI framework from Microsoft Research for building conversational agent systems with AgentChat, Core API, and Extensions.
CrewAI
Multi-agent orchestration framework where role-playing autonomous AI agents collaborate to execute complex workflows.
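The human approval and logging criteria listed for runtimes can be made concrete with a small gate that sits in front of every action. This is a sketch under assumptions: `ApprovalGate`, the action names, and the `approve` reviewer hook are hypothetical and do not correspond to any specific framework's interface.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ApprovalGate:
    """Routes risky actions through a reviewer and logs every decision."""
    risky_actions: set[str]
    approve: Callable[[str, str], bool]  # hypothetical reviewer hook: (action, target) -> bool
    log: list[dict] = field(default_factory=list)

    def execute(self, action: str, target: str, run: Callable[[], str]) -> str:
        needs_review = action in self.risky_actions
        approved = (not needs_review) or self.approve(action, target)
        # Log before running so blocked attempts are visible during replay.
        self.log.append(
            {"action": action, "target": target, "reviewed": needs_review, "approved": approved}
        )
        if not approved:
            return "blocked"
        return run()


# A reviewer that rejects everything, to show the blocked path.
gate = ApprovalGate(
    risky_actions={"delete_file", "submit_form"},
    approve=lambda action, target: False,
)
read_result = gate.execute("read_file", "README.md", lambda: "contents")
delete_result = gate.execute("delete_file", "README.md", lambda: "deleted")
```

When comparing runtimes, it helps to ask where this kind of gate lives: inside the framework, in a sandbox layer, or in code you have to write yourself.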
Skills
Use skills to package repeatable procedures, team rules, evaluation habits, and task-specific agent behavior.
- Clear trigger conditions
- Inputs and outputs the agent can inspect
- Versionable instructions
- Evidence-gathering or verification steps
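One way to picture the four properties above is a small versioned record. The `Skill` class, its field names, and the example skill are assumptions made for illustration. Real skill formats differ between tools, and many are plain instruction files rather than code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Hypothetical skill record: trigger, inspectable I/O, versioned steps, verification."""
    name: str
    version: str
    trigger_keywords: tuple[str, ...]  # assumed to be lowercase
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]
    steps: tuple[str, ...]
    verification: tuple[str, ...]

    def matches(self, task: str) -> bool:
        """Return True when any trigger keyword appears in the task text."""
        text = task.lower()
        return any(keyword in text for keyword in self.trigger_keywords)

    def render(self) -> str:
        """Produce the instruction text an agent would receive."""
        lines = [f"# {self.name} (v{self.version})"]
        lines += [f"{i}. {step}" for i, step in enumerate(self.steps, start=1)]
        lines += [f"verify: {check}" for check in self.verification]
        return "\n".join(lines)


flaky_test_skill = Skill(
    name="fix-flaky-test",
    version="1.0.0",
    trigger_keywords=("flaky test", "intermittent failure"),
    inputs=("test name", "recent CI logs"),
    outputs=("patch", "evidence summary"),
    steps=("Reproduce the failure locally", "Identify the source of nondeterminism", "Patch and rerun"),
    verification=("Run the test 20 times and report the pass count",),
)

should_use = flaky_test_skill.matches("Please fix the flaky test in CI")
instructions = flaky_test_skill.render()
```

Keeping the record immutable and versioned means a change to the procedure shows up in review like any other code change.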
Memory systems
Add memory when agents need durable user facts, project context, workflow state, graph recall, or private knowledge retrieval.
- What gets stored
- How recall is explained
- Deletion and export controls
- Self-hosted or managed deployment
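The storage, recall, deletion, and export questions above can be tried out against a toy in-memory store. This sketch is not how mem0, Letta, Graphiti, or Cognee work internally. `MemoryStore`, `MemoryRecord`, and the substring-based recall are hypothetical simplifications.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MemoryRecord:
    key: str
    value: str
    source: str  # where the fact came from, so recall can be explained


@dataclass
class MemoryStore:
    """Toy store exposing the controls worth checking in a real memory system."""
    records: dict[str, MemoryRecord] = field(default_factory=dict)

    def remember(self, key: str, value: str, source: str) -> None:
        self.records[key] = MemoryRecord(key, value, source)

    def recall(self, query: str) -> list[MemoryRecord]:
        # Naive substring match; real systems typically use embeddings or graphs.
        q = query.lower()
        return [r for r in self.records.values() if q in r.key.lower() or q in r.value.lower()]

    def forget(self, key: str) -> bool:
        """Delete one record; returns False if nothing was stored under the key."""
        return self.records.pop(key, None) is not None

    def export(self) -> str:
        """Dump everything as JSON so a user can inspect or migrate it."""
        return json.dumps([asdict(r) for r in self.records.values()], indent=2)


store = MemoryStore()
store.remember("preferred_language", "Python", source="user message")
store.remember("deploy_target", "staging cluster", source="project README")

hits = store.recall("python")
removed = store.forget("deploy_target")
exported = json.loads(store.export())
```

Each record carries its `source`, which is the minimum needed to answer "why does the agent believe this?" when a recalled fact turns out to be wrong.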
Plugins and MCP
Connect agents to external systems through protocol-based tools, MCP servers, SDKs, and platform-specific connectors.
- Protocol portability
- Authentication and permission model
- Available client/server SDKs
- Failure handling and observability
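The permission and failure-handling criteria above can be illustrated with a plain-Python registry. This is deliberately not the MCP wire protocol or any MCP SDK API. `ToolRegistry`, `ToolSpec`, and scope strings such as `files:read` are hypothetical names used only to show the checks a tool layer should make before and after a call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class ToolSpec:
    name: str
    required_scope: str  # hypothetical scope string such as "files:read"
    handler: Callable[[dict], str]


@dataclass
class ToolRegistry:
    granted_scopes: set[str]
    tools: dict[str, ToolSpec] = field(default_factory=dict)
    events: list[str] = field(default_factory=list)  # minimal observability trail

    def register(self, spec: ToolSpec) -> None:
        self.tools[spec.name] = spec

    def call(self, name: str, arguments: dict) -> dict:
        spec = self.tools.get(name)
        if spec is None:
            return self._record_error(name, "unknown tool")
        if spec.required_scope not in self.granted_scopes:
            return self._record_error(name, "permission denied")
        try:
            result = spec.handler(arguments)
        except Exception as exc:  # report tool failures instead of crashing the agent
            return self._record_error(name, f"tool failed: {exc!r}")
        self.events.append(f"ok {name}")
        return {"ok": True, "result": result}

    def _record_error(self, name: str, error: str) -> dict:
        self.events.append(f"error {name}: {error}")
        return {"ok": False, "error": error}


registry = ToolRegistry(granted_scopes={"files:read"})
registry.register(ToolSpec("read_note", "files:read", lambda args: f"note {args['id']}"))
registry.register(ToolSpec("send_email", "email:send", lambda args: "sent"))

allowed = registry.call("read_note", {"id": 7})
denied = registry.call("send_email", {"to": "ops"})
failed = registry.call("read_note", {})  # missing "id" raises KeyError inside the handler
```

Every outcome, including denials and handler errors, lands in `events`, which is the kind of trail to look for before connecting an agent to production systems.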
Evaluation and operations tools
Test, trace, benchmark, and operate agent systems before giving them wider access to production workflows.
- Prompt and agent eval support
- Trace and session inspection
- Regression testing
- Dataset and benchmark workflows
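Regression testing for agents can start very small. The sketch below assumes an agent is just a function from prompt to output. `Fixture`, `run_suite`, `regressions`, and both toy agents are hypothetical and do not reflect how promptfoo, Ragas, or Langfuse are configured.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Fixture:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_suite(agent: Callable[[str], str], fixtures: list[Fixture]) -> dict[str, bool]:
    """Run every fixture; an exception in the agent or the check counts as a failure."""
    results: dict[str, bool] = {}
    for fixture in fixtures:
        try:
            results[fixture.name] = fixture.check(agent(fixture.prompt))
        except Exception:
            results[fixture.name] = False
    return results


def regressions(baseline: dict[str, bool], candidate: dict[str, bool]) -> list[str]:
    """Fixtures that passed on the baseline but fail on the candidate."""
    return sorted(name for name, ok in baseline.items() if ok and not candidate.get(name, False))


def baseline_agent(prompt: str) -> str:
    if prompt.startswith("upper:"):
        return prompt[len("upper:"):].upper()
    left, right = prompt.split("+")
    return str(int(left) + int(right))


def candidate_agent(prompt: str) -> str:
    if prompt.startswith("upper:"):
        return prompt[len("upper:"):].upper()
    return "I cannot do arithmetic"


fixtures = [
    Fixture("adds_numbers", "2+3", lambda out: out.strip() == "5"),
    Fixture("uppercases_text", "upper:abc", lambda out: out == "ABC"),
]

baseline_results = run_suite(baseline_agent, fixtures)
candidate_results = run_suite(candidate_agent, fixtures)
broken = regressions(baseline_results, candidate_results)
```

Comparing by fixture name rather than by overall pass rate shows exactly which task broke, which is more useful than a single score when deciding whether to switch models or tools.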
Common open-source agent stacks by use case.
These are starting paths, not fixed recipes. The right stack depends on workflow risk, deployment constraints, and how much supervision the agent needs.
Coding agent stack
Start with a coding agent, add repository-specific skills, then evaluate with regression tasks before wider use.
Browser automation stack
Start with the web surface, choose an approval boundary, then test one narrow workflow with logs and rollback.
Local-first agent stack
Choose a local or open-weight model first, then add a local runtime, memory, and only the tools the agent truly needs.
Evaluation-heavy agent stack
Treat the agent like software: create task fixtures, trace every run, and compare failures before changing models or tools.
Go deeper with comparisons and evaluations.
A practical comparison of OpenHands (code-first autonomous agent), AutoGen (multi-agent conversations), and CrewAI (role-based team orchestration).
2026-06-02 MCP Tools and SDKs for Agent Builders
A focused guide to MCP Inspector, FastMCP, and the official Model Context Protocol Python and TypeScript SDKs.
2026-06-02 Agent Evaluation Stack: promptfoo, Ragas, and Langfuse
How to combine test suites, RAG quality checks, and production traces when evaluating AI agents.
2026-04-19 Best Open-Source Browser Agents for Workflow Automation
A practical guide to OpenClaw, browser-use, OpenHands, and Goose for builders who want agents that can move from chat to real actions.
Open-source AI agent stack FAQ.
What is an open-source AI agent stack?
An open-source AI agent stack is the set of components used to build an agent system: models, agent runtimes, tools, skills, memory, evaluation, and deployment surfaces.
What is the difference between an AI model and an AI agent?
A model predicts text, code, or actions from context. An agent wraps a model with planning, tool use, state, approvals, and execution logic.
Do AI agents need memory?
Agents need memory when they must preserve user preferences, project facts, task history, or knowledge across sessions. Simple one-shot workflows may not need a dedicated memory system.
Is MCP part of the AI agent stack?
MCP is best understood as an integration layer. It gives agents a more portable way to discover and use external tools, data sources, and services.
Which open-source agent framework should I start with?
Start from the workflow surface. Coding tasks point toward coding agents, browser tasks toward browser agents, and complex multi-step systems toward orchestration frameworks.