Quick Answer
AI agents use an LLM as a planner, tools (functions, APIs) to act, and memory for state. Use LangGraph or CrewAI for orchestration; keep tool sets small; always add human checkpoints for irreversible actions.
- Agents work best for multi-step tasks with clear success criteria
- Tool explosion (20+ tools) degrades performance — curate carefully
- Always sandbox shell/file tools in production
What You'll Need
- OpenAI-compatible model with function calling (e.g., GPT or Claude)
- Orchestration framework: LangGraph, CrewAI, or custom loop
- Tools defined with JSON schema
- Observability: Langfuse or LangSmith
Steps
- Define the goal. Clear success criteria, not open-ended "help the user".
- Curate tools. Start with 3-5. Each needs a clear name, description, schema.
const tools = [
  // Illustrative JSON Schema parameters — adapt to your own tools
  { name: 'search_docs', description: 'Search internal docs by query',
    parameters: { type: 'object', properties: { query: { type: 'string' } }, required: ['query'] } },
  { name: 'send_email', description: 'Send an email (requires human approval)',
    parameters: { type: 'object', properties: { to: { type: 'string' }, subject: { type: 'string' }, body: { type: 'string' } }, required: ['to', 'subject', 'body'] } },
];
- Write the planner prompt. System: You are an agent. Think step by step. Call one tool at a time.
- Build the loop. LangGraph or a manual while-loop calling the LLM, parsing tool calls, executing, feeding results back.
- Add memory. Short-term: message history. Long-term: vector store of past interactions.
- Checkpoints. For email, payments, deletions — pause and ask user.
- Max iterations. Hard cap at 10-15 to prevent infinite loops.
- Observability. Log every step to Langfuse. Review failures weekly.
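The steps above can be sketched as a single loop. This is a minimal illustration, not a real SDK integration: `Model`, `callModel`-style stubs, `MAX_ITERATIONS`, and `APPROVAL_REQUIRED` are all assumptions here; a production loop would call an OpenAI-compatible chat endpoint and parse structured tool calls.

```typescript
type ToolCall = { name: string; arguments: Record<string, unknown> };
type ModelReply = { toolCall?: ToolCall; final?: string };

// Stand-in for the LLM call; a real one would send the message history
// plus tool schemas to a chat-completions endpoint.
type Model = (messages: string[]) => ModelReply;

const MAX_ITERATIONS = 10; // hard cap to prevent runaway loops
const APPROVAL_REQUIRED = new Set(['send_email']); // checkpoint irreversible actions

function runAgent(
  model: Model,
  tools: Record<string, (args: Record<string, unknown>) => string>,
  goal: string,
  approve: (call: ToolCall) => boolean, // human-in-the-loop callback
): string {
  const messages: string[] = [`goal: ${goal}`]; // short-term memory

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const reply = model(messages);
    if (reply.final !== undefined) return reply.final;

    const call = reply.toolCall;
    if (!call || !(call.name in tools)) {
      messages.push('error: unknown tool'); // feed the error back; let the model retry
      continue;
    }
    if (APPROVAL_REQUIRED.has(call.name) && !approve(call)) {
      messages.push(`tool ${call.name} denied by human`);
      continue;
    }
    const result = tools[call.name](call.arguments);
    messages.push(`tool ${call.name} returned: ${result}`); // feed results back
  }
  return 'stopped: iteration limit reached';
}
```

The same skeleton is what LangGraph gives you as a state machine; the manual version is mainly useful for understanding what the framework does underneath.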
Common Mistakes
- Too many tools. Agents lose focus past 10 tools.
- No iteration limit. Runaway loops cost thousands.
- Ambiguous tool descriptions. AI picks wrong tool.
- Skipping human approval. One wrong email destroys trust.
Top Tools
| Tool | Use |
| --- | --- |
| LangGraph | State machine for agents |
| CrewAI | Multi-agent collaboration |
| AutoGen | Conversational multi-agent |
| Langfuse | Agent observability |
| Pydantic | Tool schema |
FAQs
Single agent or multi-agent? Single agent for < 5 steps. Multi-agent when tasks are specialized (researcher + writer + editor).
How expensive are agents? $0.10 - $2 per successful task depending on steps.
Can agents browse the web? Yes — Playwright MCP or Browserbase as a tool.
Do agents hallucinate tool calls? Sometimes — validate arguments with Zod before executing.
How do I debug agent failures? Replay from Langfuse traces step by step.
What about agent security? Sandbox file and shell tools. Never expose raw DB access.
Conclusion
Agents shift AI from chatbot to coworker in 2026. Start with LangGraph, five tools, and Langfuse. Always checkpoint destructive actions. Misar Dev builds agents from a natural-language spec.