I counted 47 agent frameworks on GitHub with more than 1,000 stars. Forty-seven. That's not an ecosystem - that's a land grab.
Most of these won't exist in a year. Some will merge. Some will pivot. Some will just stop getting updates. But right now, if you're trying to pick one, the options are overwhelming.
So let me save you some time. I've used or evaluated most of the major ones. Here's what actually works.
The tier list
Tier 1: Production-ready, actively maintained, real users
- OpenClaw - Single-agent-with-skills model. Excellent for personal/business automation where one capable agent with the right tools beats a team of specialists. Strong skill system, good memory, solid MCP support.
- LangGraph - The serious successor to LangChain. Graph-based workflow definition. Steep learning curve but very flexible. Good for complex multi-step workflows.
- AutoGen (Microsoft) - Best for research and multi-agent conversations. The agent-to-agent dialogue system is genuinely novel. Overkill for simple tasks.
Tier 2: Good but with caveats
- CrewAI - Easy to get started, good for prototyping. Struggles at scale. The "crew" metaphor is intuitive but the orchestration under the hood is simplistic.
- Agency-agents - Strong multi-agent coordination. Still maturing. See my detailed review.
- Qwen-Agent - Best if you're using Qwen models. MCP support is solid. Documentation is improving but still rough in places.
Tier 3: Interesting but niche
- Jido - Elixir-based, fascinating architecture, tiny community. Perfect if you're already in the Elixir ecosystem.
- Swarm (OpenAI) - Intentionally simple. Good for learning concepts. Not meant for production.
Tier 4: Avoid for now
Everything that launched in the last 3 months and has more GitHub stars than actual users. You know the ones. Great README, impressive demo, zero production deployments.
What actually matters in a framework
After using or evaluating these frameworks, here's what separates the ones that work from the ones that don't:
Error recovery. What happens when an LLM call fails? When a tool throws an exception? When an agent gets stuck in a loop? Bad frameworks crash. Mediocre frameworks retry. Good frameworks have a strategy - fallbacks, checkpoints, graceful degradation.
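To make that concrete, here's a minimal sketch of the retry-then-fallback-then-degrade strategy, framework-agnostic. The `call_llm` callable and the model names are illustrative placeholders, not any framework's real API:

```python
import time

def call_with_recovery(call_llm, prompt, models=("big-model", "small-model"),
                       retries=2, fallback="Sorry, I couldn't complete that."):
    """Retry each model with backoff, fall back down the chain,
    and degrade gracefully instead of crashing."""
    for model in models:                      # fallback chain: big model first
        for attempt in range(retries):
            try:
                return call_llm(model, prompt)
            except Exception:
                time.sleep(0.01 * 2 ** attempt)  # exponential backoff
    return fallback                           # graceful degradation, not a crash
```

The point isn't this exact code: it's that a good framework gives you these hooks, and a bad one makes you bolt them on yourself.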
Cost awareness. Every LLM call costs money. Frameworks that let agents make unlimited calls without any budget tracking will bankrupt you in production. Look for token counting, budget limits, and cost-per-task tracking.
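The shape of budget tracking is simple enough to sketch in a few lines. The price constant and the hard-stop-on-overrun behavior are assumptions for illustration; real pricing varies by model:

```python
class BudgetExceeded(Exception):
    pass

class CostTracker:
    """Track spend per task and hard-stop when the budget is blown."""
    def __init__(self, budget_usd, price_per_1k_tokens=0.01):  # illustrative price
        self.budget = budget_usd
        self.price = price_per_1k_tokens
        self.spent = 0.0

    def record(self, tokens):
        self.spent += tokens / 1000 * self.price
        if self.spent > self.budget:
            raise BudgetExceeded(
                f"spent ${self.spent:.4f} of ${self.budget:.2f} budget")
```

Whether you want a hard stop or just an alert is a design choice, but the framework should at least surface the numbers.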
Observability. Can you see what your agent is doing? What tools it's calling? What decisions it's making? If you can't trace an agent's execution path, you can't debug it. And you will need to debug it.
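A minimal version of this is just a decorator that logs every tool call with its inputs and output, so you can replay the agent's path afterwards. The trace format here is made up for the example:

```python
import functools

def traced(trace_log):
    """Wrap a tool so each call is appended to trace_log."""
    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(*args, **kwargs):
            result = tool(*args, **kwargs)
            trace_log.append({"tool": tool.__name__,
                              "args": args, "kwargs": kwargs,
                              "result": result})
            return result
        return wrapper
    return decorator
```

Real frameworks give you richer traces (timings, token counts, nested spans), but if you can't get at least this, walk away.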
Tool integration model. How does the framework handle tools? Is it easy to add new ones? Can you use MCP servers? Are tools isolated from each other? The tool system is where you'll spend most of your development time.
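Most frameworks converge on some variant of a registry: tools declare a name and description, the registry hands the descriptions to the LLM, and dispatches calls by name. A stripped-down sketch (names and structure are illustrative):

```python
class ToolRegistry:
    """Register tools by name; dispatch calls; expose specs for the LLM."""
    def __init__(self):
        self._tools = {}

    def register(self, name, description):
        def decorator(fn):
            self._tools[name] = {"fn": fn, "description": description}
            return fn
        return decorator

    def call(self, name, **kwargs):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name]["fn"](**kwargs)

    def describe(self):
        # roughly what you'd serialize into the model's tool specs
        return {n: t["description"] for n, t in self._tools.items()}
```

When you evaluate a framework, check how far its tool system is from this simple shape, and whether it adds the things that matter on top: isolation, validation, and MCP interop.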
Memory. Does the framework support persistent memory across sessions? Can the agent learn from past interactions? This is the difference between a disposable script and a useful assistant.
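The baseline version of cross-session memory is just state that survives the process: a key-value store backed by a file. Real frameworks use vector stores or databases; this sketch only shows the shape, and the class name is made up:

```python
import json
import os

class PersistentMemory:
    """Key-value memory backed by a JSON file, so a new session
    picks up where the last one left off."""
    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            with open(path) as f:
                self.data = json.load(f)

    def remember(self, key, value):
        self.data[key] = value
        with open(self.path, "w") as f:
            json.dump(self.data, f)

    def recall(self, key, default=None):
        return self.data.get(key, default)
```

If a framework can't do at least this, every conversation starts from zero, and you've built a disposable script.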
The dirty secret
Here's what nobody wants to admit: the framework matters less than you think. The hard parts of building agent systems are:
- Defining what the agent should actually do
- Building reliable tool integrations
- Handling edge cases and failures
- Managing costs
The framework handles maybe 20% of the work. The other 80% is your custom code, your prompts, your tools, and your operational infrastructure.
Pick a framework from Tier 1 or 2, commit to it, and spend your time on the hard stuff. Switching frameworks every month because the new one has better benchmarks is a waste of time.
My recommendation
If you're just starting: CrewAI for prototyping, then migrate to something more robust when you understand your requirements.
If you want production reliability with a single capable agent: OpenClaw.
If you need complex multi-agent workflows: LangGraph or agency-agents.
If you're in a research context: AutoGen.
If you're an Elixir dev: Jido, obviously.
Stop framework shopping. Start building.