Agency-agents hit 10.6K stars this week, making it one of the fastest-growing agent frameworks on GitHub right now. The pitch is bold: build an AI agency where agents collaborate like a team of humans.
I spent a few days actually using it. Here's what I found.
The premise
The idea is straightforward. Instead of one agent doing everything, you define a team. A researcher agent gathers information. A writer agent produces content. A reviewer agent checks quality. A project manager agent coordinates everyone.
Each agent has its own system prompt, tools, and memory. They communicate through a shared message bus. The framework handles orchestration, retries, and conflict resolution.
On paper, this sounds like exactly what everyone building agent systems wants.
What actually works
The role definition system is genuinely good. You define an agent's capabilities declaratively - what tools it has access to, what its expertise is, what it should and shouldn't do. The framework enforces these boundaries, so your researcher agent can't accidentally start writing code.
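The boundary enforcement can be sketched in plain Python. This is an illustrative sketch, not agency-agents' actual API — `Role`, `Agent`, and `ToolNotPermitted` are hypothetical names standing in for whatever the framework really calls these things:

```python
# Illustrative sketch of declarative role boundaries; all names are
# hypothetical, not the real agency-agents API.
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    name: str
    system_prompt: str
    allowed_tools: frozenset  # the declared capability boundary


class ToolNotPermitted(Exception):
    pass


class Agent:
    def __init__(self, role, tools):
        self.role = role
        # Only tools inside the role's declared boundary are ever registered.
        self._tools = {n: f for n, f in tools.items() if n in role.allowed_tools}

    def use_tool(self, name, *args, **kwargs):
        if name not in self._tools:
            raise ToolNotPermitted(
                f"{self.role.name!r} is not allowed to use {name!r}"
            )
        return self._tools[name](*args, **kwargs)


researcher = Agent(
    Role("researcher", "You gather information.", frozenset({"web_search"})),
    tools={"web_search": lambda q: f"results for {q}", "write_code": lambda s: s},
)

print(researcher.use_tool("web_search", "agent frameworks"))
# researcher.use_tool("write_code", "x = 1") would raise ToolNotPermitted
```

The point is that the boundary is data, not convention: the disallowed tool is never even attached to the agent, so there's nothing for a wayward prompt to invoke.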
The communication system is clean. Agents can send messages to specific other agents or broadcast to the group. There's a built-in protocol for requesting help, delegating tasks, and reporting results.
Error handling is solid. If an agent fails, the framework can reassign its task to another agent with similar capabilities. This is the kind of thing that sounds obvious but most frameworks completely ignore.
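Capability-based reassignment is simple to sketch, which makes it stranger that most frameworks skip it. This is my own illustration of the idea, not agency-agents' internals:

```python
# Sketch of capability-based task reassignment: if the assigned agent fails,
# try the next agent whose declared capabilities cover the task's needs.
def run_with_reassignment(task, agents):
    """task: {'name': str, 'needs': set}; agents: name -> (capabilities, fn)."""
    eligible = [n for n, (caps, _) in agents.items() if task["needs"] <= caps]
    last_error = None
    for name in eligible:
        _, fn = agents[name]
        try:
            return name, fn(task["name"])
        except Exception as e:  # a real framework would be more selective here
            last_error = e  # fall through and try the next capable agent
    raise RuntimeError(f"no agent could complete {task['name']}") from last_error


def flaky(task_name):
    raise TimeoutError("model call timed out")


agents = {
    "writer_a": ({"writing"}, flaky),
    "writer_b": ({"writing", "editing"}, lambda t: f"{t}: done"),
    "coder": ({"coding"}, lambda t: f"{t}: done"),
}

print(run_with_reassignment({"name": "draft intro", "needs": {"writing"}}, agents))
```

Here `writer_a` times out and the task lands on `writer_b`; `coder` is never considered because its capabilities don't cover the task.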
What doesn't work (yet)
Cost management is basically nonexistent. A team of five agents, each making multiple LLM calls per task, burns through API credits fast. There's no built-in budget tracking, rate limiting, or cost optimization. In production, this is a dealbreaker unless you add your own layer.
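If you do add your own layer, the core of it is small. A minimal sketch of a budget guard that every LLM call routes through — the price constant and call shape are my assumptions, not anything the framework provides:

```python
# One way to add the missing cost layer yourself: a budget guard every LLM
# call passes through. Prices and token counts are illustrative assumptions.
class BudgetExceeded(Exception):
    pass


class BudgetGuard:
    def __init__(self, limit_usd, price_per_1k_tokens):
        self.limit_usd = limit_usd
        self.price = price_per_1k_tokens
        self.spent_usd = 0.0

    def charge(self, tokens):
        # Refuse the call *before* spending past the limit.
        cost = tokens / 1000 * self.price
        if self.spent_usd + cost > self.limit_usd:
            raise BudgetExceeded(
                f"would spend ${self.spent_usd + cost:.2f}, "
                f"limit is ${self.limit_usd:.2f}"
            )
        self.spent_usd += cost
        return cost


# Five agents sharing one $1.00 budget for a task.
guard = BudgetGuard(limit_usd=1.00, price_per_1k_tokens=0.01)
guard.charge(50_000)  # $0.50
guard.charge(40_000)  # $0.90 total
# guard.charge(20_000) would raise BudgetExceeded instead of overspending
print(f"${guard.spent_usd:.2f}")
```

Sharing one guard across the whole team, rather than one per agent, is what keeps a chatty five-agent run from quietly multiplying your bill.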
The "AI agency" metaphor breaks down for simple tasks. If you just need to summarize a document, spinning up a team of agents is overkill. The framework doesn't have a good story for when to use a single agent versus a team.
Memory isolation between agents is too strict by default. In real teams, knowledge flows informally. Someone overhears a conversation and realizes they have relevant context. In agency-agents, agents only know what they're explicitly told. You can work around this, but the defaults fight you.
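The workaround I ended up with is a shared, opt-in context log alongside each agent's private memory. This is my own pattern, sketched in plain Python, not a built-in agency-agents feature:

```python
# Sketch of a workaround for strict memory isolation: an append-only shared
# context that agents opt into, alongside their private memory.
class SharedContext:
    def __init__(self):
        self._log = []

    def note(self, author, fact):
        self._log.append((author, fact))

    def recall(self, exclude_author=None):
        return [f for a, f in self._log if a != exclude_author]


class Agent:
    def __init__(self, name, shared):
        self.name = name
        self.private_memory = []
        self.shared = shared

    def observe(self, fact, share=False):
        self.private_memory.append(fact)
        if share:  # opt in: let knowledge flow to the rest of the team
            self.shared.note(self.name, fact)

    def context(self):
        # What the agent "knows": its own memory plus shared notes from others.
        return self.private_memory + self.shared.recall(exclude_author=self.name)


shared = SharedContext()
researcher = Agent("researcher", shared)
writer = Agent("writer", shared)

researcher.observe("client prefers short paragraphs", share=True)
writer.observe("draft is due Friday")

print(writer.context())
```

It's a rough approximation of informal knowledge flow: the writer picks up the researcher's shared note without being explicitly messaged, which is exactly what the strict defaults prevent.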
The honest comparison
How does agency-agents compare to other frameworks? Here's my take:
Against CrewAI: Agency-agents has better orchestration and more flexible communication patterns. CrewAI is simpler to get started with. If you need a quick prototype, CrewAI. If you need production coordination, agency-agents.
Against AutoGen: AutoGen is more research-oriented. Agency-agents is more practical. AutoGen gives you more control over the conversation dynamics. Agency-agents gives you more production-ready defaults.
Against OpenClaw: Different philosophy entirely. OpenClaw is built around a single capable agent with a skill system. Agency-agents is built around multiple specialized agents. The right choice depends on whether your problem is better solved by one smart agent or many coordinated ones.
Can you actually run an AI agency with it?
Depends on what you mean by "agency." Can you set up a team of agents that collaborates on content creation, code generation, or data analysis? Yes, and it works reasonably well.
Can you replace a team of human workers? No. And anyone telling you otherwise is selling something.
The realistic use case is augmentation. Your AI agency handles the routine, high-volume work - initial research, first drafts, data extraction, code scaffolding. Humans handle the judgment calls, creative direction, and quality assurance.
Should you use it?
If you're building a system where multiple agents genuinely need to collaborate (not just run in sequence), agency-agents is worth evaluating. The orchestration layer is mature, the communication system is flexible, and the community is active.
If your use case is "one agent that does several things," you don't need this. A single agent with good tools will be simpler, cheaper, and easier to debug.
The 10.6K stars aren't just hype. There's real substance here. But go in with realistic expectations about what "AI agency" actually means in practice.