Building Developer Teams with AI Agents: The Squad Approach

Posted by u/296626 Stack · 2026-05-03 11:09:24

In recent months, AI-assisted software development has shifted dramatically. AI-generated pull requests and security reports went from worthless to finding real bugs, and tools like Anthropic's Claude Mythos have exposed critical vulnerabilities in both open-source and proprietary projects. But while large teams can handle the influx of fixes, smaller projects, often run by a few volunteers, struggle to keep up. This productivity gap has sparked interest in agent harnesses that orchestrate specialized AI agents to work alongside human developers. Enter Squad, an open-source project from Microsoft that builds a team of coding agents around GitHub Copilot. Below, we explore the key questions and answers about this emerging approach.

What changed in AI-generated code reports to make them suddenly useful?

Linux kernel maintainer Greg Kroah-Hartman noted a surprising shift: after more than a year of worthless AI-based pull requests and security reports (often dismissed as 'slop'), those reports became genuinely useful within a short span. Kroah-Hartman initially attributed this to improved tools and a better understanding of how to use them. Subsequent events, such as the discovery of critical bugs by Anthropic's Claude Mythos, confirmed that AI models had become more accurate and effective at identifying real vulnerabilities. The difference likely stems from advances in large language model (LLM) prompting techniques, better grounding in code structure, and better training data, which together let AI produce actionable results rather than noise.

Building Developer Teams with AI Agents: The Squad Approach — Source: www.infoworld.com

What is the current developer productivity crisis?

The rise in AI-driven security audits has led to a flood of newly discovered critical vulnerabilities in software projects. While large, well-funded teams can assign multiple developers to address these issues, smaller projects—often maintained by one or two people in their spare time—face an overwhelming workload. There simply aren't enough skilled developers to fix all the bugs within tight deadlines. This creates a productivity crisis: code needs fixing immediately, but the human resources to do so are scarce. The situation mirrors the broader shortage of software engineers, compounded by the need to address technical debt and keep pace with AI red teams that can find flaws faster than ever.

Can AI agents help bridge the productivity gap?

Yes, but with careful implementation. Agent harnesses like OpenClaw provide frameworks for coordinating multiple AI agents to tackle different aspects of software development—from front-end coding to testing. These tools can act as force multipliers, enabling a small human team to manage the increased workload. However, general-purpose LLM-based agents still suffer from inaccuracies and hallucinations. The key is to ground agents in structured data (like code and APIs) and follow a defined methodology, such as spec-driven development. When properly configured, agents can handle repetitive tasks, generate patches, and even validate fixes, freeing human developers to focus on complex problem-solving. Squad exemplifies this approach by assembling a dedicated agent team around GitHub Copilot.

What is Squad and who created it?

Squad is an open-source project developed by Brady Gaster, Principal PM Architect in the CoreAI Apps and Agents team at Microsoft. It builds an agent harness around GitHub Copilot, orchestrating a team of specialized agents that work alongside human developers. Designed for easy installation—just a single CLI call—Squad creates agents with distinct roles: a developer lead, a front-end developer, a back-end developer, and a test engineer. Optionally, it can include a documentation writer. This structure mirrors a real-world software team, allowing developers to delegate tasks and collaborate with AI counterparts. Squad is currently available on GitHub and aims to address the productivity crisis by providing a ready-made AI team for any project.

How does Squad's agent team work in practice?

After installation, Squad sets up a system of agents that communicate and coordinate using the agent harness. The 'developer lead' agent plans tasks and delegates work to specialized agents (front-end, back-end, test engineer). Each agent uses GitHub Copilot's code generation capabilities within a disciplined workflow. For example, the test engineer agent writes unit tests for new features, while the back-end agent implements APIs. Agents can also review each other's output. Developers interact through natural language commands or by committing code, and Squad handles the orchestration—assigning tasks, merging results, and ensuring consistency. This approach leverages the structured nature of code to reduce hallucinations, as each agent operates within a defined context and set of rules.
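
The lead-and-specialist pattern described above can be sketched in a few lines. This is a minimal illustration only, not Squad's actual code or API: the `Task`, `Agent`, and `Lead` classes, the role names, and the fixed three-step plan are all hypothetical, and a real agent would call a model where the stub below just records the work.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    description: str
    role: str  # role of the specialist that should handle this task


@dataclass
class Agent:
    role: str
    completed: list = field(default_factory=list)

    def handle(self, task: Task) -> str:
        # Hypothetical stub: a real agent would invoke a model here.
        result = f"[{self.role}] done: {task.description}"
        self.completed.append(result)
        return result


class Lead:
    """Hypothetical 'developer lead' that plans work and delegates by role."""

    def __init__(self, team):
        self.team = {agent.role: agent for agent in team}

    def plan(self, feature: str):
        # Fixed plan for illustration; a real lead would derive it from a spec.
        return [
            Task(f"implement API for {feature}", "back-end"),
            Task(f"build UI for {feature}", "front-end"),
            Task(f"write unit tests for {feature}", "test"),
        ]

    def delegate(self, feature: str):
        results = []
        for task in self.plan(feature):
            agent = self.team.get(task.role)
            if agent is None:
                raise KeyError(f"no agent for role {task.role!r}")
            results.append(agent.handle(task))
        return results


team = [Agent("front-end"), Agent("back-end"), Agent("test")]
lead = Lead(team)
results = lead.delegate("login")
for line in results:
    print(line)
```

Routing each task by role keeps every agent inside a narrow, well-defined context, which is the property the article credits with reducing hallucinations.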

What are the main benefits and limitations of using Squad?

Benefits:

Productivity boost: A small team can match the output of a larger one, addressing the developer shortage.
Rapid bug fixing: Agents can quickly generate patches for critical vulnerabilities found by AI red teams.
Technical debt reduction: Automated agents help clean up legacy code in parallel with new development.
Structured collaboration: Role-based agents mimic human teams, making integration smoother.

Limitations:

Token costs: Running multiple LLM agents can be expensive, though Squad's narrow focus on software development keeps usage lower than a general-purpose harness.
Hallucinations: Despite grounding, agents may still produce incorrect code or logic.
Dependency on Copilot: Squad is currently tightly integrated with GitHub Copilot, which limits flexibility.
Learning curve: Developers must adapt to working with AI teammates and trust their outputs.

Overall, Squad represents a promising step toward agent-augmented development, but human oversight remains essential.
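
To make the token-cost limitation concrete, here is a rough back-of-the-envelope sketch. Every number in it is a placeholder chosen for illustration, not a measured figure or a real price: the `estimate_cost` helper, the 200,000 tokens per agent, and the $10 per million tokens are all assumptions.

```python
def estimate_cost(agents: int, tokens_per_agent: int, price_per_million: float) -> float:
    """Return an estimated cost in dollars for one run of the agent team."""
    total_tokens = agents * tokens_per_agent
    return total_tokens / 1_000_000 * price_per_million


# Placeholder inputs: 200,000 tokens per agent at $10 per million tokens.
single = estimate_cost(1, 200_000, 10.0)
four_agents = estimate_cost(4, 200_000, 10.0)  # lead, front-end, back-end, test
print(f"single agent: ${single:.2f}, four agents: ${four_agents:.2f}")
```

Under these assumptions cost grows linearly with the number of agents, so a four-role team costs about four times as much per run as a single agent doing a comparable amount of work.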

How does Squad compare to other agent harnesses like OpenClaw?

While general-purpose agent harnesses like OpenClaw can orchestrate various AI services across multiple domains, Squad is purpose-built for software development. It limits its scope to coding, testing, and documentation, which allows for deep integration with development tools (e.g., GitHub Copilot) and a tailored workflow. OpenClaw is more flexible but can be prohibitively expensive due to high token consumption across many services. Squad, by focusing on a specific domain and using Copilot’s code-completion API, reduces token costs and provides a more predictable experience. Additionally, Squad’s pre-defined team roles (lead, front-end, back-end, test) give it a structure that mirrors real teams, whereas OpenClaw requires custom configuration. The trade-off is that Squad is less adaptable for non-software tasks.
