My experiments with AI

A software engineer's journal

Choosing the Right AI Agent Framework — A 2025 Guide for Builders

🧭 1. The Framework Maze

(A data-driven comparison of adoption, ecosystem fit, and best use cases.)

In 2025, we’re spoiled for choice when it comes to agent frameworks. Every major AI vendor — AWS, Google, Microsoft, Anthropic, OpenAI — has its own SDK. Meanwhile, open-source ecosystems like LangGraph, CrewAI, and LlamaIndex continue to evolve faster than the clouds can catch up.

Two recent big releases have also reshaped the landscape:

Anthropic’s Claude Agent SDK (released September 2025) — focused on long-context reasoning, subagents, and safe orchestration.

OpenAI’s AgentKit (released October 3, 2025) — a full-stack SDK for building, deploying, and evaluating AI agents with tool use, connectors, and guardrails baked in.

Together, these mark a turning point from “prompt-chaining” to production-grade agent platforms. OpenAI’s AgentKit bundles a visual workflow builder, connector registry, embedded chat UI, and built-in evaluation tools into one integrated stack. Claude’s SDK, built on the infrastructure of Claude Code, brings automatic context compaction, subagents, rich tool permissions, and session management.
With those superpowers entering the field, the big question is: which framework should you use, and when? This guide walks you through:

  • Adoption and usage data
  • Feature comparisons
  • Deep dives of each framework
  • A decision matrix based on your use case
  • Strategic advice on mixing frameworks

Let’s start with a snapshot of how these frameworks compare in popularity and ecosystem fit.


📊 2. Adoption Snapshot & Feature Comparison — What the Data Says (2025)

| Framework | Stars / Community | Downloads / Activity | Ecosystem Fit | Users & Use Cases |
| --- | --- | --- | --- | --- |
| LangChain / LangGraph | Very high | Tens of millions of installs | Cloud-agnostic | Startups, AI teams, freelancers |
| LlamaIndex | High | Widely used in RAG stacks | Vendor-neutral, data connectors | Knowledge agents, internal QA agents |
| Claude Agent SDK | Growing (GitHub repo) | New releases via pip/npm/GitHub | Strong in Claude/Anthropic ecosystem | Retrieval + long-context agents |
| OpenAI AgentKit | Just released | Early usage but strong buzz | Tightly integrated with OpenAI / Responses API | Rapid prototyping and production agents |
| AWS Strands SDK | Early stage | Limited public metrics | Best in AWS / Bedrock contexts | Enterprise agent deployments on AWS |
| CrewAI | Moderate | Growing community | Cloud-agnostic | Agent teams, role delegation systems |
| AutoGen (Microsoft) | Strong in OSS | Growing usage | Good hybrid across clouds | Conversational multi-agent flows |
| Semantic Kernel | Mature | Enterprise users | Microsoft / Azure stack | Corporate assistants and plugins |
| Haystack | Established | Many enterprise deployments | Open source, cloud-agnostic | Document Q&A, search + RAG |
| IBM Bee / Smol Agents | Niche | Low to moderate | Tied to specific ecosystems | Enterprise orchestration / prototyping |

Note: “Downloads / Activity” is approximate and reflects active community engagement, not always commercial usage. Use these as directional signals, not definitive proof.

Here’s a high-level breakdown of the most critical dimensions to compare across frameworks:

| Feature | Why It Matters | What to Look For |
| --- | --- | --- |
| Context & Memory Management | Agents that forget or blow their context fail | Summarization, compaction, subagent splitting |
| Orchestration & Multi-Agent | Coordinating agents or roles is core for complex agents | Graph flow engines, agent-to-agent calls |
| Tool / Connector Support | Agents must do work (APIs, DBs, file ops) | Registry, permissioning, plugin support |
| Observability / Trace / Eval | Debugging agents is hard without visibility | Trace logs, eval scoring, prompt optimization |
| Deployment & Portability | You may want to run on different clouds | Container support, vendor-agnosticism, hybrid flow |
| Security & Governance | Agents interacting with your systems must be safe | Guardrails, permissions, least-privilege tools |
| Ecosystem / Adoption | Frameworks with communities offer more integrations | Plugins, templates, third-party tools |

As you read each deep dive below, consider which features are strong, which are missing, and how that maps to what you need.
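
To make the Security & Governance row above concrete, here is a minimal, framework-agnostic sketch of least-privilege tool permissioning. The registry class, permission strings, and tool names are all invented for illustration; real SDKs (AgentKit guardrails, Claude tool permissions, Strands) implement richer versions of this same idea.

```python
# Minimal sketch of least-privilege tool access for an agent.
# The registry API and permission names are illustrative, not from any SDK.

class ToolRegistry:
    def __init__(self):
        self._tools = {}  # name -> (callable, required permission)

    def register(self, name, fn, permission):
        self._tools[name] = (fn, permission)

    def call(self, name, agent_permissions, *args):
        fn, needed = self._tools[name]
        if needed not in agent_permissions:
            raise PermissionError(f"agent lacks '{needed}' for tool '{name}'")
        return fn(*args)

registry = ToolRegistry()
registry.register("read_file", lambda path: f"<contents of {path}>", "fs:read")
registry.register("delete_file", lambda path: f"deleted {path}", "fs:write")

# A read-only agent can read but not delete.
agent_perms = {"fs:read"}
print(registry.call("read_file", agent_perms, "notes.txt"))
try:
    registry.call("delete_file", agent_perms, "notes.txt")
except PermissionError as e:
    print("blocked:", e)
```

The point is that the permission check lives in the registry, not in each tool — the agent can only widen its reach by being granted new scopes, never by calling differently.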


🧩 3. Framework Deep Dives — Features, Ecosystem Fit & Use Cases


LangGraph (LangChain) — The Orchestrator Everyone Builds On

Overview:
LangGraph brings graph-based reasoning to LangChain, letting developers build stateful, multi-step, resilient agent workflows. It structures LLM agents by modelling an agent’s logic as a state machine with persisted memory. Agents have short‑term memory (conversation state) that is automatically saved via a checkpointer so sessions can be resumed, and a long‑term memory store that persists user or application data across sessions. This makes LangGraph ideal for complex flows that need to recall prior context. It’s popular in the open‑source community (tens of thousands of GitHub stars) and is used across cloud environments.

📈 Adoption: Most popular open agent framework by GitHub activity and ecosystem integration.
🎯 Target Users: AI engineers, startups, researchers.
☁️ Ecosystem Fit: Works across AWS, GCP, Azure, local, and open-source LLMs.
🧩 Best For: Complex orchestration, RAG pipelines, and multi-agent reasoning; developers who want fine‑grained control over agent state and cross‑platform portability (vendor‑neutral).
📌 Use Case Example:
Multi-agent legal research assistant (retriever → analyzer → summarizer).
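
To show the shape of that retriever → analyzer → summarizer flow without pulling in LangGraph itself, here is a framework-free sketch of the same idea: each node takes the shared state dict and returns a partial update, and the runner checkpoints after every step. The node bodies and the `run_graph` helper are invented for illustration; this mirrors LangGraph’s style but is not LangGraph API.

```python
# Framework-free sketch of a LangGraph-style pipeline: nodes read shared
# state and return partial updates, as in the legal-research example above.

def retriever(state):
    return {"docs": [f"case law about {state['question']}"]}

def analyzer(state):
    return {"analysis": f"analyzed {len(state['docs'])} document(s)"}

def summarizer(state):
    return {"summary": f"{state['analysis']}; question was: {state['question']}"}

def run_graph(nodes, state):
    # Run nodes in order, merging each partial update into the state,
    # and checkpoint after every step so a run could be resumed.
    checkpoints = []
    for node in nodes:
        state = {**state, **node(state)}
        checkpoints.append(dict(state))
    return state, checkpoints

final, history = run_graph([retriever, analyzer, summarizer],
                           {"question": "fair use"})
print(final["summary"])
```

What LangGraph adds on top of this toy loop is exactly what the Overview describes: a persisted checkpointer, conditional edges, and resumable sessions.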


AutoGen (Microsoft) — Conversational Multi-Agent Collaboration with Human-in-the-Loop

Overview:
Microsoft’s AutoGen stands out for its elegant approach to orchestrating multi-agent conversations — not just between AI models, but between humans and AIs as peers in the same loop. Where most frameworks focus on single-agent autonomy, AutoGen is designed for collaboration and coordination. What sets AutoGen apart is its human-in-the-loop (HIL) design philosophy. Humans can join the conversation at any point, injecting feedback or context mid-session, creating a tightly coupled feedback cycle that enhances reliability and trust. This has made AutoGen the go-to framework for collaborative copilots and research assistants that balance AI autonomy with human oversight.
📈 Adoption: Rapid growth in the open-source community (~50K stars). Strong adoption among researchers and Azure AI teams.
🎯 Target Users: AI engineers and applied researchers building multi-agent copilots or human-supervised AI systems.
☁️ Ecosystem Fit: Best on Azure AI and OpenAI APIs, but extensible to local or hybrid environments.
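
The human-in-the-loop design philosophy can be sketched in a few lines of plain Python (not AutoGen API — the function names here are invented): the model drafts a reply, and an optional human callback may revise it before it is committed to the session.

```python
# Illustrative human-in-the-loop turn loop (not AutoGen's API): a human
# reviewer can inject feedback mid-session; with no reviewer, the draft stands.

def model_reply(message):
    # Stand-in for an LLM call.
    return f"draft answer to: {message}"

def run_turn(message, human_review=None):
    draft = model_reply(message)
    return human_review(draft) if human_review else draft

auto = run_turn("summarize Q3 results")
supervised = run_turn("summarize Q3 results",
                      human_review=lambda d: d + " [approved with edits]")
print(auto)
print(supervised)
```

In AutoGen the same hook appears at the conversation level, so a person can step into any agent-to-agent exchange rather than just the final answer.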


CrewAI — Autonomous Role-Based Teams

Overview:
CrewAI simplifies multi-agent collaboration using role definitions and automatic task delegation. It is a Python framework designed for autonomous crews of agents — independent of LangChain, yet offering high‑level simplicity with low‑level control. Crews define role‑based agents (researcher, analyst, writer, etc.) with flexible tool access and intelligent collaboration; agents share insights and coordinate tasks. It also introduces Flows, event‑driven orchestrations that allow fine‑grained control over execution with native crew integration. With over 100k developers enrolled in its community courses, CrewAI is gaining traction in the startup and enterprise automation space.
📈 Adoption: Rapid community growth; many LangChain + CrewAI hybrids.
🎯 Target Users: Indie developers, automation researchers.
☁️ Ecosystem Fit: Vendor-neutral, runs anywhere Python does.
🧩 Best For: Teams wanting role‑based multi‑agent systems and flows without committing to a specific model provider (vendor‑neutral).
📌 Use Case: Automated news summarization where “Researcher,” “Writer,” and “Editor” coordinate asynchronously.
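
A role-based crew boils down to a delegation chain where each role’s output becomes the next role’s input. Here is an illustrative sketch of that news-summarization example in plain Python (the role functions and `run_crew` helper are invented; this is not CrewAI API):

```python
# Illustrative role-based crew (not CrewAI's API): each role is a function,
# and work product flows Researcher -> Writer -> Editor as in the example above.

ROLES = {
    "researcher": lambda topic: {"notes": f"3 sources on {topic}"},
    "writer":     lambda work:  {"draft": f"article from {work['notes']}"},
    "editor":     lambda work:  {"final": work["draft"].capitalize() + "."},
}

def run_crew(topic):
    work = ROLES["researcher"](topic)
    work.update(ROLES["writer"](work))
    work.update(ROLES["editor"](work))
    return work["final"]

print(run_crew("chip supply chains"))
```

CrewAI’s value is automating this wiring: roles declare goals and tools, and the framework decides who delegates to whom instead of the hard-coded chain above.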


Google ADK — Enterprise Multi-Agent on Vertex AI

Overview:
Google’s Agent Development Kit (ADK) brings multi-agent orchestration and role-based planning tightly integrated with Vertex AI and Gemini models. Google ADK is a flexible, modular framework that aims to make agent development feel like software engineering. It’s optimised for Google’s Gemini models but is both model‑ and deployment‑agnostic. ADK supports sequential, parallel and loop workflow agents for deterministic pipelines as well as dynamic LLM‑driven routing. Because it integrates natively with Vertex AI and BigQuery, ADK suits Google Cloud users.
📈 Adoption: Limited to enterprise beta; expected growth via GCP.
🎯 Target Users: GCP-native enterprise developers.
☁️ Ecosystem Fit: Tied to Google ecosystem; deep Vertex integration.
🧩 Best For: Enterprises already invested in GCP who need robust orchestration and integrated code execution.
📌 Use Case: Cloud optimization agent that autonomously manages GCP workloads.


OpenAI AgentKit — Production-Grade Agent SDK for Builders

Overview:
AgentKit is a lightweight SDK for deploying OpenAI-powered agents that use APIs and tools with minimal setup. AgentKit provides a complete set of tools for building, deploying and optimising agents. It introduces Agent Builder, a visual canvas with drag‑and‑drop nodes and versioning; a Connector Registry for managing data sources (Dropbox, Google Drive, SharePoint, etc.); and ChatKit for embedding chat‑based agent interfaces. It also includes an Evals system with datasets, trace grading and automated prompt optimisation. Developers can enable guardrails to mask PII and detect jailbreaks. AgentKit is new but growing quickly; its tight integration with OpenAI models suits startups looking for rapid prototyping and built‑in evaluation.
📈 Adoption: Rapid developer uptake post-launch (similar trajectory to LangChain’s early phase).
🎯 Target Users: Developers and startups using OpenAI APIs.
☁️ Ecosystem Fit: Works best within OpenAI + Vercel + Azure ecosystem.
🧩 Best For: Teams already using OpenAI APIs who want an integrated, visual workflow builder and robust eval tools.
📌 Use Case:
Customer support agent calling internal APIs and generating reports.
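
The guardrails idea — masking PII before agent output reaches a user — is easy to illustrate with the standard library. This is not AgentKit’s implementation, just a sketch of the concept with two invented patterns (emails and US-style phone numbers):

```python
import re

# Illustrative PII-masking guardrail (not AgentKit's implementation):
# scrub email addresses and US-style phone numbers from agent output.

PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"), "[PHONE]"),
]

def mask_pii(text):
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

print(mask_pii("Contact jane.doe@example.com or 555-867-5309."))
```

A production guardrail runs checks like this (plus jailbreak detection) as a pipeline stage on every model response, not as an afterthought in application code.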


Semantic Kernel (Microsoft) — Memory and Planning for Enterprise AI

Overview:
Semantic Kernel provides memory, connectors, and planners for enterprise copilots in Office 365 and Azure. Semantic Kernel emphasises plugin‑based skills, memory abstractions and planners that break user requests into function calls. It integrates with Azure services and Microsoft 365, providing built‑in policy controls and type‑safe tools.
📈 Adoption: Widely used internally at Microsoft; open-source SDK seeing enterprise traction.
🎯 Target Users: Enterprise .NET and Python developers.
☁️ Ecosystem Fit: Microsoft 365, Copilot Studio, Azure AI.
🧩 Best For: Enterprises in the Microsoft ecosystem wanting to build AI copilots that interact with corporate data and apps.
📌 Use Case:
Personal productivity copilot integrating Outlook + Teams + Planner data.


🧬 AWS Strands SDK — Model-Driven Multi-Agent Orchestration

Overview:
Strands enables multi-agent orchestration, guardrails, and observability on AWS, deeply integrated with Bedrock. The Strands Agents SDK describes itself as a lightweight, production‑ready, code‑first framework. It emphasises a simple agent loop with full observability and tracing, supports both conversational and non‑conversational modes, and is model-, provider-, and deployment-agnostic. Strands supports multi‑agent collaboration and can run on AWS AgentCore, giving teams a managed runtime with session memory and Bedrock integration.
📈 Adoption: Early-stage, growing rapidly within AWS and partners.
🎯 Target Users: Enterprises, regulated workloads, Bedrock users.
☁️ Ecosystem Fit: Strongest on AWS; built for Bedrock and AgentCore integration.
🧩 Best For: Enterprises on AWS wanting a secure, scalable agent framework with built‑in observability.
📌 Use Case:
Compliance assistant querying S3, Lambda, and Bedrock Knowledge Bases.


Claude Agent SDK (Anthropic) — Long-Context + Safe Autonomy

Overview:
The Claude Agent SDK offers subagents, context compaction, and permissioned tool use — built for safe, coherent reasoning. It is designed for long‑context reasoning and provides subagent orchestration and automatic context compaction so an agent can manage large conversations without exceeding the model’s context window. The SDK includes guardrails for permissioned tool use and robust session management.
📈 Adoption: Fastest-growing closed SDK in 2025 (based on API metrics).
🎯 Target Users: Claude and RAG developers.
☁️ Ecosystem Fit: Anthropic ecosystem; compatible with AWS Bedrock.
🧩 Best For: Knowledge‑heavy applications where long context (100k+ tokens) and safety controls are critical.
📌 Use Case:
AI researcher analyzing long documents and summarizing findings with sources.
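
Automatic context compaction is worth seeing in miniature. The sketch below is not the Claude Agent SDK’s algorithm — the budget, the “1 word = 1 token” simplification, and the stub summary are all invented — but it shows the core loop: when the transcript exceeds a token budget, older turns are collapsed into a summary so the conversation can continue.

```python
# Illustrative context-compaction loop (not the Claude Agent SDK's API).

BUDGET = 12  # pretend token budget; 1 word == 1 token for this sketch

def token_count(messages):
    return sum(len(m.split()) for m in messages)

def compact(messages, keep_last=1):
    # Replace everything but the newest turn(s) with a stub summary;
    # a real system would have the model write this summary.
    old, recent = messages[:-keep_last], messages[-keep_last:]
    return [f"[summary of {len(old)} earlier turns]"] + recent

history = []
for turn in ["user asks about clause four of the contract",
             "agent cites the governing law section",
             "user asks for a plain english rewrite"]:
    history.append(turn)
    if token_count(history) > BUDGET:
        history = compact(history)

print(history)
```

The subtlety a real SDK handles for you is making the summary lossy in the right places — keeping decisions and citations while dropping filler — so the agent stays coherent across compactions.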


🧮 LlamaIndex — The RAG Powerhouse

Overview:
LlamaIndex bridges your data and LLMs via loaders, retrievers, and hybrid RAG pipelines. LlamaIndex (GPT Index) specialises in connecting data sources to LLMs. It offers a rich set of data connectors, advanced retrievers (hierarchical, sentence‑window, hybrid), and tools for compression and reranking. It’s widely used (millions of monthly downloads) and vendor‑agnostic, making it a go‑to for building retrieval‑augmented generation (RAG) systems.
📈 Adoption: Ubiquitous in RAG projects (10M+ downloads/month).
🎯 Target Users: Data engineers, applied AI builders.
☁️ Ecosystem Fit: Works across all clouds and vector DBs.
🧩 Best For: Knowledge retrieval agents that need to ingest and index diverse data sets across clouds.
📌 Use Case:
Ingesting documents from S3 and building a contextual retrieval assistant.
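
At its simplest, retrieval is “score documents against the query, hand the best match to the model as context.” The toy below uses keyword overlap instead of embeddings, and the S3-style paths and document texts are invented — it is not LlamaIndex API, just the skeleton LlamaIndex’s retrievers (hierarchical, sentence-window, hybrid) build on:

```python
# Illustrative keyword retriever (not LlamaIndex's API): rank documents by
# query-term overlap, as a stand-in for embedding-based retrieval.

DOCS = {
    "s3://bucket/onboarding.md": "how to onboard a new engineer laptop setup",
    "s3://bucket/expenses.md":   "travel expense policy and reimbursement",
    "s3://bucket/oncall.md":     "on call rotation and incident escalation",
}

def retrieve(query, k=1):
    terms = set(query.lower().split())
    scored = sorted(DOCS.items(),
                    key=lambda kv: len(terms & set(kv[1].split())),
                    reverse=True)
    return [path for path, _ in scored[:k]]

print(retrieve("expense reimbursement policy"))
```

Swap the overlap score for vector similarity and add reranking and compression, and you have the shape of a real LlamaIndex pipeline.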


🧾 Haystack (deepset) — Pipeline Framework for RAG + Search

Overview:
An open-source alternative to proprietary RAG systems, with first-class integrations for Elasticsearch and Hugging Face models. Haystack provides a pipeline‑oriented approach to RAG and question‑answering. It supports dense and sparse retrieval, flexible pipelines, and multimodal inputs, making it popular in enterprise search and Q&A deployments.
📈 Adoption: Mature in enterprise NLP projects.
🎯 Target Users: NLP and search engineers.
☁️ Ecosystem Fit: Cloud-agnostic, self-hosted or managed.
🧩 Best For: Teams wanting an open‑source, pipeline‑style framework for information retrieval and Q&A, regardless of cloud provider.
📌 Use Case:
Enterprise knowledge search with reranking and summarization nodes.


🧸 Smol Agents (Hugging Face) — Tiny, Multimodal, Educational

Overview:
Simplest entry point for multimodal agents (text, image, audio) with minimal code.
📈 Adoption: Popular for education and hackathons.
🎯 Target Users: Students, hobbyists, educators.
☁️ Ecosystem Fit: Vendor-neutral (Hugging Face Hub).
📌 Use Case:
Multimodal content agent for social media creators.



🧭 4. Which Framework Should You Use? (Decision Matrix)

| Use Case | Recommended Framework | Why |
| --- | --- | --- |
| Complex orchestration & control | LangGraph / LlamaIndex | Vendor-neutral, scalable, great for research + infra |
| Long-context safe reasoning | Claude SDK / Strands | Subagents, compaction, Bedrock integration |
| Cloud-native enterprise AI | ADK / Semantic Kernel / Strands | Tight vendor orchestration, security, observability |
| Fastest production deployment | OpenAI AgentKit / Strands / AutoGen | SDK-based, ready for production |
| Cross-cloud / hybrid deployment | LangGraph + LlamaIndex | Most portable combination |
| Educational / experimental | Smol Agents / CrewAI | Lightweight and open |

🧩 5. Closing Thoughts

Framework choice isn’t just technical — it’s strategic.
If you’re all-in on AWS, Strands or Claude SDK (via Bedrock) gives you managed observability and scale.
If you’re vendor-agnostic or building your own RAG stack, LlamaIndex + LangGraph is the most future-proof path.
For fast iteration and shipping, OpenAI AgentKit or AutoGen delivers with minimal ops.

The good news? These ecosystems are converging fast — orchestration and interoperability layers like LangGraph and LiteLLM mean you can mix and match frameworks as your system matures. Many teams use LlamaIndex for retrieval, LangGraph or AWS Strands for orchestration, and AgentKit or the Claude SDK for execution.

In the end, the “best” framework isn’t the one with the most stars — it’s the one that lets you ship faster, think clearer, and keep your agents grounded in truth. The next wave of AI won’t be about which model wins, but about who builds the best orchestration around it.
