Framework Deep Dive: LangChain - Building Blocks for LLM Applications
An in-depth exploration of LangChain's modular architecture, core components, agent patterns, and production best practices for building robust LLM applications
LangChain has become synonymous with LLM application development. Since its launch in late 2022, it has grown from a simple library for chaining LLM calls into a comprehensive ecosystem powering thousands of production AI applications. (For context on how LangChain compares to other frameworks, see our Complete Guide to AI Agent Frameworks.) This deep dive explores LangChain’s architecture, examines its core components, and provides practical patterns for building robust AI agents.
Understanding LangChain requires looking at its modular architecture. What started as a single library has evolved into a family of interconnected packages:
langchain-core: The foundation containing base abstractions—messages, prompts, output parsers, and runnables. This package has minimal dependencies and defines the interfaces that other packages implement.
langchain: High-level chains, agents, and orchestration logic. This is where you find AgentExecutor, the various chain types, and retrieval patterns.
langchain-community: Third-party integrations contributed by the community. Document loaders, vector stores, and tool implementations live here.
Partner packages: First-party integrations like langchain-openai, langchain-anthropic, and langchain-google-vertexai provide optimized, well-maintained connections to major providers.
LangGraph: A separate but complementary library for graph-based agent orchestration, covered in its own deep dive.
LangSmith: The observability platform for tracing, debugging, and evaluating LangChain applications.
This modular approach allows you to install only what you need, reducing dependency bloat and improving startup times.
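As a quick illustration of the split, here is a sketch mapping common imports to the packages that provide them (the specific classes chosen are just examples; the pip package names are the ones listed above):
# Each import below comes from a separately installable package:
from langchain_core.prompts import ChatPromptTemplate        # pip package: langchain-core
from langchain_openai import ChatOpenAI                      # pip package: langchain-openai
from langchain_community.document_loaders import TextLoader  # pip package: langchain-community
from langchain.agents import AgentExecutor                   # pip package: langchain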
LangChain’s message system provides a unified interface across providers. Whether you’re using OpenAI, Anthropic, or a local model, you work with the same message types:
from langchain_core.messages import (
SystemMessage,
HumanMessage,
AIMessage,
ToolMessage
)
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
# Messages work identically across providers
messages = [
SystemMessage(content="You are a helpful coding assistant."),
HumanMessage(content="Explain Python decorators"),
AIMessage(content="Decorators are functions that modify..."),
HumanMessage(content="Show me an example")
]
# Same messages, different providers
openai_response = ChatOpenAI(model="gpt-4o").invoke(messages)
anthropic_response = ChatAnthropic(model="claude-3-5-sonnet-20241022").invoke(messages)
This abstraction makes it trivial to experiment with different models or implement fallback strategies when one provider experiences issues.
Prompt templates separate your prompt logic from variable content, enabling reusable and maintainable prompts:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
# Basic template with variables
analysis_prompt = ChatPromptTemplate.from_messages([
("system", "You are an expert {domain} analyst. Respond in {language}."),
("human", "Analyze the following: {content}")
])
# Invoke with variables
formatted = analysis_prompt.invoke({
"domain": "security",
"language": "English",
"content": "This function uses eval() on user input..."
})
# Template with message history placeholder
conversational_prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant."),
MessagesPlaceholder(variable_name="history"),
("human", "{input}")
])
The MessagesPlaceholder is particularly powerful for conversational agents, allowing you to inject conversation history at exactly the right position in your prompt.
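Here is a minimal sketch of how history gets injected; the example conversation is purely illustrative:
# Prior turns are injected at the placeholder position
history = [
    HumanMessage(content="My order number is 1042."),
    AIMessage(content="Thanks, I have order 1042 on file.")
]
formatted_conv = conversational_prompt.invoke({
    "history": history,
    "input": "When will it arrive?"
})
# The resulting prompt: system message, both history messages, then the new question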
LangChain Expression Language (LCEL) provides a declarative way to compose components into processing pipelines:
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
# Composing a simple chain
chain = analysis_prompt | ChatOpenAI(model="gpt-4o") | StrOutputParser()
# Invoke the chain
result = chain.invoke({"domain": "data", "language": "English", "content": "SELECT * FROM users"})
# Parallel execution with RunnableParallel
from langchain_core.runnables import RunnableParallel
parallel_chain = RunnableParallel({
"summary": summary_chain,
"sentiment": sentiment_chain,
"keywords": keyword_chain
})
# All three chains run concurrently
results = parallel_chain.invoke({"text": long_document})
LCEL chains are inherently streamable, batchable, and async-compatible without additional code. Every runnable automatically supports .stream(), .batch(), and .ainvoke().
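For instance, the chain defined above can be batched or awaited without any changes; a short sketch (inputs are illustrative):
# Batch: process several inputs concurrently
batch_results = chain.batch([
    {"domain": "data", "language": "English", "content": "SELECT * FROM users"},
    {"domain": "security", "language": "English", "content": "os.system(user_input)"}
])
# Async: the same chain inside an asyncio application
import asyncio

async def main():
    return await chain.ainvoke({
        "domain": "data", "language": "English", "content": "DROP TABLE logs"
    })

async_result = asyncio.run(main())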
Tools are the interface through which agents interact with the external world. LangChain provides several ways to create them:
from langchain_core.tools import tool, StructuredTool
from pydantic import BaseModel, Field
# Simple decorator approach
@tool
def search_database(query: str) -> str:
    """Search the product database for matching items."""
    # Your implementation here
    return f"Found 5 products matching '{query}'"
# Structured tool with explicit schema
class EmailInput(BaseModel):
    recipient: str = Field(description="Email address of recipient")
    subject: str = Field(description="Email subject line")
    body: str = Field(description="Email body content")

@tool(args_schema=EmailInput)
def send_email(recipient: str, subject: str, body: str) -> str:
    """Send an email to a specified recipient."""
    # Your implementation here
    return f"Email sent to {recipient}"
# Creating tools from existing functions
def calculate_shipping(weight: float, destination: str) -> float:
    """Calculate shipping cost based on weight and destination."""
    base_rate = 5.0
    per_kg = 2.5
    return base_rate + (weight * per_kg)
shipping_tool = StructuredTool.from_function(
func=calculate_shipping,
name="shipping_calculator",
description="Calculate shipping costs"
)
The docstring is crucial—it’s what the LLM reads to understand when and how to use each tool. Write clear, specific descriptions.
The AgentExecutor orchestrates the reasoning loop, managing tool calls and responses:
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
# Define your tools
tools = [search_database, send_email, shipping_tool]
# Create the prompt
prompt = ChatPromptTemplate.from_messages([
("system", """You are a customer service agent for an e-commerce platform.
You can search for products, send emails, and calculate shipping.
Always confirm before sending emails."""),
("human", "{input}"),
("placeholder", "{agent_scratchpad}")
])
# Initialize the LLM with tool support
llm = ChatOpenAI(model="gpt-4o", temperature=0)
# Create the agent
agent = create_tool_calling_agent(llm, tools, prompt)
# Wrap in executor for the reasoning loop
executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True, # Shows reasoning steps
max_iterations=10, # Prevent infinite loops
handle_parsing_errors=True # Graceful error recovery
)
# Run the agent
result = executor.invoke({
"input": "Find me wireless headphones under $100 and email the results to [email protected]"
})
For multi-turn conversations, you need to maintain history. LangChain provides several memory implementations:
from langchain.memory import ConversationBufferMemory, ConversationSummaryMemory
from langchain_openai import ChatOpenAI
# Simple buffer memory - stores all messages
buffer_memory = ConversationBufferMemory(
memory_key="history",
return_messages=True
)
# Summary memory - condenses old messages to save tokens
summary_memory = ConversationSummaryMemory(
llm=ChatOpenAI(model="gpt-4o-mini"),
memory_key="history",
return_messages=True
)
# Using memory with an agent
executor_with_memory = AgentExecutor(
agent=agent,
tools=tools,
memory=buffer_memory,
verbose=True
)
# First turn
executor_with_memory.invoke({"input": "My name is Sarah"})
# Second turn - agent remembers
executor_with_memory.invoke({"input": "What's my name?"})
# Output will correctly recall "Sarah"
For production applications, consider using persistent memory stores that survive application restarts.
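One way to do this is to wrap a chain in RunnableWithMessageHistory backed by a file-based history store. Below is a minimal sketch under those assumptions; the file naming and session handling are illustrative:
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_community.chat_message_histories import FileChatMessageHistory

def get_session_history(session_id: str):
    # Messages for each session are written to disk and survive restarts
    return FileChatMessageHistory(f"history_{session_id}.json")

persistent_chain = RunnableWithMessageHistory(
    conversational_prompt | ChatOpenAI(model="gpt-4o"),
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

persistent_chain.invoke(
    {"input": "My name is Sarah"},
    config={"configurable": {"session_id": "user-123"}}
)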
LangChain excels at RAG applications, connecting LLMs to external knowledge:
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
# Load documents
loader = DirectoryLoader("./docs", glob="**/*.md")
documents = loader.load()
# Split into chunks
splitter = RecursiveCharacterTextSplitter(
chunk_size=1000,
chunk_overlap=200
)
chunks = splitter.split_documents(documents)
# Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory="./chroma_db")
# Create retriever
retriever = vectorstore.as_retriever(
search_type="mmr", # Maximum marginal relevance for diversity
search_kwargs={"k": 5}
)
# Build RAG chain
prompt = ChatPromptTemplate.from_messages([
("system", "Answer based on the following context:\n\n{context}"),
("human", "{input}")
])
combine_docs_chain = create_stuff_documents_chain(
llm=ChatOpenAI(model="gpt-4o"),
prompt=prompt
)
rag_chain = create_retrieval_chain(retriever, combine_docs_chain)
# Query
result = rag_chain.invoke({"input": "How do I configure authentication?"})
print(result["answer"])
For comprehensive production deployment guidance, see our Building Production AI Agents guide. Below are LangChain-specific patterns.
Production applications need graceful failure handling:
from langchain_core.runnables import RunnableWithFallbacks
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
# Primary and fallback models
primary = ChatOpenAI(model="gpt-4o")
fallback = ChatAnthropic(model="claude-3-5-sonnet-20241022")
# Chain with fallback
robust_llm = primary.with_fallbacks([fallback])
# Custom retry logic
from tenacity import retry, stop_after_attempt, wait_exponential
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=4, max=10)
)
def invoke_with_retry(chain, input_data):
    return chain.invoke(input_data)
For better user experience, stream responses as they’re generated:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o", streaming=True)
# Stream tokens
for chunk in llm.stream("Explain quantum computing"):
    print(chunk.content, end="", flush=True)
# Stream with a chain
topic_prompt = ChatPromptTemplate.from_template("Write a short overview of {topic}.")
chain = topic_prompt | llm | StrOutputParser()
for chunk in chain.stream({"topic": "AI agents"}):
    print(chunk, end="", flush=True)
LangSmith provides essential visibility into your LangChain applications:
import os
# Enable tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-agent-project"
# All LangChain calls are now traced
result = chain.invoke({"input": "test query"})
# Custom metadata
from langchain_core.tracers import LangChainTracer
from langchain_core.callbacks import CallbackManager
tracer = LangChainTracer(project_name="production-agents")
callbacks = CallbackManager([tracer])
result = chain.invoke(
{"input": "test"},
config={"callbacks": callbacks, "tags": ["production", "v2"]}
)
LangSmith traces show every LLM call, tool invocation, and intermediate step, making debugging straightforward.
Force the LLM to return data in a specific format:
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
class ProductAnalysis(BaseModel):
    name: str = Field(description="Product name")
    category: str = Field(description="Product category")
    sentiment: str = Field(description="Overall sentiment: positive, negative, or neutral")
    key_points: list[str] = Field(description="Key points from the review")
llm = ChatOpenAI(model="gpt-4o")
structured_llm = llm.with_structured_output(ProductAnalysis)
result = structured_llm.invoke("Analyze this review: The new headphones are amazing...")
# result is a ProductAnalysis instance with typed fields
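Because the result is a regular Pydantic object, downstream code gets typed access to the parsed fields, for example:
print(result.sentiment)     # e.g. "positive"
print(result.key_points)    # list[str] of extracted points
print(result.model_dump())  # plain dict, handy for serialization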
Route inputs to specialized chains based on content:
from langchain_core.runnables import RunnableLambda
def route_query(input_dict):
    query = input_dict["query"].lower()
    if "price" in query or "cost" in query:
        return pricing_chain
    elif "technical" in query or "spec" in query:
        return technical_chain
    else:
        return general_chain
router = RunnableLambda(route_query)
full_chain = {"query": lambda x: x["query"]} | router
result = full_chain.invoke({"query": "What's the price of the Pro model?"})
LangChain shines when you need unified abstractions across model providers, a broad catalog of community and partner integrations, quick composition of prompts, models, and retrievers with LCEL, and built-in observability through LangSmith.
Consider alternatives when your agents require complex, stateful orchestration better expressed as an explicit graph (LangGraph, covered in its own deep dive) or when coordination among many conversing agents is the core of your design (frameworks like AutoGen, covered next in this series).
LangChain provides a comprehensive toolkit for building AI agents and LLM applications. Its modular architecture, extensive integrations, and production-ready features make it an excellent choice for both prototypes and production systems.
The key to success with LangChain is understanding its abstractions. Messages provide cross-provider compatibility, prompts separate logic from content, and LCEL enables powerful composition. Master these concepts, and you’ll build robust AI applications efficiently.
In the next installment of this series, we’ll explore Microsoft AutoGen and its unique approach to multi-agent collaboration through conversation-based coordination.
This post is part of our Framework Deep Dive series, exploring the architectures and patterns of major AI agent frameworks. Next up: AutoGen Deep Dive.