This chapter focuses on practical applications of AI agents. You will learn essential concepts and techniques for turning the building blocks from earlier chapters into working systems.
Practical Agent Applications
In this chapter, we will integrate the technologies learned in previous chapters to build agent systems that can be utilized in actual business scenarios.
Key Use Cases
| Use Case | Key Features | Business Value |
|---|---|---|
| Customer Service | FAQ responses, inquiry classification, escalation | 24/7 availability, cost reduction, satisfaction improvement |
| Code Generation | Requirements analysis, code generation, testing, debugging | Development speed improvement, quality enhancement |
| Research Assistant | Information gathering, analysis, report generation | Research time reduction, improved insight quality |
| Task Automation | Workflow execution, data processing | Business efficiency, error reduction |
Customer Service Agent
System Design
The agent pairs a tool-calling LLM loop with three tools: FAQ search (search_faq), order status lookup (get_order_status), and escalation to a human operator (escalate_to_human). A system prompt fixes the response policy, tone, and constraints.
Implementation Example
from openai import OpenAI
from typing import Dict, Any, Optional
import json
import time  # used by escalate_to_human for ticket ID generation
class CustomerServiceAgent:
"""Customer service agent"""
def __init__(self, api_key: str):
self.client = OpenAI(api_key=api_key)
self.conversation_history = []
# Tool definitions
self.tools = [
{
"type": "function",
"function": {
"name": "search_faq",
"description": "Search for relevant information from FAQ database",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "Search query"
}
},
"required": ["query"]
}
}
},
{
"type": "function",
"function": {
"name": "get_order_status",
"description": "Check order status",
"parameters": {
"type": "object",
"properties": {
"order_id": {
"type": "string",
"description": "Order number"
}
},
"required": ["order_id"]
}
}
},
{
"type": "function",
"function": {
"name": "escalate_to_human",
"description": "Escalate to human operator",
"parameters": {
"type": "object",
"properties": {
"reason": {
"type": "string",
"description": "Reason for escalation"
},
"priority": {
"type": "string",
"enum": ["low", "medium", "high"],
"description": "Priority level"
}
},
"required": ["reason"]
}
}
}
]
self.system_prompt = """You are a helpful and competent customer service agent.
Response Policy:
1. Accurately understand the user's problem
2. Gather information using appropriate tools
3. Provide clear and helpful responses
4. Escalate complex problems or important matters to humans
Tone:
- Friendly and polite
- Show empathy
- Professional yet understandable
Constraints:
- Do not answer with speculation
- Handle personal information carefully
- Escalate when uncertain"""
def handle_inquiry(self, user_message: str) -> str:
"""Handle inquiry"""
self.conversation_history.append({
"role": "user",
"content": user_message
})
max_iterations = 5
for _ in range(max_iterations):
# Query LLM
response = self.client.chat.completions.create(
model="gpt-4",
messages=[
{"role": "system", "content": self.system_prompt}
] + self.conversation_history,
tools=self.tools,
tool_choice="auto",
temperature=0.7
)
message = response.choices[0].message
if not message.tool_calls:
# Final response
self.conversation_history.append({
"role": "assistant",
"content": message.content
})
return message.content
# Process tool calls
self.conversation_history.append(message)
for tool_call in message.tool_calls:
function_name = tool_call.function.name
function_args = json.loads(tool_call.function.arguments)
# Execute tool
if function_name == "search_faq":
result = self.search_faq(**function_args)
elif function_name == "get_order_status":
result = self.get_order_status(**function_args)
elif function_name == "escalate_to_human":
result = self.escalate_to_human(**function_args)
else:
result = {"error": "Unknown tool"}
# Add result
self.conversation_history.append({
"role": "tool",
"tool_call_id": tool_call.id,
"name": function_name,
"content": json.dumps(result, ensure_ascii=False)
})
return "We apologize for the delay. Let me transfer you to a representative."
def search_faq(self, query: str) -> Dict[str, Any]:
"""Search FAQ (mock)"""
faq_database = {
"return": {
"answer": "Returns are possible within 30 days of product receipt. Limited to unopened and unused items.",
"related_links": ["Return Policy", "Return Form"]
},
"shipping": {
"answer": "Typically delivered within 3-5 business days from order. Express shipping is also available for urgent needs.",
"related_links": ["Shipping Options", "Shipping Fees"]
},
"payment": {
"answer": "We accept credit cards, bank transfers, and convenience store payments.",
"related_links": ["Payment Methods", "Payment Security"]
}
}
# Simple keyword matching
for key, value in faq_database.items():
if key in query.lower():
return {
"found": True,
"answer": value["answer"],
"related_links": value["related_links"]
}
return {
"found": False,
"message": "No matching FAQ found"
}
def get_order_status(self, order_id: str) -> Dict[str, Any]:
"""Get order status (mock)"""
# Would actually retrieve from database
mock_orders = {
"ORD-12345": {
"status": "In transit",
"tracking_number": "TRK-98765",
"estimated_delivery": "October 27, 2025"
},
"ORD-67890": {
"status": "Preparing",
"estimated_shipping": "October 26, 2025"
}
}
if order_id in mock_orders:
return {
"found": True,
"order_id": order_id,
**mock_orders[order_id]
}
else:
return {
"found": False,
"message": f"Order number {order_id} not found"
}
def escalate_to_human(self, reason: str, priority: str = "medium") -> Dict[str, Any]:
"""Escalate to human (mock)"""
# Would actually send to ticket system or CRM
ticket_id = f"TICKET-{int(time.time())}"
return {
"escalated": True,
"ticket_id": ticket_id,
"priority": priority,
"estimated_response": "Within 30 minutes"
}
# Usage example
agent = CustomerServiceAgent(api_key="your-api-key")
# Example 1: FAQ question
response1 = agent.handle_inquiry("Please tell me about the return policy")
print(f"Agent: {response1}\n")
# Example 2: Order status check
response2 = agent.handle_inquiry("Please tell me the shipping status of order number ORD-12345")
print(f"Agent: {response2}\n")
# Example 3: Complex issue (escalation)
response3 = agent.handle_inquiry("The product I ordered was damaged. I want a refund")
print(f"Agent: {response3}")
Code Generation Agent
Multi-Stage Code Generation
from openai import OpenAI
from typing import Dict, Any, List
import json  # needed for json.loads/json.dumps below
class CodeGenerationAgent:
"""Code generation agent system"""
def __init__(self, api_key: str):
self.client = OpenAI(api_key=api_key)
def generate_code(self, requirement: str) -> Dict[str, Any]:
"""Generate code from requirements"""
# Step 1: Requirements analysis
analysis = self.analyze_requirements(requirement)
# Step 2: Code generation
code = self.generate_implementation(analysis)
# Step 3: Test generation
tests = self.generate_tests(code, analysis)
# Step 4: Review
review = self.review_code(code, tests)
return {
"analysis": analysis,
"code": code,
"tests": tests,
"review": review
}
def analyze_requirements(self, requirement: str) -> Dict[str, Any]:
"""Analyze requirements"""
prompt = f"""Analyze the following requirements and output in JSON format.
Requirements: {requirement}
Output format:
{{
"functionality": "Description of main functionality",
"inputs": ["input1", "input2"],
"outputs": ["output1", "output2"],
"edge_cases": ["edge case1", "edge case2"],
"suggested_approach": "Implementation approach"
}}"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
temperature=0
)
return json.loads(response.choices[0].message.content)
def generate_implementation(self, analysis: Dict[str, Any]) -> str:
"""Generate implementation code"""
prompt = f"""Generate Python code based on the following analysis results.
Analysis results:
{json.dumps(analysis, indent=2, ensure_ascii=False)}
Requirements:
- Use type hints
- Include docstrings
- Implement error handling
- Follow PEP 8
Output code only:"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
temperature=0.3
)
return response.choices[0].message.content
def generate_tests(self, code: str, analysis: Dict[str, Any]) -> str:
"""Generate test code"""
prompt = f"""Generate pytest test cases for the following code.
Code:
{code}
Analysis results:
{json.dumps(analysis, indent=2, ensure_ascii=False)}
Requirements:
- Tests for normal cases
- Tests for edge cases
- Tests for error cases
- Aim for 80%+ test coverage
Output pytest code only:"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
temperature=0.3
)
return response.choices[0].message.content
def review_code(self, code: str, tests: str) -> Dict[str, Any]:
"""Review code"""
prompt = f"""Review the following code and tests, and evaluate in JSON format.
Code:
{code}
Tests:
{tests}
Evaluation criteria:
- Readability (1-10)
- Maintainability (1-10)
- Performance (1-10)
- Security (1-10)
- Test coverage (1-10)
Output format:
{{
"scores": {{"readability": 8, "maintainability": 7, ...}},
"strengths": ["strength1", "strength2"],
"improvements": ["improvement1", "improvement2"],
"overall_assessment": "Overall assessment"
}}"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
temperature=0.5
)
return json.loads(response.choices[0].message.content)
# Usage example
agent = CodeGenerationAgent(api_key="your-api-key")
requirement = """
Create an email address validation function.
Must meet the following requirements:
- RFC 5322 compliant
- Support common formats
- Optional domain existence check
"""
result = agent.generate_code(requirement)
print("=== Requirements Analysis ===")
print(json.dumps(result["analysis"], indent=2, ensure_ascii=False))
print("\n=== Generated Code ===")
print(result["code"])
print("\n=== Test Code ===")
print(result["tests"])
print("\n=== Code Review ===")
print(json.dumps(result["review"], indent=2, ensure_ascii=False))
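One practical caveat: analyze_requirements and review_code call json.loads directly on the model output, which raises JSONDecodeError whenever the model wraps its answer in a Markdown code fence or adds commentary. A small defensive parser, sketched below (safe_json_loads and its fence-stripping heuristics are illustrative additions, not part of the agent above), makes those steps more robust:

```python
import json
import re
from typing import Any

def safe_json_loads(text: str) -> Any:
    """Parse LLM output as JSON, tolerating Markdown fences and extra prose."""
    try:
        return json.loads(text)  # happy path: bare JSON
    except json.JSONDecodeError:
        pass
    # Strip a ```json ... ``` fence if the model added one
    fenced = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if fenced:
        try:
            return json.loads(fenced.group(1))
        except json.JSONDecodeError:
            pass
    # Last resort: the outermost {...} or [...] span in the text
    span = re.search(r"(\{.*\}|\[.*\])", text, re.DOTALL)
    if span:
        return json.loads(span.group(1))
    raise ValueError("No parseable JSON found in model output")
```

Each json.loads(response.choices[0].message.content) call above could be routed through this helper; on models that support it, requesting a JSON response format at the API level constrains the output shape and makes such parsing largely unnecessary.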
Research Agent
Automatic Research Report Generation
# Requirements:
# - Python 3.9+
# - requests>=2.31.0
from openai import OpenAI
from typing import List, Dict, Any, Optional
import json
import requests  # used when wiring in a real search API (see sketch below)
class ResearchAgent:
"""Research agent"""
    def __init__(self, api_key: str, serp_api_key: Optional[str] = None):
self.client = OpenAI(api_key=api_key)
self.serp_api_key = serp_api_key
def research(self, topic: str, depth: str = "medium") -> Dict[str, Any]:
"""Research topic"""
# Step 1: Generate research queries
queries = self.generate_search_queries(topic, depth)
# Step 2: Gather information
search_results = self.gather_information(queries)
# Step 3: Analyze information
analysis = self.analyze_information(topic, search_results)
# Step 4: Generate report
report = self.generate_report(topic, analysis)
return {
"topic": topic,
"queries": queries,
"sources": len(search_results),
"analysis": analysis,
"report": report
}
def generate_search_queries(self, topic: str, depth: str) -> List[str]:
"""Generate search queries"""
num_queries = {"shallow": 3, "medium": 5, "deep": 8}[depth]
prompt = f"""Generate {num_queries} search queries for researching the topic "{topic}".
Requirements:
- Queries from different perspectives
- Both latest information and background information
- Specific and search-efficient queries
Output in JSON array format: ["query1", "query2", ...]"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
temperature=0.7
)
return json.loads(response.choices[0].message.content)
def gather_information(self, queries: List[str]) -> List[Dict[str, Any]]:
"""Gather information"""
results = []
for query in queries:
# Would actually use SerpAPI or similar
# Using mock data here
results.append({
"query": query,
"title": f"Information about {query}",
"snippet": f"Detailed information about {query}...",
"source": "example.com",
"relevance_score": 0.85
})
return results
def analyze_information(self, topic: str, results: List[Dict[str, Any]]) -> Dict[str, Any]:
"""Analyze information"""
# Combine results
        combined_info = "\n\n".join([
            f"[{r['query']}]\n{r['snippet']}"
            for r in results
        ])
prompt = f"""Analyze the following information and summarize the topic "{topic}".
Collected information:
{combined_info}
Output in the following JSON format:
{{
"key_findings": ["finding1", "finding2", "finding3"],
"trends": ["trend1", "trend2"],
"challenges": ["challenge1", "challenge2"],
"opportunities": ["opportunity1", "opportunity2"],
"summary": "Overall summary"
}}"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
temperature=0.5
)
return json.loads(response.choices[0].message.content)
def generate_report(self, topic: str, analysis: Dict[str, Any]) -> str:
"""Generate report"""
prompt = f"""Generate a research report on the topic "{topic}" based on the following analysis results.
Analysis results:
{json.dumps(analysis, indent=2, ensure_ascii=False)}
Report requirements:
- Executive summary
- Key findings
- Trend analysis
- Challenges and opportunities
- Conclusions and recommendations
- Approximately 1500-2000 characters
Output in professional business report format:"""
response = self.client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": prompt}],
temperature=0.7
)
return response.choices[0].message.content
# Usage example
agent = ResearchAgent(api_key="your-api-key")
result = agent.research("AI Agent Technology Trends in 2024", depth="medium")
print("=== Research Report ===")
print(result["report"])
print(f"\nNumber of sources: {result['sources']}")
print(f"Queries used: {', '.join(result['queries'])}")
Task Automation Agent
Workflow Automation
from typing import List, Dict, Any, Callable, Optional
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
class TaskStatus(Enum):
PENDING = "pending"
RUNNING = "running"
COMPLETED = "completed"
FAILED = "failed"
@dataclass
class Task:
"""Task definition"""
id: str
name: str
action: Callable
dependencies: List[str]
status: TaskStatus = TaskStatus.PENDING
result: Any = None
    error: Optional[str] = None
class WorkflowAutomationAgent:
"""Workflow automation agent"""
def __init__(self):
self.tasks: Dict[str, Task] = {}
self.execution_log: List[Dict[str, Any]] = []
def add_task(self, task: Task):
"""Add task"""
self.tasks[task.id] = task
def execute_workflow(self) -> Dict[str, Any]:
"""Execute workflow"""
# Determine execution order via topological sort
execution_order = self._topological_sort()
for task_id in execution_order:
task = self.tasks[task_id]
# Check dependent tasks are completed
if not self._dependencies_satisfied(task):
task.status = TaskStatus.FAILED
task.error = "Dependent tasks not completed"
continue
# Execute task
try:
task.status = TaskStatus.RUNNING
self._log_event(f"Task started: {task.name}")
# Get results of dependent tasks
dep_results = {
dep_id: self.tasks[dep_id].result
for dep_id in task.dependencies
}
# Execute task
task.result = task.action(dep_results)
task.status = TaskStatus.COMPLETED
self._log_event(f"Task completed: {task.name}")
except Exception as e:
task.status = TaskStatus.FAILED
task.error = str(e)
self._log_event(f"Task failed: {task.name} - {str(e)}")
return self._generate_summary()
def _dependencies_satisfied(self, task: Task) -> bool:
"""Check if all dependent tasks are completed"""
for dep_id in task.dependencies:
if dep_id not in self.tasks:
return False
if self.tasks[dep_id].status != TaskStatus.COMPLETED:
return False
return True
def _topological_sort(self) -> List[str]:
"""Determine task execution order (topological sort)"""
        # Simple DFS-based implementation: dependencies are visited first.
        # Assumes the dependency graph is acyclic (no cycle detection).
sorted_tasks = []
visited = set()
def visit(task_id):
if task_id in visited:
return
visited.add(task_id)
task = self.tasks[task_id]
for dep_id in task.dependencies:
if dep_id in self.tasks:
visit(dep_id)
sorted_tasks.append(task_id)
for task_id in self.tasks:
visit(task_id)
return sorted_tasks
    def _log_event(self, message: str):
        """Record event to log"""
        self.execution_log.append({
            "timestamp": datetime.now().isoformat(),
            "message": message
        })
def _generate_summary(self) -> Dict[str, Any]:
"""Generate execution results summary"""
total = len(self.tasks)
completed = sum(1 for t in self.tasks.values() if t.status == TaskStatus.COMPLETED)
failed = sum(1 for t in self.tasks.values() if t.status == TaskStatus.FAILED)
return {
"total_tasks": total,
"completed": completed,
"failed": failed,
"success_rate": completed / total if total > 0 else 0,
"execution_log": self.execution_log
}
# Usage example: Data processing workflow
def fetch_data(deps):
"""Fetch data"""
print("Fetching data...")
return {"data": [1, 2, 3, 4, 5]}
def clean_data(deps):
"""Clean data"""
data = deps["fetch_data"]["data"]
print(f"Cleaning data: {data}")
cleaned = [x * 2 for x in data]
return {"cleaned_data": cleaned}
def analyze_data(deps):
"""Analyze data"""
cleaned = deps["clean_data"]["cleaned_data"]
print(f"Analyzing data: {cleaned}")
avg = sum(cleaned) / len(cleaned)
return {"average": avg, "count": len(cleaned)}
def generate_report(deps):
"""Generate report"""
analysis = deps["analyze_data"]
print(f"Generating report...")
report = f"Average: {analysis['average']}, Data count: {analysis['count']}"
return {"report": report}
# Build workflow
workflow = WorkflowAutomationAgent()
workflow.add_task(Task(
id="fetch_data",
name="Fetch Data",
action=fetch_data,
dependencies=[]
))
workflow.add_task(Task(
id="clean_data",
name="Clean Data",
action=clean_data,
dependencies=["fetch_data"]
))
workflow.add_task(Task(
id="analyze_data",
name="Analyze Data",
action=analyze_data,
dependencies=["clean_data"]
))
workflow.add_task(Task(
id="generate_report",
name="Generate Report",
action=generate_report,
dependencies=["analyze_data"]
))
# Execute
summary = workflow.execute_workflow()
print("\n=== Execution Summary ===")
print(f"Total tasks: {summary['total_tasks']}")
print(f"Completed: {summary['completed']}")
print(f"Failed: {summary['failed']}")
print(f"Success rate: {summary['success_rate']*100:.1f}%")
# Output report
report_task = workflow.tasks["generate_report"]
if report_task.status == TaskStatus.COMPLETED:
print(f"\nFinal report: {report_task.result['report']}")
Evaluation and Monitoring
Measuring Agent Performance
from typing import Dict, Any, List, Optional
from dataclasses import dataclass
from datetime import datetime
import json  # used to print the evaluation report below
@dataclass
class AgentMetrics:
"""Agent metrics"""
task_id: str
success: bool
latency: float # seconds
token_usage: int
cost: float # USD
user_satisfaction: float # 1-5
    error_type: Optional[str] = None
class AgentEvaluator:
"""Agent evaluation system"""
def __init__(self):
self.metrics: List[AgentMetrics] = []
def record_execution(self, metrics: AgentMetrics):
"""Record execution metrics"""
self.metrics.append(metrics)
def generate_report(self) -> Dict[str, Any]:
"""Generate evaluation report"""
if not self.metrics:
return {"error": "No metrics data"}
total = len(self.metrics)
successful = sum(1 for m in self.metrics if m.success)
return {
"overview": {
"total_executions": total,
"success_rate": successful / total,
"avg_latency": sum(m.latency for m in self.metrics) / total,
"total_cost": sum(m.cost for m in self.metrics),
"avg_satisfaction": sum(m.user_satisfaction for m in self.metrics) / total
},
"performance": self._analyze_performance(),
"errors": self._analyze_errors(),
"recommendations": self._generate_recommendations()
}
    def _analyze_performance(self) -> Dict[str, Any]:
        """Analyze performance"""
        latencies = sorted(m.latency for m in self.metrics)  # sort once for percentiles
        return {
            "p50_latency": latencies[len(latencies) // 2],
            "p95_latency": latencies[int(len(latencies) * 0.95)],
            "max_latency": latencies[-1]
        }
def _analyze_errors(self) -> Dict[str, Any]:
"""Analyze errors"""
errors = [m for m in self.metrics if not m.success]
error_types = {}
for error in errors:
error_type = error.error_type or "unknown"
error_types[error_type] = error_types.get(error_type, 0) + 1
return {
"total_errors": len(errors),
"error_distribution": error_types
}
def _generate_recommendations(self) -> List[str]:
"""Generate improvement recommendations"""
recommendations = []
# Check success rate
success_rate = sum(1 for m in self.metrics if m.success) / len(self.metrics)
if success_rate < 0.95:
recommendations.append("Success rate below 95%. Strengthen error handling")
# Check latency
avg_latency = sum(m.latency for m in self.metrics) / len(self.metrics)
if avg_latency > 5.0:
recommendations.append("Average latency exceeds 5 seconds. Consider optimization")
# Check cost
total_cost = sum(m.cost for m in self.metrics)
if total_cost > 100:
recommendations.append(f"Total cost is ${total_cost:.2f}. Consider cost optimization")
return recommendations
# Usage example
evaluator = AgentEvaluator()
# Record metrics
evaluator.record_execution(AgentMetrics(
task_id="task1",
success=True,
latency=2.3,
token_usage=500,
cost=0.01,
user_satisfaction=4.5
))
evaluator.record_execution(AgentMetrics(
task_id="task2",
success=False,
latency=10.5,
token_usage=1000,
cost=0.02,
user_satisfaction=2.0,
error_type="timeout"
))
# Generate report
report = evaluator.generate_report()
print(json.dumps(report, indent=2, ensure_ascii=False))
Production Considerations
Scalability and Cost Optimization
| Consideration | Challenge | Solution |
|---|---|---|
| Scalability | Increased concurrent requests | Asynchronous processing, queuing, horizontal scaling |
| Cost | Growing API costs | Caching, model selection, prompt optimization |
| Reliability | API failures, timeouts | Retry, fallback, circuit breaker |
| Security | Data leakage, unauthorized use | Authentication, encryption, rate limiting, audit logs |
| Monitoring | Early problem detection | Metrics collection, alerts, dashboards |
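For the reliability row in particular, a retry wrapper with exponential backoff and a fallback is often the first line of defense. The sketch below is a generic pattern rather than any specific SDK's API; the exception handling, retry count, and backoff parameters are illustrative assumptions to adapt:

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    fallback: Optional[Callable[[], T]] = None,
) -> T:
    """Call fn, retrying on failure with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                break  # retries exhausted; fall through to the fallback
            # Backoff: 1s, 2s, 4s, ... plus jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
    if fallback is not None:
        return fallback()  # e.g., a cheaper model or a canned response
    raise RuntimeError("All retries exhausted and no fallback provided")
```

Wrapping each LLM call this way covers transient API failures; a full circuit breaker would additionally track the recent failure rate and stop calling the API entirely while it stays high.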
Best Practices
- ✅ Caching: Cache results of identical queries to reduce API calls (see the sketch after this list)
- ✅ Asynchronous Processing: Implement parallel processing with asyncio
- ✅ Rate Limiting: Respect API usage limits
- ✅ Error Handling: Retry and fallback strategies
- ✅ Logging and Monitoring: Detailed log recording and performance monitoring
- ✅ Cost Tracking: Visualize token usage and costs
- ✅ Security: Authentication, encryption, input validation
- ✅ Testing: Unit tests, integration tests, E2E tests
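As a minimal sketch of the caching practice above (the cached_completion helper and module-level dictionary are illustrative; production systems would typically use Redis or an LRU/TTL cache), identical requests can be memoized before they reach the API:

```python
import hashlib
import json
from typing import Any, Dict

_response_cache: Dict[str, str] = {}

def cached_completion(client: Any, model: str, messages: list, **kwargs) -> str:
    """Return a cached completion for an identical request, else call the API."""
    # The cache key must cover every parameter that affects the output
    key = hashlib.sha256(
        json.dumps({"model": model, "messages": messages, **kwargs},
                   sort_keys=True).encode()
    ).hexdigest()
    if key not in _response_cache:
        response = client.chat.completions.create(
            model=model, messages=messages, **kwargs
        )
        _response_cache[key] = response.choices[0].message.content
    return _response_cache[key]
```

Caching like this is only safe for deterministic settings (e.g., temperature 0) and queries free of per-user data; otherwise, include a TTL and a user scope in the cache key.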
Summary
What We Learned in This Chapter
- ✅ Customer Service: FAQ responses, inquiry classification, escalation
- ✅ Code Generation: Requirements analysis, implementation generation, testing, review
- ✅ Research Agent: Information gathering, analysis, report generation
- ✅ Task Automation: Workflow execution, dependency management
- ✅ Evaluation and Monitoring: Metrics collection, performance analysis
- ✅ Production: Scalability, cost, reliability
Series Summary
AI agents combine the reasoning capabilities of LLMs with tool usage to autonomously solve complex tasks. Effective agent systems are realized through clear role definition, robust error handling, and appropriate evaluation and monitoring.
Next Steps
Having completed this series, you can now design and implement practical AI agent systems. To deepen your learning further:
- 📚 Explore AutoGPT/BabyAGI to understand autonomous agents
- 📚 Build advanced agents with LangChain/LlamaIndex
- 📚 Optimize agents with reinforcement learning
- 📚 Apply learned techniques in real projects
Disclaimer
- This content is provided solely for educational, research, and informational purposes and does not constitute professional advice (legal, accounting, technical warranty, etc.).
- This content and accompanying code examples are provided "AS IS" without any warranty, express or implied, including but not limited to merchantability, fitness for a particular purpose, non-infringement, accuracy, completeness, operation, or safety.
- The author and Tohoku University assume no responsibility for the content, availability, or safety of external links, third-party data, tools, libraries, etc.
- To the maximum extent permitted by applicable law, the author and Tohoku University shall not be liable for any direct, indirect, incidental, special, consequential, or punitive damages arising from the use, execution, or interpretation of this content.
- The content may be changed, updated, or discontinued without notice.
- The copyright and license of this content are subject to the stated conditions (e.g., CC BY 4.0). Such licenses typically include no-warranty clauses.