AI Agent, observability, tracing, monitoring, agent, OpenTelemetry
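Prompt: Observability Design for Multi-Step AI Agent Task Pipelines
Design a comprehensive observability solution for the multi-step task execution pipeline of an AI agent, covering traces, spans, log correlation, and performance metric collection.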
4/26/2026
You are an expert in AI agent observability and distributed tracing.
Design a comprehensive observability strategy for a multi-step AI agent task execution pipeline that meets the following requirements:
Agent Architecture
- Agent framework: [e.g., LangChain / CrewAI / OpenAI Agents SDK]
- Number of agents: [e.g., 3-5 collaborating agents]
- Task types: [e.g., research, code generation, review]
- LLM providers: [e.g., OpenAI, Anthropic, local models]
Observability Requirements
- Distributed Tracing: Design trace/span hierarchy for multi-agent task flows
- Token Tracking: Per-agent, per-step token consumption with cost attribution
- Latency Profiling: Identify bottlenecks across LLM calls, tool invocations, and inter-agent communication
- Error Correlation: Link failures across agent boundaries with root cause context
- Quality Metrics: Track output quality scores, hallucination detection rates, and task success rates
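To make the first two requirements concrete, here is a minimal sketch of a trace/span hierarchy with per-agent token cost attribution. It uses only the Python standard library and is not the OpenTelemetry SDK; the `Tracer` and `Span` classes, the attribute names (`agent`, `model`, `input_tokens`, `output_tokens`), the model name `example-model`, and the price table are all hypothetical and chosen for illustration only.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical price per 1K tokens in USD; real provider pricing differs.
PRICE_PER_1K = {"example-model": 0.002}


@dataclass
class Span:
    trace_id: str              # shared by every span in one task execution
    span_id: str               # unique per span
    parent_id: Optional[str]   # None for the root span
    name: str
    attributes: Dict[str, object] = field(default_factory=dict)
    start: float = 0.0
    end: float = 0.0

    @property
    def duration_ms(self) -> float:
        return (self.end - self.start) * 1000.0


class Tracer:
    """Toy in-memory tracer: nested `with` blocks become parent/child spans."""

    def __init__(self) -> None:
        self.finished: List[Span] = []
        self._stack: List[Span] = []

    @contextmanager
    def span(self, name: str, **attributes):
        parent = self._stack[-1] if self._stack else None
        s = Span(
            trace_id=parent.trace_id if parent else uuid.uuid4().hex,
            span_id=uuid.uuid4().hex[:16],
            parent_id=parent.span_id if parent else None,
            name=name,
            attributes=dict(attributes),
            start=time.perf_counter(),
        )
        self._stack.append(s)
        try:
            yield s
        except Exception as exc:
            # Record the failure on the span so it can be correlated later.
            s.attributes["error"] = repr(exc)
            raise
        finally:
            s.end = time.perf_counter()
            self._stack.pop()
            self.finished.append(s)


def cost_by_agent(spans: List[Span]) -> Dict[str, float]:
    """Sum estimated cost per agent over spans that carry a model and token counts."""
    totals: Dict[str, float] = {}
    for s in spans:
        agent = s.attributes.get("agent")
        model = s.attributes.get("model")
        if agent is None or model is None:
            continue  # not an LLM-call span
        tokens = s.attributes.get("input_tokens", 0) + s.attributes.get("output_tokens", 0)
        totals[agent] = totals.get(agent, 0.0) + tokens / 1000.0 * PRICE_PER_1K[model]
    return totals


# Usage: one task, one agent step, one LLM call nested inside it.
tracer = Tracer()
with tracer.span("task") as root:
    with tracer.span("agent.research", agent="researcher") as agent_span:
        with tracer.span("llm.call", agent="researcher", model="example-model") as llm:
            llm.attributes.update(input_tokens=1000, output_tokens=500)

costs = cost_by_agent(tracer.finished)
```

In a real deployment the same shape would likely map onto OpenTelemetry spans and span attributes, with cost computed downstream from exported token counts rather than in process.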
Deliverables
- OpenTelemetry-compatible instrumentation plan
- Grafana dashboard JSON template backed by Prometheus metrics
- Alert rules for anomaly detection (latency spikes, error rate, cost overrun)
- Structured logging schema with correlation IDs
- Sample code for instrumenting the agent framework
Provide production-ready configurations with clear comments explaining each metric and threshold.
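As an illustration of the "structured logging schema with correlation IDs" deliverable, the sketch below emits JSON log lines that carry the current trace and span IDs. It uses only the Python standard library; the field names (`trace_id`, `span_id`, `agent`, `step`), the logger name `agent.pipeline`, and the sample ID values are assumptions for this example, not a prescribed schema.

```python
import contextvars
import io
import json
import logging

# Context variables hold the active correlation IDs for the current task.
trace_id_var = contextvars.ContextVar("trace_id", default=None)
span_id_var = contextvars.ContextVar("span_id", default=None)


class CorrelationFilter(logging.Filter):
    """Copy the active trace/span IDs onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_var.get()
        record.span_id = span_id_var.get()
        return True


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
            # Optional fields supplied per call via `extra=`.
            "agent": getattr(record, "agent", None),
            "step": getattr(record, "step", None),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("agent.pipeline")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(CorrelationFilter())
    logger.addHandler(handler)
    return logger


# Usage: set the IDs once per step, then log normally.
buffer = io.StringIO()
log = build_logger(buffer)
trace_id_var.set("4bf92f3577b34da6a3ce929d0e0e4736")  # sample value
span_id_var.set("00f067aa0ba902b7")                   # sample value
log.info("tool call finished", extra={"agent": "researcher", "step": "web_search"})
record = json.loads(buffer.getvalue())
```

Because every line carries `trace_id` and `span_id`, a log backend can join log entries to the matching span without parsing the message text.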