
AI Agent Error Handling and Graceful Degradation Designer

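Designs robust error-handling mechanisms for AI agents, including retry strategies, fallback chains, timeout management, and graceful degradation plans, so that the agent still delivers useful output across a wide range of failure scenarios.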

4/5/2026

You are an expert AI Agent reliability engineer. Your task is to design comprehensive error handling and graceful degradation strategies for AI agent systems.

Given the following agent system description: [Describe your agent architecture, tools, and workflows here]

Please provide:

  1. Error Taxonomy: Classify all possible failure modes (API timeouts, rate limits, malformed responses, tool failures, context overflow, etc.)

  2. Retry Strategy: Design retry policies for each error type:

    • Exponential backoff parameters
    • Max retry counts per error category
    • Circuit breaker thresholds

  3. Fallback Chains: For each critical capability, define:

    • Primary → Secondary → Tertiary provider/method
    • Quality vs latency tradeoffs at each level
    • When to use cached/stale results vs failing

  4. Graceful Degradation Plan:

    • Feature flags for progressive capability reduction
    • User communication templates for degraded states
    • Minimum viable response when all else fails

  5. Observability: Logging, alerting, and metrics recommendations

  6. Recovery Procedures: Automated and manual recovery playbooks

Output as a structured design document with code examples where applicable.
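
For orientation, here is a minimal Python sketch of the kind of code example a response to this prompt might contain. It combines capped exponential backoff with jitter, a per-call retry limit, and a primary-to-secondary fallback chain that ends in a minimum viable response. Every name in it (`TransientError`, `backoff_delay`, `call_with_retry`, `run_fallback_chain`) and every default value is a hypothetical chosen for illustration, not part of the prompt or of any particular library.

```python
import random
import time
from typing import Callable, Optional, Sequence, Tuple


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, rate limits)."""


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Upper bound of the wait for a 0-based attempt: base * 2^attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(
    fn: Callable[[], str],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fn, retrying only TransientError, at most max_retries extra times."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # retry budget exhausted, let the caller fall back
            # Full jitter: wait a random time between 0 and the backoff bound.
            sleep(random.uniform(0, backoff_delay(attempt)))
    raise RuntimeError("unreachable")


def run_fallback_chain(
    providers: Sequence[Callable[[], str]],
    minimum_viable_response: str,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, Optional[int]]:
    """Try providers in order; return (response, index of the provider used).

    An index of None means every provider failed and the canned
    minimum viable response was returned instead.
    """
    for index, provider in enumerate(providers):
        try:
            return call_with_retry(provider, sleep=sleep), index
        except Exception:
            continue  # any failure moves on to the next provider in the chain
    return minimum_viable_response, None
```

Returning the provider index alongside the response lets the caller tell the user when a degraded path was used. A circuit breaker, which the prompt also asks for, is left out of this sketch to keep it short.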