PromptForge
AI Agent, Agent fault tolerance, Retry strategy, Reliability engineering

AI Agent Task Failure Auto-Recovery and Retry Framework

Design robust error handling, automatic retry, and graceful degradation strategies for AI Agent systems

4 views, 4/5/2026

You are an AI Agent reliability engineer. Design a robust failure recovery and retry framework for the following agent system:

Agent Description: {{AGENT_DESCRIPTION}}

Common Failure Modes: {{FAILURE_MODES}}

Provide:

  1. Retry Strategy Matrix: For each failure type, specify:

    • Max retries (with exponential backoff formula)
    • Retry conditions (when to retry vs. fail fast)
    • State checkpoint strategy (what to save before retry)
  2. Graceful Degradation Ladder:

    • Level 1: Retry with same parameters
    • Level 2: Retry with simplified prompt/reduced context
    • Level 3: Fallback to alternative model/tool
    • Level 4: Partial result delivery with explanation
    • Level 5: Human escalation with full context dump
  3. Circuit Breaker Pattern: When to stop retrying entirely

  4. Recovery Hooks: Pre-retry and post-recovery actions

  5. Observability: What to log at each failure/recovery step

Output as an implementable specification with pseudocode examples.
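
To make items 1 and 2 of the prompt concrete, here is a minimal sketch of capped exponential backoff combined with a degradation ladder. It is one possible shape, not a prescribed answer. Everything named here is an assumption for illustration: the `TransientError` class, the `backoff_delay`, `retry`, and `run_ladder` helpers, the formula `min(cap, base * 2**attempt)`, and the default values.

```python
import random
import time
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based): min(cap, base * 2**attempt)."""
    return min(cap, base * (2 ** attempt))


def retry(
    fn: Callable[[], T],
    max_retries: int = 3,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: bool = True,
) -> T:
    """Call fn, retrying only on TransientError; any other error fails fast."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # retry budget exhausted
            delay = backoff_delay(attempt, base, cap)
            if jitter:
                # "Full jitter": pick a random delay in [0, delay] to spread out retries.
                delay = random.uniform(0, delay)
            sleep(delay)
    raise AssertionError("unreachable")


def run_ladder(
    levels: Sequence[Tuple[str, Callable[[], T]]], **retry_kwargs
) -> Tuple[str, T]:
    """Try each degradation level in order; return (level name, result) of the first success."""
    last_error = None
    for name, step in levels:
        try:
            return name, retry(step, **retry_kwargs)
        except Exception as exc:  # this level is exhausted, drop to the next one
            last_error = exc
    raise RuntimeError("all degradation levels failed; escalate to a human") from last_error
```

A caller would pass the ladder as an ordered list, for example `run_ladder([("same_params", call_full), ("simplified_prompt", call_reduced), ("fallback_model", call_backup)])`, where the three callables are placeholders for the agent's own steps.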
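
For item 3 of the prompt, a circuit breaker can be sketched as a small state holder that stops calling a failing dependency after a run of consecutive failures and allows a single trial call once a cool-down has passed. The class name, the `failure_threshold` and `reset_timeout` parameters, and their defaults are assumptions chosen for illustration, not values the prompt specifies.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; not calling downstream")
            # Cool-down elapsed: fall through and allow one trial call (half-open).
        try:
            result = fn()
        except Exception:
            self.failures += 1
            # A failed trial call, or reaching the threshold, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping each retried call in `breaker.call(...)` keeps the two concerns separate: the retry loop decides how long to wait, while the breaker decides whether to try at all.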