PromptForge
AI research · Paper reproduction · ML training · Agent workflow · Automated research

Agent Workflow for Automated ML Paper Reproduction and Model Training

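Design an end-to-end AI agent workflow that automatically reads an ML paper, extracts the experimental setup, reproduces the training process, and generates an evaluation report.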

4/25/2026

You are an ML research automation architect. Design a complete agent workflow that takes an ML paper and automatically reproduces its key experiments.

Input

  • Paper: [paste paper title, arxiv link, or upload PDF]
  • Available compute: [GPU type and count, e.g., 1x A100 80GB]
  • Time budget: [e.g., 24 hours]
  • Framework preference: [PyTorch / JAX / any]

Workflow Design

Phase 1: Paper Analysis Agent

  • Extract: model architecture, hyperparameters, dataset, training schedule
  • Identify: key claims, main results table, ablation studies
  • Flag: missing details, ambiguities, potential blockers
  • Output: structured experiment config (JSON/YAML)
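One possible shape for the structured experiment config is sketched below. The class name, every field name, and the `is_ready` rule are illustrative assumptions, not a schema the workflow requires; the point is only that Phase 1 should emit something machine-readable that later phases can validate.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ExperimentConfig:
    """Hypothetical Phase 1 output; all field names are illustrative."""
    paper_title: str
    architecture: dict
    hyperparameters: dict
    dataset: str
    training_schedule: dict
    key_claims: list = field(default_factory=list)
    ambiguities: list = field(default_factory=list)  # details the paper leaves unclear
    blockers: list = field(default_factory=list)     # issues that prevent reproduction

    def to_json(self) -> str:
        # Serialize to JSON so downstream agents can consume the config.
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def is_ready(self) -> bool:
        # Assumed rule: ambiguities are tolerated, blockers are not.
        return not self.blockers
```

Keeping ambiguities and blockers as separate lists lets the later phases proceed with documented guesses while still halting on anything that makes reproduction impossible.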

Phase 2: Environment Setup Agent

  • Generate requirements.txt / environment.yml
  • Download and preprocess datasets
  • Set up experiment tracking (W&B / MLflow)
  • Estimate compute requirements vs budget
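The budget check in the last bullet could be approximated as follows. This is a rough sketch: the function names, the multi-GPU scaling model, and the 20% safety margin are assumptions, and `seconds_per_step` would have to come from a short benchmark run rather than from the paper.

```python
def estimate_wall_clock_hours(total_steps, seconds_per_step,
                              num_gpus=1, scaling_efficiency=1.0):
    """Estimate training time in hours.

    Assumes each GPU beyond the first contributes `scaling_efficiency`
    of a full GPU, which is a simplification of real data-parallel scaling.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")
    single_gpu_hours = total_steps * seconds_per_step / 3600
    effective_gpus = 1 + (num_gpus - 1) * scaling_efficiency
    return single_gpu_hours / effective_gpus


def fits_budget(estimated_hours, budget_hours, safety_margin=0.2):
    """Return True if the estimate plus a safety margin fits the time budget."""
    return estimated_hours * (1 + safety_margin) <= budget_hours
```

If the estimate does not fit, the agent can report that early and propose a reduced setting (fewer steps, a smaller model variant) instead of failing partway through training.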

Phase 3: Implementation Agent

  • Write model code from architecture description
  • Implement the training loop with the paper's exact settings
  • Add evaluation metrics matching the paper
  • Include checkpointing and resumption logic
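The checkpointing and resumption logic might look like the sketch below. It uses only the standard library and stores a JSON payload so it stays framework-agnostic; in a real run the state would be the framework's own model and optimizer state. The function names and the placeholder "training step" are assumptions for illustration.

```python
import json
import os
import tempfile


def save_checkpoint(path, step, state):
    """Write the checkpoint atomically: temp file first, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump({"step": step, "state": state}, f)
    # os.replace swaps the file in one step, so a crash never leaves a half-written checkpoint.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (step, state); start from step 0 if no checkpoint exists."""
    if not os.path.exists(path):
        return 0, {}
    with open(path) as f:
        payload = json.load(f)
    return payload["step"], payload["state"]


def train(path, total_steps, checkpoint_every=100):
    """Resume from the last checkpoint and run until total_steps."""
    step, state = load_checkpoint(path)
    while step < total_steps:
        step += 1
        state["loss"] = 1.0 / step  # placeholder for a real training step
        if step % checkpoint_every == 0 or step == total_steps:
            save_checkpoint(path, step, state)
    return step, state
```

Calling `train` again after a crash picks up from the last saved step, which is the behavior the crash recovery in Phase 4 relies on.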

Phase 4: Training & Monitoring Agent

  • Launch training with automatic crash recovery
  • Monitor loss curves for anomalies
  • Compare intermediate results with paper figures
  • Stop early if results diverge significantly from the paper's reported values
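The monitoring and early-stop checks in this phase can be made concrete with simple heuristics like those below. The window size, spike factor, and 10% tolerance are arbitrary illustrative thresholds, and the spike check assumes a positive loss; neither function is a standard method.

```python
import math


def check_loss(history, new_loss, window=20, spike_factor=3.0):
    """Classify a new loss value as 'nan', 'spike', or 'ok'.

    A spike is a loss more than spike_factor times the mean of the
    last `window` values (only checked once the window is full).
    """
    if math.isnan(new_loss) or math.isinf(new_loss):
        return "nan"
    recent = history[-window:]
    if len(recent) == window:
        mean = sum(recent) / window
        if new_loss > spike_factor * mean:
            return "spike"
    return "ok"


def should_stop_early(reproduced, reported, rel_tolerance=0.1,
                      higher_is_better=True):
    """Return True if the reproduced metric is worse than the paper's
    value at a comparable point by more than rel_tolerance."""
    if reported == 0:
        raise ValueError("reported value must be non-zero")
    gap = reported - reproduced if higher_is_better else reproduced - reported
    return gap / abs(reported) > rel_tolerance
```

Only a shortfall triggers a stop here; a reproduction that beats the paper's number keeps running and is left for the final report to explain.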

Phase 5: Evaluation & Report Agent

  • Run full evaluation suite
  • Generate comparison table: paper results vs reproduction
  • Run statistical significance tests where applicable
  • Write reproduction report with findings and discrepancies
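A minimal sketch of the comparison table follows. It assumes the reproduction was run with several seeds and uses a crude consistency check (paper value within k sample standard deviations of the reproduction mean), which is not a formal significance test. The function names and column layout are illustrative.

```python
import statistics


def compare(metric, paper_value, repro_runs, k=2.0):
    """Summarize reproduction runs against the paper's reported value."""
    mean = statistics.mean(repro_runs)
    # With a single run there is no spread estimate, so std is set to 0.
    std = statistics.stdev(repro_runs) if len(repro_runs) > 1 else 0.0
    return {
        "metric": metric,
        "paper": paper_value,
        "repro_mean": mean,
        "repro_std": std,
        "delta": mean - paper_value,
        "consistent": abs(paper_value - mean) <= k * std,
    }


def format_table(rows):
    """Render comparison rows as a fixed-width plain-text table."""
    lines = [f"{'Metric':<12}{'Paper':>10}{'Repro mean':>12}{'Std':>8}{'Delta':>8}  Consistent"]
    for r in rows:
        lines.append(
            f"{r['metric']:<12}{r['paper']:>10.2f}{r['repro_mean']:>12.2f}"
            f"{r['repro_std']:>8.2f}{r['delta']:>+8.2f}  "
            f"{'yes' if r['consistent'] else 'no'}"
        )
    return "\n".join(lines)
```

With a single run the check degenerates to exact equality, so the report should state how many seeds were used next to each row.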

Output Format

  1. Complete workflow DAG (Mermaid diagram)
  2. Agent prompts for each phase
  3. Failure modes and recovery strategies
  4. Estimated timeline and compute cost
  5. Template reproduction report structure
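For the first output item, the workflow DAG can be kept as data and rendered to Mermaid text, as in the sketch below. The phase identifiers and the strictly linear dependency chain are assumptions based on the five phases listed above; retry and recovery edges are omitted. `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases it depends on (illustrative names).
DEPENDENCIES = {
    "paper_analysis": set(),
    "environment_setup": {"paper_analysis"},
    "implementation": {"environment_setup"},
    "training_monitoring": {"implementation"},
    "evaluation_report": {"training_monitoring"},
}


def execution_order(deps):
    """Return phases in an order that respects all dependencies."""
    return list(TopologicalSorter(deps).static_order())


def to_mermaid(deps):
    """Render the dependency graph as a Mermaid top-down flowchart."""
    lines = ["graph TD"]
    for node in execution_order(deps):
        for parent in sorted(deps[node]):
            lines.append(f"    {parent} --> {node}")
    return "\n".join(lines)
```

Keeping the graph as a dictionary means the same structure can drive both the diagram and the scheduler that launches each phase.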