ML Paper Auto-Reproduction and Model Training Agent Workflow

You are an ML research automation architect. Design a complete agent workflow that takes an ML paper and automatically reproduces its key experiments.

Input

Paper: [paste paper title, arXiv link, or upload PDF]
Available compute: [GPU type and count, e.g., 1x A100 80GB]
Time budget: [e.g., 24 hours]
Framework preference: [PyTorch / JAX / any]

Workflow Design

Phase 1: Paper Analysis Agent

Extract: model architecture, hyperparameters, dataset, training schedule
Identify: key claims, main results table, ablation studies
Flag: missing details, ambiguities, potential blockers
Output: structured experiment config (JSON/YAML)
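
One possible shape for the structured experiment config is sketched below. The class name `ExperimentConfig` and all of its field names are assumptions chosen for illustration; they simply mirror the Extract, Identify, and Flag items above and are not a required schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class ExperimentConfig:
    """Hypothetical container for what the Paper Analysis Agent extracts."""

    paper_title: str
    architecture: Dict[str, Any]
    hyperparameters: Dict[str, Any]
    dataset: str
    training_schedule: Dict[str, Any]
    key_claims: List[str] = field(default_factory=list)
    # Details the paper leaves out or states ambiguously.
    missing_details: List[str] = field(default_factory=list)


def to_json(config: ExperimentConfig) -> str:
    """Serialize the config to JSON with stable key order for diffing."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)
```

Keeping `missing_details` as an explicit field makes the flagged gaps visible to every downstream phase instead of leaving them buried in free text.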

Phase 2: Environment Setup Agent

Generate requirements.txt / environment.yml
Download and preprocess datasets
Set up experiment tracking (W&B / MLflow)
Estimate compute requirements vs budget
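
The compute-versus-budget check can be sketched as simple arithmetic. This is a minimal illustration: the function names, the `safety_margin` parameter, and the idea of measuring `seconds_per_step` from a short benchmark run are all assumptions, not part of the workflow specification.

```python
def estimate_wall_clock_hours(total_steps: int, seconds_per_step: float) -> float:
    """Project total training time from a measured per-step duration."""
    return total_steps * seconds_per_step / 3600.0


def fits_budget(
    total_steps: int,
    seconds_per_step: float,
    budget_hours: float,
    safety_margin: float = 1.2,
) -> bool:
    """Return True if the padded estimate fits within the time budget.

    safety_margin (assumed default 1.2) leaves headroom for evaluation,
    checkpointing, and restarts.
    """
    projected = estimate_wall_clock_hours(total_steps, seconds_per_step)
    return projected * safety_margin <= budget_hours
```

If the check fails, the agent could propose a reduced schedule (fewer steps or a smaller model variant) and record that deviation in the final report.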

Phase 3: Implementation Agent

Write model code from architecture description
Implement the training loop with the paper's exact settings
Add evaluation metrics matching the paper
Include checkpointing and resumption logic
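
The checkpointing and resumption logic might look like the framework-agnostic sketch below. It stores training state as JSON purely for illustration; a real implementation would presumably use the chosen framework's own serialization for model and optimizer state. The function names and the placeholder "training step" are assumptions.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write state to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path: str, default: dict) -> dict:
    """Return the saved state, or the default if no checkpoint exists."""
    if not os.path.exists(path):
        return default
    with open(path) as f:
        return json.load(f)


def train(path: str, total_steps: int, save_every: int) -> dict:
    """Resume from the last checkpoint and run until total_steps."""
    state = load_checkpoint(path, {"step": 0, "loss": None})
    for step in range(state["step"], total_steps):
        # Placeholder for a real optimization step.
        state = {"step": step + 1, "loss": 1.0 / (step + 1)}
        if state["step"] % save_every == 0:
            save_checkpoint(path, state)
    return state
```

Writing to a temporary file first means a crash in the middle of a save leaves the previous checkpoint intact, which is what makes automatic restarts safe.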

Phase 4: Training & Monitoring Agent

Launch training with automatic crash recovery
Monitor loss curves for anomalies
Compare intermediate results with paper figures
Stop early if results diverge significantly from the paper's reported values
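
A minimal sketch of loss-curve anomaly detection follows. The function name, the rolling-window heuristic, and the default thresholds are assumptions for illustration, and the spike rule assumes a positive loss; a production monitor would likely combine several signals.

```python
import math
from typing import Optional, Sequence


def detect_anomaly(
    losses: Sequence[float],
    window: int = 5,
    spike_factor: float = 3.0,
) -> Optional[str]:
    """Classify the most recent loss value.

    Returns "non_finite" for NaN or infinity, "spike" when the latest
    loss exceeds spike_factor times the mean of the preceding window,
    and None when nothing looks wrong or there is too little history.
    """
    if not losses:
        return None
    latest = losses[-1]
    if not math.isfinite(latest):
        return "non_finite"
    history = losses[-(window + 1):-1]
    if len(history) < window:
        return None
    baseline = sum(history) / len(history)
    if latest > spike_factor * baseline:
        return "spike"
    return None
```

A non-finite loss is usually grounds for stopping immediately and restoring an earlier checkpoint, while a single spike may only warrant closer monitoring.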

Phase 5: Evaluation & Report Agent

Run full evaluation suite
Generate comparison table: paper results vs reproduction
Run statistical significance tests where applicable
Write reproduction report with findings and discrepancies
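
The paper-versus-reproduction comparison could be assembled along these lines. The function name, the row fields, and the relative `tolerance` rule are assumptions; the sketch aggregates repeated runs by mean and sample standard deviation and does not itself perform a significance test.

```python
import statistics
from typing import Dict, List


def compare_results(
    paper: Dict[str, float],
    reproduced: Dict[str, List[float]],
    tolerance: float = 0.01,
) -> List[dict]:
    """Build one comparison row per metric reported in the paper.

    reproduced maps each metric to the values from repeated runs.
    A metric counts as a match when the mean is within the relative
    tolerance of the paper's value (assumed default: 1%).
    """
    rows = []
    for metric, paper_value in paper.items():
        runs = reproduced.get(metric, [])
        if not runs:
            rows.append({"metric": metric, "paper": paper_value,
                         "repro_mean": None, "repro_std": None,
                         "delta": None, "status": "missing"})
            continue
        mean = statistics.mean(runs)
        # Sample standard deviation needs at least two runs.
        std = statistics.stdev(runs) if len(runs) > 1 else 0.0
        delta = mean - paper_value
        within = abs(delta) <= tolerance * abs(paper_value)
        rows.append({"metric": metric, "paper": paper_value,
                     "repro_mean": mean, "repro_std": std,
                     "delta": delta,
                     "status": "match" if within else "discrepancy"})
    return rows
```

Marking unreproduced metrics as "missing" rather than dropping them keeps the report honest about which of the paper's results were never attempted.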

Output Format

Complete workflow DAG (Mermaid diagram)
Agent prompts for each phase
Failure modes and recovery strategies
Estimated timeline and compute cost
Template reproduction report structure
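
As a starting point for the workflow DAG, the five phases can be encoded as a dependency map and rendered to Mermaid text. This sketch assumes the phases run strictly in the order listed above; the node identifiers and function names are hypothetical, and `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each phase maps to the set of phases it depends on (assumed linear chain).
PHASE_DEPENDENCIES: Dict[str, Set[str]] = {
    "paper_analysis": set(),
    "environment_setup": {"paper_analysis"},
    "implementation": {"environment_setup"},
    "training_monitoring": {"implementation"},
    "evaluation_report": {"training_monitoring"},
}


def execution_order(deps: Dict[str, Set[str]]) -> List[str]:
    """Return phases in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def to_mermaid(deps: Dict[str, Set[str]]) -> str:
    """Render the dependency map as a top-down Mermaid flowchart."""
    lines = ["graph TD"]
    for node in execution_order(deps):
        for parent in sorted(deps[node]):
            lines.append(f"    {parent} --> {node}")
    return "\n".join(lines)
```

Calling `print(to_mermaid(PHASE_DEPENDENCIES))` emits text that can be pasted into any Mermaid renderer; adding recovery edges (for example, from training back to implementation) would mean modeling them outside the topological sort, since cycles are rejected.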