PromptForge
Product Design · Multimodal · Desktop Agent · Computer Use · MCP

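Product Design Document Generator for a Multimodal AI Agent Desktop App

Generates a complete product design document for a multimodal AI Agent desktop application, covering core capabilities such as GUI operation, browser automation, and MCP tool integration.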

6 views · 5/9/2026

You are a senior product manager and UX designer specializing in AI-native desktop applications. Generate a comprehensive product design document for a multimodal AI Agent desktop app.

Product Vision

A desktop application that gives users a native GUI Agent capable of:

  • Seeing and interacting with any screen element (Computer Use)
  • Browsing the web autonomously
  • Executing terminal commands
  • Connecting to external tools via MCP (Model Context Protocol)
  • Understanding screenshots, documents, and visual content

Generate the following sections:

1. User Personas (3 personas)

  • Developer automating repetitive workflows
  • Knowledge worker doing research
  • Non-technical user needing computer assistance

2. Core User Flows

For each persona, design two key workflows, each specifying:

  • Trigger (how the user initiates the task)
  • Agent reasoning steps
  • Visual feedback (what the user sees)
  • Completion criteria
  • Error handling

3. UI/UX Architecture

  • Main window layout
  • Agent activity visualization
  • Permission/confirmation dialogs
  • History and replay system
  • Settings and model configuration

4. Technical Architecture

  • Model requirements (vision + language)
  • Screen capture pipeline
  • Action execution layer (mouse/keyboard/browser)
  • MCP tool registry
  • Safety sandbox design
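
For the tool registry item, the level of detail expected might look like the following minimal sketch. It assumes a plain in-memory registry; `Tool`, `ToolRegistry`, and the `echo` tool are hypothetical names used only for illustration and are not part of the MCP specification or any SDK.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    """A capability the agent can call (hypothetical shape, not an MCP type)."""
    name: str
    description: str
    handler: Callable[..., Any]


class ToolRegistry:
    """In-memory registry; a real app would populate it from connected MCP servers."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        # Reject duplicates so two servers cannot silently shadow each other.
        if tool.name in self._tools:
            raise ValueError(f"tool already registered: {tool.name}")
        self._tools[tool.name] = tool

    def call(self, name: str, **kwargs: Any) -> Any:
        # Unknown tools fail loudly instead of being ignored.
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].handler(**kwargs)

    def names(self) -> List[str]:
        return sorted(self._tools)


registry = ToolRegistry()
registry.register(Tool("echo", "Return the given text unchanged.", lambda text: text))
```

Rejecting duplicate names and unknown tools explicitly is one possible design choice; the generated document may justify a different one.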

5. Safety & Trust

  • What actions require confirmation?
  • How to prevent unintended clicks/inputs?
  • Data privacy (what does the model see?)
  • Undo/rollback mechanism
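
The confirmation question above could be answered with a risk-tier policy along these lines. This is only a sketch under stated assumptions: the action names, the three tiers, and the default threshold are hypothetical placeholders, not a prescribed design.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical mapping from action type to risk tier (an assumption for illustration).
ACTION_RISK = {
    "screenshot": Risk.LOW,
    "click": Risk.MEDIUM,
    "type_text": Risk.MEDIUM,
    "run_command": Risk.HIGH,
    "delete_file": Risk.HIGH,
}


def requires_confirmation(action: str, threshold: Risk = Risk.HIGH) -> bool:
    """Return True when the action's risk meets or exceeds the threshold.

    Unknown actions are treated as HIGH risk, so they always prompt the user.
    """
    risk = ACTION_RISK.get(action, Risk.HIGH)
    return risk.value >= threshold.value
```

A cautious user could lower the threshold, for example `requires_confirmation("click", Risk.MEDIUM)`, so that ordinary clicks also prompt for approval.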

6. MVP Scope

  • P0 features (must ship)
  • P1 features (next release)
  • P2 features (future)

Output in structured markdown with diagrams where helpful.