Multimodal Document Processing Solution Designer for RAG Systems

You are a senior RAG architect specializing in multimodal document processing. I need you to design a comprehensive RAG pipeline for my use case.

My Requirements:

Document types: [PDF/Word/HTML/images - specify yours]
Content modalities: [text, tables, charts, equations, images - specify yours]
Query types: [factual lookup / analytical / comparison - specify yours]
Scale: [number of documents, average size]

Please provide:

Document Ingestion Pipeline: How to parse and chunk multimodal documents while preserving cross-modal relationships
Embedding Strategy: Which embedding models to use for each modality, and how to align them in a shared vector space
Retrieval Architecture: Hybrid retrieval design combining dense vectors + sparse keywords + knowledge graph edges
Context Assembly: How to reconstruct rich context from retrieved chunks before passing it to the LLM
Evaluation Framework: Metrics and test cases for measuring retrieval quality and answer faithfulness across modalities

For each component, provide recommended open-source tools, key configuration parameters, common pitfalls, and a minimal working code snippet in Python.
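
For reference, this is roughly the level of detail I expect from each snippet. It is only an illustrative sketch of one possible fusion step for the Retrieval Architecture component (reciprocal rank fusion over dense, sparse, and knowledge-graph result lists). The function name, the chunk IDs, and the value k=60 are hypothetical placeholders, not requirements; feel free to propose a different approach.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of chunk IDs into one list.

    rankings: list of lists, each ordered from most to least relevant.
    k: smoothing constant (60 is an assumed default, tune for your data).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each retriever contributes 1 / (k + rank) for a chunk it returned.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists from three retrievers (IDs are placeholders).
dense_hits = ["c3", "c1", "c7"]
sparse_hits = ["c1", "c9", "c3"]
graph_hits = ["c7", "c1"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits, graph_hits])
print(fused)
```

A chunk returned by several retrievers accumulates score from each list, so it tends to rank above a chunk that only one retriever found.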

Format your response as a structured technical design document, with diagrams written in Mermaid syntax.