PromptForge
Productivity tools, Semantic search, Knowledge base, RAG, Local deployment, Information retrieval

Local Knowledge Base Semantic Search Engine Designer

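Design and build a local-first semantic search system for documents, with intelligent retrieval across meeting notes, personal notes, and knowledge bases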

1 view · 4/5/2026

You are an expert in information retrieval and semantic search systems. Help me design and build a local-first semantic search engine for my personal knowledge base.

My knowledge base includes:

  • Meeting notes and transcripts
  • Technical documentation
  • Research papers and summaries
  • Code snippets and READMEs
  • Personal notes and journals

Design a system with these specifications:

  1. Indexing Pipeline:

    • Document ingestion: support for .md, .txt, .pdf, .html formats
    • Chunking strategy: optimal chunk sizes for different document types
    • Embedding model selection: compare local options (e5-small, bge-base, nomic-embed) vs API options
    • Metadata extraction: dates, tags, authors, topics
  2. Search Architecture:

    • Hybrid search: combine BM25 keyword search with vector similarity
    • Re-ranking: cross-encoder or LLM-based re-ranking for top results
    • Query expansion: automatic synonym and related term expansion
    • Faceted filtering: by date range, document type, tags
  3. Storage Backend:

    • Compare: SQLite+vectors vs ChromaDB vs LanceDB for local use
    • Index update strategy: incremental vs full rebuild
    • Storage size estimates for 10K, 100K, 1M documents
  4. Query Interface:

    • Natural language queries translated into structured search
    • Similarity lookup, e.g. "Find documents similar to this one"
    • Time-scoped questions, e.g. "What did I write about X in the last month?"
    • Conversational follow-up queries
  5. Implementation:

    • Provide a working Python implementation using available open-source tools
    • CLI interface for indexing and searching
    • Performance benchmarks and optimization tips
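
For the hybrid search item above, one common way to combine a BM25 ranking with a vector-similarity ranking is reciprocal rank fusion (RRF). The sketch below is only an illustration of that idea, not a required design: the function name, the document IDs, and the constant k=60 are assumptions chosen for the example, and the two input lists stand in for the output of whatever BM25 and embedding backends you end up recommending.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results of a BM25 search and a vector search for one query.
bm25_hits = ["notes/a.md", "notes/b.md", "notes/c.md"]
vector_hits = ["notes/c.md", "notes/a.md", "notes/d.md"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever are still kept. RRF only needs ranks, so the BM25 scores and cosine similarities never have to be normalized against each other.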

My setup: [Describe your hardware, OS, and approximate knowledge base size]
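
When filling in the knowledge base size, a rough back-of-envelope figure for raw embedding storage can help frame the 10K / 100K / 1M comparison requested above. The sketch below assumes float32 vectors, 384 dimensions, and an average of 10 chunks per document; all three numbers are placeholders to replace with your own, and the result excludes index overhead, metadata, and the original text.

```python
def embedding_storage_bytes(num_docs, chunks_per_doc=10, dim=384, bytes_per_value=4):
    """Raw vector storage: one dim-sized float32 vector per chunk.

    chunks_per_doc and dim are illustrative assumptions, not measurements.
    """
    return num_docs * chunks_per_doc * dim * bytes_per_value


for n in (10_000, 100_000, 1_000_000):
    size_gb = embedding_storage_bytes(n) / 1e9
    print(f"{n:>9,} docs -> {size_gb:.2f} GB of raw vectors")
```

Under these assumptions the raw vectors scale linearly with document count, so doubling the embedding dimension or the number of chunks per document doubles the estimate.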