[Feature Request]: Feature Request: Hierarchical Retrieval Architecture for Production-Grade RAG #11610

@TeslaZY

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

Is your feature request related to a problem?

Describe the feature you'd like

Problem Statement: Bridging the "Demo-to-Production" Gap

RAGFlow currently performs well in proof-of-concept (PoC) scenarios. However, in production environments with diverse knowledge bases and large document collections (tens of thousands of documents), the existing "single-layer retrieval" architecture, which flattens all document chunks into a single vector search space, shows significant limitations in both accuracy and efficiency.

Key Challenges:

  1. Chunk Fragmentation Issues

    • Context Fragmentation: Improper segmentation disrupts natural semantic units, resulting in incomplete information within individual chunks and degraded semantic representation.
    • Information Dilution: Critical information ("gold nuggets") is often split across multiple chunks, making comprehensive retrieval challenging and reducing answer quality.
  2. Embedding Model Limitations

    • Theoretical Constraints: As argued in research papers such as "On the Theoretical Limitations of Embedding-Based Retrieval," the dimensionality of embedding vectors limits the number of "document-query" relevance relationships that can be perfectly represented.
    • Practical Bottlenecks: Commonly deployed private embedding models (e.g., qwen3-embedding-0.6B, jina-embeddings-v3 with 1024 dimensions) may lack sufficient capacity to encode complex semantic relationships at scale. While higher-dimensional models (4096/8192 dimensions) exist, they impose prohibitive hardware requirements and computational costs for private deployments.
    • Retrieval Precision Degradation: Direct vector search across millions of chunks becomes computationally expensive and prone to vector space "crowding" and "confusion," causing relevant chunks to rank lower.
  3. Underutilized Metadata

    • Valuable document metadata (department, author, date, document type, etc.) remains largely untapped as systematic pre-retrieval filters, wasting crucial structured information.

Proposed Solution: Three-Tier Retrieval Architecture

Inspired by search engine hierarchical principles, we propose a Knowledge Base → Document → Chunk three-tier retrieval architecture to progressively narrow the search scope and enhance both precision and efficiency.

Tier 1: Knowledge Base Routing

  • Function: Automatically routes user queries to the most relevant knowledge base based on intent.
  • Implementation:
    • Support independent retrieval parameters per knowledge base (vector/keyword weights, recall thresholds).
    • Enable dynamic routing via rule-based or LLM-based approaches to ensure domain-specific processing.
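
The rule-based variant of Tier 1 could look roughly like the sketch below. It is only an illustration: `KnowledgeBaseConfig`, `route_query`, the keyword sets, and the default weights are hypothetical names and values, not existing RAGFlow APIs. An LLM-based router could replace the keyword overlap scoring while keeping the same per-knowledge-base settings.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeBaseConfig:
    """Hypothetical per-knowledge-base retrieval settings (not a RAGFlow class)."""
    name: str
    keywords: set[str]             # routing rules: terms that signal this knowledge base
    vector_weight: float = 0.7     # assumed default weight of vector similarity
    keyword_weight: float = 0.3    # assumed default weight of keyword matching
    recall_threshold: float = 0.2  # assumed minimum score for a chunk to be recalled


def route_query(query: str, kbs: list[KnowledgeBaseConfig], top_n: int = 1) -> list[KnowledgeBaseConfig]:
    """Rank knowledge bases by how many of their routing keywords occur in the query."""
    tokens = set(query.lower().split())
    scored = [(len(kb.keywords & tokens), kb) for kb in kbs]
    matched = [item for item in scored if item[0] > 0]
    if not matched:
        # No rule fired: fall back to searching every knowledge base.
        return list(kbs)
    matched.sort(key=lambda item: item[0], reverse=True)
    return [kb for _, kb in matched[:top_n]]


hr = KnowledgeBaseConfig("hr", {"vacation", "leave", "payroll"})
it = KnowledgeBaseConfig("it", {"vpn", "laptop", "password"}, vector_weight=0.5, keyword_weight=0.5)
selected = route_query("how do i reset my vpn password", [hr, it])
print([kb.name for kb in selected])
```

Each selected knowledge base carries its own weights and threshold, so the later tiers can apply domain-specific retrieval parameters instead of one global setting.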

Tier 2: Document Filtering

  • Function: Applies document-level metadata filtering within selected knowledge bases to identify relevant document subsets.
  • Enhancements:
    • Intelligent Metadata Filtering: In Auto mode, allow users to specify key metadata fields (e.g., document type, department) with LLM-generated filter conditions to avoid high-cardinality metadata interference.
    • Metadata Similarity Matching: Introduce similarity operators for text-based metadata (document names, summaries) to support fuzzy matching.
    • Enhanced Metadata Generation: Strengthen Data Pipeline capabilities for full-text metadata and summary generation to enrich document filtering context.
    • Efficient Metadata Management: Provide batch CRUD operations for metadata and a metadata management UI.
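
A minimal sketch of Tier 2, combining exact-match filters with a similarity operator for text metadata. The document layout, the `filter_documents` helper, and the 0.6 similarity cutoff are assumptions made for illustration. String similarity here comes from the standard library's `difflib.SequenceMatcher`, whereas a real implementation might compare embeddings of names or summaries instead.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def filter_documents(docs, exact=None, fuzzy=None, min_similarity=0.6):
    """Keep documents whose metadata passes all exact and all fuzzy conditions.

    exact: {field: required_value}, compared with ==
    fuzzy: {field: target_text}, compared with string similarity
    """
    exact = exact or {}
    fuzzy = fuzzy or {}
    selected = []
    for doc in docs:
        meta = doc["metadata"]
        if any(meta.get(key) != value for key, value in exact.items()):
            continue
        if any(similar(str(meta.get(key, "")), target) < min_similarity
               for key, target in fuzzy.items()):
            continue
        selected.append(doc)
    return selected


docs = [
    {"id": "d1", "metadata": {"department": "finance", "name": "Q3 Budget Report"}},
    {"id": "d2", "metadata": {"department": "finance", "name": "Travel Policy"}},
    {"id": "d3", "metadata": {"department": "hr", "name": "Q3 Budget Report"}},
]
# The misspelled name still matches thanks to the similarity operator.
hits = filter_documents(docs, exact={"department": "finance"}, fuzzy={"name": "q3 budget reprot"})
print([d["id"] for d in hits])
```

In Auto mode, the `exact` and `fuzzy` dictionaries would be produced by the LLM, restricted to the metadata fields the user has designated as filterable.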

Tier 3: Chunk Refinement

  • Function: Performs precise vector retrieval at the chunk level within the filtered document set.
  • Enhancements:
    • Parent-Child Chunking with Summary Mapping: Enable creation of parent-level summaries for contextually related chunks. Retrieval first matches macro-themes via summary vectors, then maps to original chunks for details—combining semantic robustness with granular information access.
    • Customizable Prompts: Allow users to configure custom prompts for chunk keyword extraction and question generation tasks to better align with domain-specific semantics.
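
The parent-child idea can be sketched as a two-stage lookup: match the query against parent summary vectors first, then rank only the child chunks of the best parents. The data layout, the `retrieve` function, and the toy two-dimensional vectors are assumptions for illustration. In practice the vectors would come from the configured embedding model and a vector index would replace the linear scan.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, parents, top_parents=1, top_chunks=2):
    """Stage 1: rank parents by summary similarity. Stage 2: rank their child chunks."""
    ranked_parents = sorted(
        parents, key=lambda p: cosine(query_vec, p["summary_vec"]), reverse=True
    )
    candidates = [chunk for p in ranked_parents[:top_parents] for chunk in p["chunks"]]
    ranked_chunks = sorted(
        candidates, key=lambda c: cosine(query_vec, c["vec"]), reverse=True
    )
    return [c["text"] for c in ranked_chunks[:top_chunks]]


parents = [
    {
        "summary_vec": [1.0, 0.0],
        "chunks": [
            {"text": "a1", "vec": [0.9, 0.1]},
            {"text": "a2", "vec": [0.5, 0.5]},
        ],
    },
    {
        "summary_vec": [0.0, 1.0],
        "chunks": [{"text": "b1", "vec": [0.1, 0.9]}],
    },
]
print(retrieve([1.0, 0.1], parents))
```

Because chunks under non-matching parents are never scored, the chunk-level search only sees candidates that already agree with the query at the summary level.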

Complementary Data Pipeline Enhancements

  • The Data Pipeline should be able to complement the Built-in Methods, not only replace them.
  • Focus on strengthening full-text metadata generation and document-level summarization capabilities to provide robust data foundation for hierarchical retrieval.

Expected Benefits

Implementing this hierarchical retrieval architecture will enable RAGFlow's critical transition from "feasible" to "production-ready":

  1. Improved Recall Precision: Layered filtering focuses retrieval on relevant regions, reducing interference from irrelevant chunks and mitigating the embedding model limitations described above.
  2. Optimized System Performance: Significantly reduces vector search candidate sets, lowering computational overhead and improving response latency.
  3. Enhanced System Intelligence & Flexibility: Knowledge base routing and intelligent metadata filtering enable better understanding of user intent and adaptation to complex production environments.
  4. Reduced Operational Costs: Template-based, batch-enabled metadata management tools minimize maintenance overhead.

Implementation Priority

High - This architecture addresses fundamental scalability and precision limitations critical for production deployments.

Describe implementation you've considered

No response

Documentation, adoption, use case

Additional information

No response
