YunfanGoForIt/DART

DART - DeepWiki Auto Refinement & Throw

An automated pipeline that monitors GitHub Stars, fetches technical documentation via DeepWiki MCP, refines the content with a RAG (Retrieval-Augmented Generation) workflow, and publishes polished Chinese wikis to Feishu (Lark).

🎯 Why DART?

  • DeepWiki Auto Refinement & Throw

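  • DART throws refined knowledge at Feishu with the precision of a dart.

  • DeepWiki pages are often long and dense, and reading them in English can be slow. DART combines AI-workflow refinement with active push delivery, so you receive a Feishu message you can read quickly to learn what a starred repository offers. The RAG-integrated workflow keeps each push short enough to read in spare moments without dropping key technical details.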


🚀 Key Features

  • GitHub Star Monitoring: Checks for newly starred repositories every 60 seconds.

  • Intelligent Documentation Fetching:

    • Seamless integration with DeepWiki MCP.
    • Lightweight Indexing Trigger: Automatically asks DeepWiki to index new repositories through Next.js Server Actions. The trigger is memory-efficient and runs on a server with 2 GB of RAM.
  • RAG-Powered Refinement:

    • Phase 1: Smart Drafting: Generates a structured draft in Chinese, inserting RAG placeholders (<!-- NEED_RAG -->) for complex concepts.
    • Phase 2: Targeted Embedding: Uses an LLM to select only the most relevant documents for embedding, which reduces API costs and local resource usage.
    • Phase 3: Deep Expansion: Expands each placeholder with context drawn from vector search and background knowledge.
  • Feishu (Lark) Integration:

    • Automatically creates Wiki nodes.
    • Updates the document title and content.
    • Formats content into Feishu Docx blocks (headings, code blocks, lists).
    • Sends real-time notifications through a Feishu Webhook (card and text messages).
  • Local Document Backup: Saves the final refined documents locally so they can be synced with tools such as Syncthing.

  • Status Dashboard: A FastAPI-powered web UI for monitoring the processing status of all repositories.
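
To make the three refinement phases more concrete, here is a minimal sketch of how a draft containing <!-- NEED_RAG --> placeholders might be expanded. It is not DART's actual code: the function names are hypothetical, and a toy bag-of-words cosine similarity stands in for the real embedding model and vector store.

```python
import math
import re
from collections import Counter

PLACEHOLDER = "<!-- NEED_RAG -->"


def _vectorize(text):
    # Toy stand-in for an embedding: lowercase word counts.
    return Counter(re.findall(r"\w+", text.lower()))


def _cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs):
    # Return the single document most similar to the query.
    query_vec = _vectorize(query)
    return max(docs, key=lambda doc: _cosine(query_vec, _vectorize(doc)))


def expand_placeholders(draft, docs):
    # Replace each placeholder with the best match for its paragraph.
    expanded = []
    for paragraph in draft.split("\n\n"):
        if PLACEHOLDER in paragraph:
            query = paragraph.replace(PLACEHOLDER, "").strip()
            paragraph = paragraph.replace(PLACEHOLDER, retrieve(query, docs))
        expanded.append(paragraph)
    return "\n\n".join(expanded)
```

In this sketch the paragraph surrounding a placeholder doubles as the search query. A real pipeline would more likely pass the retrieved passages to an LLM for rewriting rather than paste them in verbatim.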


🏗️ Implementation Logic

graph TD
    A[GitHub Stars] -->|Monitor| B(DART Workflow)
    B -->|Check Local/MCP| C{Data Exists?}
    C -->|No| D[DeepWiki Indexer Trigger]
    D -->|Wait 10m| B
    C -->|Yes| E[RAG Refiner]

    subgraph RAG Workflow
    E --> E1[Draft Gen + RAG Tags]
    E1 --> E2[LLM Document Selector]
    E2 --> E3[Selective Embedding]
    E3 --> E4[Vector Search & Expansion]
    end

    E4 --> F[Local Backup]
    F --> G[Feishu Service]
    G -->|Create/Update| H[Feishu Wiki]
    G -->|Notify| I[Feishu Webhook]
  1. Discovery: GitHubMonitor periodically polls the user's starred repositories.

  2. Acquisition: mcp_client attempts to fetch Markdown pages. If the repository is not yet indexed, DeepWikiIndexer sends an HTTP POST (Server Action) directly to DeepWiki to trigger indexing.

  3. Refinement: RAGRefiner processes the "Overview" page. It identifies "blind spots" that need more information, selects relevant supplementary files from the repository, embeds them into a local vector store, and performs a final rewrite to produce an architecture-focused Chinese document.

  4. Backup: Saves the final refined document to the final_docs/ folder for local backup and sync.

  5. Publishing: FeishuService maps the Markdown to Feishu's Block API, updates the title and content, and sends notifications.
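
The five stages above can be sketched as two small functions. This is an illustration under stated assumptions, not the project's implementation: detect_new_stars, process_repo, and the injected callables are hypothetical names standing in for GitHubMonitor, mcp_client, DeepWikiIndexer, RAGRefiner, and FeishuService.

```python
from typing import Callable, Iterable, List, Optional, Set


def detect_new_stars(seen: Set[str], starred: Iterable[str]) -> List[str]:
    # Discovery: return repositories not seen before, and remember them.
    fresh = [repo for repo in starred if repo not in seen]
    seen.update(fresh)
    return fresh


def process_repo(
    repo: str,
    fetch_docs: Callable[[str], Optional[str]],
    trigger_index: Callable[[str], None],
    refine: Callable[[str], str],
    backup: Callable[[str, str], None],
    publish: Callable[[str, str], None],
) -> str:
    # Acquisition: None is assumed to mean "not indexed by DeepWiki yet".
    docs = fetch_docs(repo)
    if docs is None:
        trigger_index(repo)
        return "waiting_for_index"  # the caller retries later
    final_doc = refine(docs)   # Refinement
    backup(repo, final_doc)    # Backup
    publish(repo, final_doc)   # Publishing
    return "published"
```

A scheduler would call detect_new_stars on each polling cycle and re-run process_repo for any repository still in the waiting_for_index state, roughly ten minutes later according to the diagram.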


🛠️ Deployment

1. Prerequisites

  • Python 3.10+
  • A server with at least 2 GB of RAM (DART is optimized for low-resource environments)
  • GitHub Personal Access Token
  • A Feishu app with Wiki and Docx permissions
  • An OpenAI-compatible API key (e.g., DeepSeek, Qwen, Zhipu, or OpenAI)

2. Installation

git clone https://github.com/YunfanGoForIt/DART.git
cd DART
pip install -r requirements.txt

3. Configuration

Create a .env file in the project root:

GITHUB_TOKEN=your_github_token
FEISHU_APP_ID=your_app_id
FEISHU_APP_SECRET=your_app_secret
FEISHU_SPACE_ID=your_wiki_space_id
FEISHU_WEBHOOK_URL=your_webhook_url

OPENAI_API_KEY=your_llm_api_key
OPENAI_BASE_URL=https://api.your-provider.com/v1
OPENAI_MODEL=gpt-4o # or qwen-max, etc.
EMBEDDING_MODEL=text-embedding-v3

USER_EMAIL=your_email@example.com # Used for DeepWiki indexing requests
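
A startup check along the following lines can catch a missing setting before the pipeline runs. This is a sketch, not DART's own loader: load_settings is a hypothetical helper, and treating every key listed above as required is an assumption.

```python
import os
from typing import Dict, Mapping

# Keys taken from the .env example; assumed here to all be required.
REQUIRED_KEYS = (
    "GITHUB_TOKEN",
    "FEISHU_APP_ID",
    "FEISHU_APP_SECRET",
    "FEISHU_SPACE_ID",
    "FEISHU_WEBHOOK_URL",
    "OPENAI_API_KEY",
    "OPENAI_BASE_URL",
    "OPENAI_MODEL",
    "EMBEDDING_MODEL",
    "USER_EMAIL",
)


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    # Collect every missing or empty key so one error reports them all.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The function reads from the process environment by default, so it works whether the values come from a .env loader or from the shell.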

4. Running the Application

Using Screen (recommended for long-running sessions)

screen -S dart
python3 main.py
# Press Ctrl+A, then D to detach from the session

Using Nohup

nohup python3 main.py > dart.log 2>&1 &

The dashboard will be available at http://your-server-ip:8002.

5. Output Structure

dart/
├── output/           # Raw documents from DeepWiki MCP
│   └── {repo_name}/  # Original markdown files
└── final_docs/       # Final refined documents ⭐
    └── {repo_name}.md  # Polished AI-generated Chinese wiki
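
As an illustration of the backup layout, the sketch below writes a refined document to final_docs/{repo_name}.md. The helper name save_final_doc is hypothetical, and how DART derives repo_name from a repository is not specified here, so the caller supplies it directly.

```python
from pathlib import Path


def save_final_doc(root: Path, repo_name: str, markdown: str) -> Path:
    # Build <root>/final_docs/<repo_name>.md, creating the folder if needed.
    path = root / "final_docs" / f"{repo_name}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    # UTF-8 is set explicitly because the refined wiki is written in Chinese.
    path.write_text(markdown, encoding="utf-8")
    return path
```

Pointing a sync tool such as Syncthing at the final_docs/ folder would then pick up each new file as it is written.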

📝 License

MIT License. Feel free to use and contribute!


🎯 DART - Throw Your Knowledge to Feishu!
