An automated pipeline that monitors GitHub Stars, fetches technical documentation via DeepWiki MCP, refines content using a RAG (Retrieval-Augmented Generation) workflow, and publishes polished Chinese wikis to Feishu (Lark).
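- DeepWiki Auto Refinement & Throw
- "Throws" refined knowledge at Feishu with the precision of a dart.
- DeepWiki pages tend to be long and cluttered, and reading them in English can be inconvenient. This project combines AI workflow refinement with proactive push, so users promptly receive a Feishu message with a quick-to-read article that summarizes what each starred repository offers. The integrated RAG workflow keeps each article short enough to read in spare moments without losing key technical details.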
- GitHub Star Monitoring: Checks for newly starred repositories every 60 seconds.
- Intelligent Documentation Fetching:
  - Seamless integration with DeepWiki MCP.
  - Lightweight Indexing Trigger: Automatically requests DeepWiki to index new repositories using Next.js Server Actions (memory-efficient; runs on a server with 2 GB of RAM).
- RAG-Powered Refinement:
  - Phase 1: Smart Drafting: Generates a structured draft in Chinese with RAG placeholders (<!-- NEED_RAG -->) for complex concepts.
  - Phase 2: Targeted Embedding: Uses an LLM to select only the most relevant documents for embedding, which reduces API costs and local resource usage.
  - Phase 3: Deep Expansion: Context-aware expansion of placeholders using vector search and background knowledge.
- Feishu (Lark) Integration:
  - Automatic creation of Wiki nodes.
  - Updates document title and content.
  - Formats content into Feishu Docx blocks (headings, code blocks, lists).
  - Real-time notifications via Feishu Webhook (card and text messages).
- Local Document Backup: Saves final refined documents locally for easy sync with tools like Syncthing.
- Status Dashboard: A clean, FastAPI-powered web UI to monitor the processing status of all repositories.
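The placeholder mechanism from Phase 1 and Phase 3 can be illustrated with a minimal sketch. It assumes the draft marks gaps with the literal string <!-- NEED_RAG --> and that some retrieval function returns text for a query. The `retrieve` callable and the `context_chars` parameter are hypothetical stand-ins for the project's vector search, not its actual API.

```python
from typing import Callable

PLACEHOLDER = "<!-- NEED_RAG -->"


def expand_placeholders(
    draft: str,
    retrieve: Callable[[str], str],
    context_chars: int = 200,
) -> str:
    """Replace every placeholder with text produced by `retrieve`.

    The query handed to `retrieve` is the draft text that immediately
    precedes the placeholder (at most `context_chars` characters).
    `retrieve` is a hypothetical stand-in for vector search plus an LLM call.
    """
    parts = draft.split(PLACEHOLDER)
    result = [parts[0]]
    # Each adjacent pair (before, after) surrounds exactly one placeholder.
    for before, after in zip(parts, parts[1:]):
        query = before[-context_chars:].strip()
        result.append(retrieve(query))
        result.append(after)
    return "".join(result)
```

A draft without placeholders passes through unchanged, so the same function can run on every draft regardless of how many gaps Phase 1 marked.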
graph TD
A[GitHub Stars] -->|Monitor| B(DART Workflow)
B -->|Check Local/MCP| C{Data Exists?}
C -->|No| D[DeepWiki Indexer Trigger]
D -->|Wait 10m| B
C -->|Yes| E[RAG Refiner]
subgraph RAG Workflow
E --> E1[Draft Gen + RAG Tags]
E1 --> E2[LLM Document Selector]
E2 --> E3[Selective Embedding]
E3 --> E4[Vector Search & Expansion]
end
E4 --> F[Local Backup]
F --> G[Feishu Service]
G -->|Create/Update| H[Feishu Wiki]
G -->|Notify| I[Feishu Webhook]
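The control flow in the diagram can be sketched as a single function that processes one repository per pass. This is only an illustration under stated assumptions: the `Steps` container and all five callables are hypothetical names, and only the 10-minute retry delay is taken from the diagram.

```python
from dataclasses import dataclass
from typing import Callable, Optional

RETRY_DELAY_SECONDS = 10 * 60  # the "Wait 10m" edge in the diagram


@dataclass
class Steps:
    """Hypothetical hooks, one per box in the diagram."""
    fetch_docs: Callable[[str], Optional[str]]  # returns None if no data exists yet
    trigger_indexing: Callable[[str], None]
    refine: Callable[[str], str]
    backup: Callable[[str, str], None]
    publish: Callable[[str, str], None]


def process_repo(repo: str, steps: Steps) -> Optional[int]:
    """Run one pass for `repo`.

    Returns the number of seconds to wait before retrying when the data
    is not available yet, or None once the document has been published.
    """
    raw = steps.fetch_docs(repo)
    if raw is None:
        # Data does not exist: ask for indexing and come back later.
        steps.trigger_indexing(repo)
        return RETRY_DELAY_SECONDS
    final_doc = steps.refine(raw)
    steps.backup(repo, final_doc)   # local backup happens before publishing
    steps.publish(repo, final_doc)
    return None
```

Passing the steps in as callables keeps the flow testable without network access, since each hook can be replaced by a stub.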
- Discovery: GitHubMonitor polls the user's starred repositories.
- Acquisition: mcp_client attempts to fetch Markdown pages. If the repo isn't indexed, DeepWikiIndexer submits a direct HTTP POST (Server Action) to DeepWiki to trigger indexing.
- Refinement: RAGRefiner processes the "Overview" page. It identifies "blind spots" that need more information, selects relevant supplementary files from the repo, embeds them into a local vector store, and performs a final rewrite to produce an architecture-focused Chinese document.
- Backup: Saves the final refined document to the final_docs/ folder for local backup and sync.
- Publishing: FeishuService maps the Markdown to Feishu's Block API, updates the title and content, and sends notifications.
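The Discovery step amounts to comparing each poll result against the set of repositories already seen. The sketch below shows that idea only. The function names, the `fetch_starred` and `on_new` callables, and the `max_polls` parameter are assumptions made for illustration, not GitHubMonitor's actual interface. In practice `fetch_starred` would presumably call GitHub's endpoint for listing the authenticated user's starred repositories.

```python
import time
from typing import Callable, Iterable, Optional

POLL_INTERVAL_SECONDS = 60  # matches the 60-second interval stated above


def find_new_stars(seen: set[str], current: Iterable[str]) -> list[str]:
    """Return repos in `current` that are not in `seen`, in order.

    Newly found repos are added to `seen` so they are reported only once.
    """
    new_repos = []
    for full_name in current:
        if full_name not in seen:
            seen.add(full_name)
            new_repos.append(full_name)
    return new_repos


def poll_stars(
    fetch_starred: Callable[[], Iterable[str]],
    on_new: Callable[[str], None],
    seen: set[str],
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll repeatedly and hand each newly starred repo to `on_new`.

    `max_polls=None` loops forever; a finite value is useful for testing.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        for repo in find_new_stars(seen, fetch_starred()):
            on_new(repo)
        polls += 1
        sleep(POLL_INTERVAL_SECONDS)
```

Seeding `seen` with the repositories that are already starred at startup avoids reprocessing the whole history on the first poll.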
- Python 3.10+
- A server with at least 2 GB of RAM (optimized for low-resource environments)
- GitHub Personal Access Token
- A Feishu app with Wiki and Docx permissions
- An OpenAI-compatible API key (e.g. DeepSeek, Qwen, Zhipu, or OpenAI)
git clone https://github.com/YunfanGoForIt/DART.git
cd DART
pip install -r requirements.txt

Create a .env file in the project root:
GITHUB_TOKEN=your_github_token
FEISHU_APP_ID=your_app_id
FEISHU_APP_SECRET=your_app_secret
FEISHU_SPACE_ID=your_wiki_space_id
FEISHU_WEBHOOK_URL=your_webhook_url
OPENAI_API_KEY=your_llm_api_key
OPENAI_BASE_URL=https://api.your-provider.com/v1
OPENAI_MODEL=gpt-4o # or qwen-max, etc.
EMBEDDING_MODEL=text-embedding-v3
USER_EMAIL=your_email@example.com # Used for DeepWiki indexing requests

Using Screen (recommended for long-running use):
screen -S dart
python3 main.py
# Press Ctrl+A, then D to detach from the session

Using nohup:
nohup python3 main.py > dart.log 2>&1 &

The dashboard will be available at http://your-server-ip:8002.
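Before starting the process with either method, it can help to confirm that .env defines every variable from the configuration listing. The following is a minimal stdlib-only sketch. It assumes that all ten variables are required, which the project may not enforce, and `parse_env` and `missing_keys` are hypothetical helper names. The parser strips a trailing " # comment" like the one on the OPENAI_MODEL line, and it does not handle quoted values.

```python
REQUIRED_KEYS = (
    "GITHUB_TOKEN", "FEISHU_APP_ID", "FEISHU_APP_SECRET", "FEISHU_SPACE_ID",
    "FEISHU_WEBHOOK_URL", "OPENAI_API_KEY", "OPENAI_BASE_URL", "OPENAI_MODEL",
    "EMBEDDING_MODEL", "USER_EMAIL",
)  # assumption: every variable in the listing is mandatory


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and comment lines."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment such as "gpt-4o # or qwen-max, etc."
        value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

A typical check would read the file with `pathlib.Path(".env").read_text(encoding="utf-8")`, pass the text to `parse_env`, and abort startup if `missing_keys` returns anything.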
dart/
├── output/ # Raw documents from DeepWiki MCP
│ └── {repo_name}/ # Original markdown files
└── final_docs/ # Final refined documents ⭐
└── {repo_name}.md # Polished AI-generated Chinese wiki
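The layout above suggests that each refined document ends up at final_docs/{repo_name}.md. A minimal sketch of such a write follows. The `save_final_doc` helper is hypothetical, and replacing "/" in an owner/repo name with "_" is an assumption about naming, not confirmed project behavior.

```python
from pathlib import Path


def save_final_doc(base_dir: Path, repo_name: str, markdown: str) -> Path:
    """Write `markdown` to <base_dir>/final_docs/<repo_name>.md and return the path."""
    target_dir = base_dir / "final_docs"
    target_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: flatten "owner/repo" so the name is a single file, not a nested path.
    safe_name = repo_name.replace("/", "_")
    target = target_dir / f"{safe_name}.md"
    # UTF-8 is set explicitly because the refined documents are in Chinese.
    target.write_text(markdown, encoding="utf-8")
    return target
```

Writing one flat file per repository keeps the folder easy to sync with a tool such as Syncthing.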
MIT License. Feel free to use and contribute!
