Runchuan-BU · erik-mingyang · Apr 4, 2026 · Apr 4, 2026
diff --git a/container/skills/README.md b/container/skills/README.md
@@ -13,6 +13,15 @@ Anything the WhatsApp / WeCom / Discord / local-web agent should use belongs **h
 | `sds-gel-review/` | SDS gel image review |
 | `query-*` | Database / API usage (Ensembl, UniProt, KEGG, …) |
 | `blast-search/`, `pubmed-search/`, `sequence-analysis/` | Literature & sequence workflows |
+| `bio-manuscript-*` | Community-contributed manuscript planning pipeline for idea screening, figure planning, manuscript drafting, refinement, and implementation blueprints |
+| `bio-manuscript-common/` | Shared templates and helper scripts used by the manuscript pipeline skill family |
+
+## Community Skills
+
+Runtime skills contributed by the BioClaw community may be integrated here once they prove useful in real workflows. The manuscript pipeline skill family currently staged here is one such community-contributed workflow.
+
+Contributor reference:
+- Yuhong Dong, Westlake University PhD candidate, BioClaw community contributor
 
 ## Developer-only skills
 

diff --git a/container/skills/bio-analysis-system/SKILL.md b/container/skills/bio-analysis-system/SKILL.md
@@ -0,0 +1,161 @@
+# bio-analysis-system
+
+**Step 5: Analysis system design**
+
+Build the analysis layer for the manuscript by identifying which analyses, tools, and biological validations should support each figure and each task.
+
+## Purpose
+
+1. Extract analysis patterns from related work
+2. Borrow useful analyses from adjacent domains when needed
+3. Map analyses to BioClaw-compatible tools or fallback software
+4. Explain why each analysis is included and what biological claim it supports
+5. Connect analyses to figure panels
+
+## Input Format
+
+```text
+topic: [research topic]
+paper_count: [number of related papers]
+task_system: [task system]
+metric_system: [metric system]
+dataset_catalog: [dataset catalog]
+```
+
+## Workflow
+
+### Step 5.1: Extract analyses from existing work
+
+If enough related papers exist, inspect their figures and extract:
+
+- panel type
+- analysis method
+- software / package
+- important parameters
+- the scientific or biological conclusion the panel supports
+
+### Step 5.2: Borrow from adjacent fields
+
+If related work in the field is still sparse, adapt common analyses from nearby areas such as:
+
+- clustering
+- marker visualization
+- latent embedding visualization
+- pathway enrichment
+- cell-cell communication
+- spatial statistics
+- GRN analysis
+
+### Step 5.3: Categorize analyses
+
+Use three broad groups:
+
+- **Quantitative analyses**
+  - clustering
+  - metric computation
+  - statistical tests
+  - baseline comparisons
+- **Qualitative analyses**
+  - spatial visualization
+  - feature / violin plots
+  - UMAP / t-SNE
+  - before / after alignment comparisons
+  - heatmaps
+- **Biological analyses**
+  - cell annotation
+  - marker genes
+  - pathway enrichment
+  - GRN
+  - ligand-receptor communication
+  - spatial statistics
+  - trajectory analysis
+
+### Step 5.4: Map to BioClaw or fallback tools
+
+Whenever possible, map analysis needs to BioClaw-compatible skills or established tools.
+
+Examples:
+
+- clustering -> Scanpy / Leiden
+- annotation -> CellTypist / SingleR
+- marker plots -> Scanpy
+- enrichment -> gseapy
+- spatial statistics -> squidpy
+- GRN -> pySCENIC
+- communication -> CellChat-like workflow
+
+### Step 5.5: Standardize analysis descriptions
+
+For each analysis, define:
+
+- category
+- purpose
+- biological claim supported
+- preferred tool
+- fallback tool
+- key function
+- recommended parameters
+- inputs / outputs
+- mapped task
+- mapped figure / panel
+
+## Output Format
+
+```markdown
+# Analysis System
+
+## Analysis Sources
+- Extracted from related papers:
+- Borrowed from adjacent domains:
+
+## Quantitative Analyses
+
+### Clustering
+- Category:
+- Purpose:
+- Biological claim supported:
+- Preferred tool:
+- Fallback tool:
+- Key function:
+- Recommended parameters:
+- Inputs / outputs:
+- Relevant tasks:
+- Figure mapping:
+
+### Metric computation
+- Category:
+- Purpose:
+- Preferred tools:
+- Relevant tasks:
+- Figure mapping:
+
+## Qualitative Analyses
+- spatial plot
+- marker / feature plot
+- latent embedding plot
+- heatmap
+- before / after alignment visualization
+
+## Biological Analyses
+- annotation
+- marker recovery
+- pathway enrichment
+- GRN
+- communication
+- trajectory
+
+## Next Step
+- Use the analysis system to design figures in Step 6
+```
+
+## Usage
+
+```bash
+/bio-analysis-system "spatial multi-omics integration | paper_count: 5 | task_system: [...] | metric_system: [...] | dataset_catalog: [...]"
+```
+
+## Notes
+
+1. Prefer analyses that directly support paper claims.
+2. Make the biological readouts visible early; they should not appear only at the very end.
+3. Map each major analysis to a concrete figure panel.
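Steps 5.4 and 5.5 of `bio-analysis-system/SKILL.md` above describe a tool mapping and a per-analysis record. A minimal Python sketch of that structure follows. The names `AnalysisRecord`, `PREFERRED_TOOLS`, `make_record`, and `render` are hypothetical and are not part of the skill; the tool names are copied from the Step 5.4 examples, and fallback tools are left empty because the skill does not specify them.

```python
from dataclasses import dataclass, field
from typing import List

# Preferred tools copied from the Step 5.4 examples in SKILL.md.
# The dictionary itself is an illustrative assumption, not a BioClaw API.
PREFERRED_TOOLS = {
    "clustering": "Scanpy / Leiden",
    "annotation": "CellTypist / SingleR",
    "marker plots": "Scanpy",
    "enrichment": "gseapy",
    "spatial statistics": "squidpy",
    "GRN": "pySCENIC",
    "communication": "CellChat-like workflow",
}


@dataclass
class AnalysisRecord:
    """One analysis entry with the fields listed in Step 5.5."""
    name: str
    category: str  # "Quantitative", "Qualitative", or "Biological"
    purpose: str = ""
    biological_claim: str = ""
    preferred_tool: str = ""
    fallback_tool: str = ""
    key_function: str = ""
    recommended_parameters: str = ""
    inputs_outputs: str = ""
    relevant_tasks: List[str] = field(default_factory=list)
    figure_mapping: str = ""


def make_record(name: str, category: str, **fields) -> AnalysisRecord:
    """Create a record, filling preferred_tool from the Step 5.4 map when known."""
    fields.setdefault("preferred_tool", PREFERRED_TOOLS.get(name, ""))
    return AnalysisRecord(name=name, category=category, **fields)


def render(record: AnalysisRecord) -> str:
    """Render a record in the bullet layout used by the Output Format section."""
    title = record.name[:1].upper() + record.name[1:]
    lines = [
        f"### {title}",
        f"- Category: {record.category}",
        f"- Purpose: {record.purpose}",
        f"- Biological claim supported: {record.biological_claim}",
        f"- Preferred tool: {record.preferred_tool}",
        f"- Fallback tool: {record.fallback_tool}",
        f"- Key function: {record.key_function}",
        f"- Recommended parameters: {record.recommended_parameters}",
        f"- Inputs / outputs: {record.inputs_outputs}",
        f"- Relevant tasks: {', '.join(record.relevant_tasks)}",
        f"- Figure mapping: {record.figure_mapping}",
    ]
    return "\n".join(lines)
```

Calling `render(make_record("clustering", "Quantitative"))` would produce a `### Clustering` entry with the preferred tool pre-filled and the remaining fields left blank for the agent or author to complete.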
diff --git a/container/skills/bio-dataset-search/SKILL.md b/container/skills/bio-dataset-search/SKILL.md
@@ -0,0 +1,135 @@
+# bio-dataset-search
+
+**Step 3: Dataset search and task matching**
+
+Find suitable datasets for each task and map datasets to the task system defined earlier in the manuscript pipeline.
+
+## Purpose
+
+1. Extract datasets from related papers when possible
+2. Search public repositories directly when needed
+3. Normalize dataset metadata into a common structure
+4. Match datasets to tasks in a defensible way
+
+## Input Format
+
+```text
+topic: [research topic]
+task_system: [task system from Step 2]
+paper_count: [number of related papers]
+existing_papers: [optional list of related papers]
+```
+
+## Workflow
+
+### Step 3.1: Extract datasets from existing work
+
+If `paper_count >= 5`, start from the strongest existing papers.
+
+Read Methods / Data Availability sections and extract:
+
+- dataset name
+- data source
+- platform
+- modality
+- sample scale
+- download path
+- annotation availability
+
+### Step 3.2: Search datasets directly
+
+If there is not enough prior work, search repositories such as:
+
+- GEO
+- ArrayExpress
+- project-specific public portals
+
+Use keyword sets built from:
+
+- topic
+- modality
+- tissue / disease
+- benchmark intent
+
+### Step 3.3: Normalize dataset metadata
+
+For each dataset, record:
+
+- source
+- platform
+- species
+- tissue / disease
+- sample size
+- feature count
+- modalities
+- annotation quality
+- histology / region metadata
+- format
+- preprocessing needs
+- recommended task fit
+
+### Step 3.4: Match datasets to tasks
+
+A good match should satisfy:
+
+1. Every major task has at least one viable dataset
+2. Dataset structure matches the task's technical assumptions
+3. Download remains feasible
+4. Metadata quality is sufficient for evaluation
+5. Each important task ideally has at least one backup dataset
+
+## Output Format
+
+```markdown
+# Dataset Catalog
+
+## Data Sources
+- Extracted from related papers:
+- Direct repository search:
+- Borrowed from adjacent domains:
+
+## Dataset Entries
+
+### Dataset 1: [name]
+- Source:
+- Platform:
+- Species:
+- Tissue / disease:
+- Modalities:
+- Sample scale:
+- Annotation quality:
+- Download URL:
+- Format:
+- Recommended tasks:
+- Why it fits:
+
+## Dataset-Task Mapping
+| Task | Recommended dataset | Why it fits | Notes |
+|------|---------------------|-------------|-------|
+| ... | ... | ... | ... |
+
+## Acquisition Notes
+- GEO download hints
+- Public portal download hints
+
+## Preprocessing Recommendations
+| Dataset | Preprocessing needs | Suggested skill / tool |
+|---------|---------------------|------------------------|
+| ... | ... | ... |
+
+## Next Step
+- Build the metric system in Step 4
+```
+
+## Usage
+
+```bash
+/bio-dataset-search "spatial multi-omics integration | paper_count: 5 | task_system: [task system from Step 2]"
+```
+
+## Notes
+
+1. Prefer datasets already used in related work when possible.
+2. Verify links before committing them to the benchmark plan.
+3. Capture QC and annotation metadata whenever available.
+4. Match datasets to tasks based on actual experimental needs, not just popularity.
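The matching criteria in Step 3.4 of `bio-dataset-search/SKILL.md` above (every major task covered, a backup dataset preferred) can be checked mechanically once the catalog exists. The sketch below is one possible way to do that. `DatasetEntry`, `choose_strategy`, `map_tasks`, and `coverage_report` are hypothetical names, and treating `paper_count < 5` as "not enough prior work" is an assumption inferred from the `paper_count >= 5` condition in Step 3.1.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetEntry:
    """A reduced dataset record; field names follow the Dataset Entries template."""
    name: str
    source: str = ""
    platform: str = ""
    download_url: str = ""
    recommended_tasks: List[str] = field(default_factory=list)


def choose_strategy(paper_count: int) -> str:
    """Pick the starting point: Step 3.1 if paper_count >= 5, otherwise Step 3.2.

    The 'otherwise' branch is an assumption; the skill only says
    'if there is not enough prior work'.
    """
    return "extract_from_papers" if paper_count >= 5 else "search_repositories"


def map_tasks(tasks: List[str], datasets: List[DatasetEntry]) -> Dict[str, List[str]]:
    """Build the task -> dataset-name mapping from each entry's recommended tasks."""
    mapping: Dict[str, List[str]] = {task: [] for task in tasks}
    for dataset in datasets:
        for task in dataset.recommended_tasks:
            if task in mapping:  # ignore tasks outside the task system
                mapping[task].append(dataset.name)
    return mapping


def coverage_report(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Flag tasks with no dataset (criterion 1) or no backup (criterion 5)."""
    return {
        "uncovered": [task for task, names in mapping.items() if not names],
        "no_backup": [task for task, names in mapping.items() if len(names) == 1],
    }
```

A non-empty `uncovered` list means the catalog fails criterion 1 and more searching is needed; `no_backup` is only a warning, since the skill states the backup dataset as a preference. Criteria 2 to 4 (technical fit, download feasibility, metadata quality) still need manual review.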