Merged
38 changes: 38 additions & 0 deletions README.md
@@ -38,9 +38,12 @@
## Contents

- [What this is](#what-this-is)
- [Where this fits in DASH](#where-this-fits-in-dash)
- [📊 The headline result](#-the-headline-result)
- [🏗 Architecture](#-architecture)
- [⚡ Quick start](#-quick-start)
- [Status](#status)
- [Related repos](#related-repos)
- [📁 Directory layout](#-directory-layout)
- [🧩 Retrieval API](#-retrieval-api)
- [📚 Research notes](#-research-notes)
@@ -69,6 +72,22 @@ This repository exists for four jobs:

---

## Where this fits in DASH

Evensong is the DASH workbench layer. DashPersona proves persona intelligence;
Evensong shows what happens when that same operating discipline is applied to
agent runtimes: retrieval evidence, runnable harnesses, inspectable modules,
and claims that can be checked against committed artifacts.

It sits between public product surfaces and lower-level memory infrastructure.
Research Vault provides durable knowledge storage, Windburn names state-hygiene
objects, and Multica Ultimate Workbench coordinates agents and review gates.
Evensong is the repo where the agent workbench itself becomes readable.

<p align="right"><a href="#contents">↑ back to top</a></p>

---

## 📊 The headline result

**Four formal retrieval artifacts** are committed under [`benchmarks/runs/`](./benchmarks/runs): Wave 3+F/G cover the original 108-query cross-LLM design; Wave 3+I is the newer 24-query adversarial suite for dense stage-1 + RAR. The Wave 3+I claim is scoped to that hard suite and does **not** replace the broader 108-query F/G comparisons.
@@ -226,6 +245,25 @@ bun run scripts/benchmark-hybrid-scale.ts --runs=3 --with-body \

---

## Status

Evensong is active and public-facing, but its strongest claims are deliberately
bounded to the committed benchmark and retrieval artifacts in this repository.
Treat it as a research-grade workbench and evidence surface, not a universal
retrieval leaderboard or a polished hosted product.

## Related repos

- [dash-persona](https://github.com/Fearvox/dash-persona) — origin project and persona intelligence layer.
- [multica-ultimate-workbench](https://github.com/Fearvox/multica-ultimate-workbench) — multi-agent orchestration and review-gate operating memory.
- [project-windburn](https://github.com/Fearvox/project-windburn) — state hygiene and cognitive-cache direction.
- [dash-research-vault](https://github.com/Fearvox/dash-research-vault) — durable research memory and benchmark substrate.
- [dash-design-infra](https://github.com/Fearvox/dash-design-infra) — design constraints and public documentation surface.

<p align="right"><a href="#contents">↑ back to top</a></p>

---

## 📁 Directory layout

```