[Feature Request / Enhancement] Reflection items lack resolution/invalidation mechanism after underlying problem is solved #395

**Classification: Feature Request (Enhancement), not a bug.**
The reflection pipeline works as designed. This issue proposes adding a missing capability: the ability to mark reflection items as resolved once their underlying problem is solved, preventing stale items from continuing to be injected into new sessions.

---

## Summary

Reflection items (both `invariant` and `derived` kinds) stored in LanceDB have no mechanism to be marked as resolved or superseded. Once a problem is solved in a later session, the reflection system keeps injecting the now-stale `derived` lessons into every new session's context until each item ages past `maxAgeDays`. As a result, the agent repeatedly suggests "next steps" for problems that have already been resolved.

## Impact

- **User experience**: The plugin keeps injecting stale `derived-focus` and `inherited-rules` entries about problems that have already been solved, which misleads the agent and annoys the user.
- **Context pollution**: 6+ stale derived items occupy precious injection budget (topK=6 default) and block relevant items from being injected.
- **Trust erosion**: When the agent repeatedly advises to "run a contrastive retrieval test" after the user has already confirmed rerank is working, it signals to the user that the memory system doesn't understand state changes.

## Current Behavior

The reflection pipeline is **unidirectional**: `extract → store → decay → inject`. There is no `resolve → invalidate → suppress` path.

### Detailed pipeline walkthrough

1. **Storage** (`reflection-item-store.ts`): Reflection items are written to LanceDB with metadata type `memory-reflection-item`. Fields include `itemKind`, `decayMidpointDays`, `decayK`, `baseWeight`, `quality`, `storedAt`, `sessionId`. **No `status` field exists.**

2. **Scoring** (`reflection-ranking.ts`): Items are scored using a logistic decay function:
   ```
   score = logistic(ageDays, midpointDays, k) * baseWeight * quality
   ```
   For `derived` items: midpoint=7 days, k=0.65, quality=0.95. A `derived` item therefore keeps more than 50% of its undecayed score until day 7, whether or not the problem it describes still exists.

3. **Filtering** (`reflection-recall.ts`):
   - `filterByMaxAge`: Removes items older than `maxAgeDays` (default 45 for invariant, configurable for derived). Items are kept alive for up to 45 days by default.
   - `keepMostRecentPerNormalizedKey`: Caps items per strictKey within the age window.
   - **No resolution check. No cross-reference with current context.**

4. **Aggregation** (`reflection-aggregation.ts`): Groups items by strictKey, computes support/freshness/stability/quality scores, picks representative. **No check for whether the underlying problem has been resolved.**

5. **Selection** (`reflection-selection.ts`): Diversity-aware selection with soft-key deduplication. **No resolution filtering.**

6. **Injection** (`index.ts`, `before_prompt_build`): Two independent paths inject reflection content:
   - `derived-focus` block: from `buildReflectionDerivedFocusBlock()`
   - `inherited-rules` block: from `orchestrateDynamicRecall()`
   
   **Neither path checks if items are stale due to problem resolution.**
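To make the scoring step concrete, here is a minimal sketch of the decay formula from step 2. The exact shape of `logistic()` in `reflection-ranking.ts` is not quoted in this issue, so the standard decreasing logistic below is an assumption, as are `baseWeight = 1` and the names `DecayParams`, `DERIVED`, and `reflectionScore`. The point it illustrates is that age is the only input: nothing about problem resolution can lower the score.

```typescript
// Assumed form: a decreasing logistic that crosses 0.5 at ageDays === midpointDays.
// The real logistic() in reflection-ranking.ts may be defined differently.
function logistic(ageDays: number, midpointDays: number, k: number): number {
  return 1 / (1 + Math.exp(k * (ageDays - midpointDays)));
}

// Hypothetical parameter bundle for illustration.
interface DecayParams {
  midpointDays: number;
  k: number;
  baseWeight: number;
  quality: number;
}

// midpoint, k, and quality are the `derived` values quoted above;
// baseWeight = 1 is an assumption.
const DERIVED: DecayParams = { midpointDays: 7, k: 0.65, baseWeight: 1, quality: 0.95 };

// score = logistic(ageDays, midpointDays, k) * baseWeight * quality
function reflectionScore(ageDays: number, p: DecayParams): number {
  return logistic(ageDays, p.midpointDays, p.k) * p.baseWeight * p.quality;
}

// Print the score at a few ages to see how slowly a derived item fades.
for (const age of [0, 3, 7, 14]) {
  console.log(`day ${age}: ${reflectionScore(age, DERIVED).toFixed(3)}`);
}
```

Under these assumptions the score at day 7 is exactly half of `baseWeight * quality`, so a lesson about an already-fixed problem still competes strongly for the injection budget during its first week.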

### Existing safeguards (and why they don't help)

| Mechanism | What it does | Why it doesn't solve this |
|---|---|---|
| Logistic decay | Item scores decrease over time | `derived` midpoint is 7 days; items stay high-score for a week. `maxAgeDays` defaults to 45. |
| Repeated-injection guard (`recall-engine.ts`) | Prevents re-injecting same item within N turns | Only works **within the same session**, not across sessions. |
| `autoRecallExcludeReflection` (default: true) | Keeps reflection items out of auto-recall path | Confirms the two paths are independent; no cross-path suppression possible. |
| Same-key penalty in final selection (`final-topk-setwise-selection.ts`) | Penalizes duplicate key within same turn | Only applies to items within the same path, same turn. |

### Cross-module analysis

I checked every module in the pipeline for resolution mechanisms:

| Module | Has resolution/invalidation? |
|---|---|
| `reflection-item-store.ts` | ❌ No `status` field |
| `reflection-ranking.ts` | ❌ Only computes decay score |
| `reflection-recall.ts` | ❌ Only filters by age |
| `reflection-aggregation.ts` | ❌ Only groups by key |
| `reflection-selection.ts` | ❌ Only diversity selection |
| `recall-engine.ts` | ❌ Only per-session dedup |
| `adaptive-retrieval.ts` | ❌ Only skips greetings/commands |
| `noise-filter.ts` | ❌ Only filters refusals/meta-questions |
| `final-topk-setwise-selection.ts` | ❌ Only intra-path dedup |
| `index.ts` (`before_prompt_build`) | ❌ No cross-path suppression |

**No module in the entire pipeline provides a mechanism to invalidate or suppress resolved reflection items.**

## Reproduction Steps

1. Session A: Encounter a problem (e.g., rerank misconfiguration). Plugin reflection extracts 6 derived lessons about the problem.
2. Session B: Solve the problem. Confirm it's working.
3. Session C, D, E...: The 6 stale derived lessons are still injected as `<derived-focus>`. Agent is misled into suggesting "the next useful action is a contrastive retrieval test" even though rerank is already resolved.
4. Stale items persist for up to `maxAgeDays` (default 45 for invariant, configurable for derived).

## Proposed Solutions (in order of implementation effort)

### Option A (Minimal): `self_improvement_reflection_resolve` tool
- Add a new agent tool: `self_improvement_reflection_resolve(query | id)`
- Marks matching reflection items as `resolved` (adds a `resolvedAt` timestamp to metadata)
- `reflection-recall.ts` skips items where `resolvedAt` is set
- **Effort**: Low. Touches `tools.ts`, `reflection-item-store.ts`, `reflection-recall.ts`.
- **Tradeoff**: Requires the agent/user to know to call the tool. Not automatic.
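A rough sketch of what Option A could look like, using an in-memory array in place of LanceDB. Everything here is hypothetical: the `ReflectionItem` shape models only fields mentioned in this issue plus the proposed `resolvedAt`, the function names are illustrative, and the substring match is a stand-in for whatever query matching the real tool would use.

```typescript
// Hypothetical item shape; `resolvedAt` is the proposed new metadata field.
interface ReflectionItem {
  id: string;
  itemKind: "invariant" | "derived";
  text: string;
  storedAt: number;     // epoch milliseconds
  resolvedAt?: number;  // set once the underlying problem is solved
}

// Possible tool body: mark items matching an id or a text query as resolved.
// Returns how many items were newly marked.
function resolveReflectionItems(
  items: ReflectionItem[],
  match: { id?: string; query?: string },
  now: number = Date.now(),
): number {
  const query = match.query?.trim().toLowerCase();
  let resolved = 0;
  for (const item of items) {
    const byId = match.id !== undefined && item.id === match.id;
    // Substring matching is a placeholder for real semantic matching.
    const byQuery = !!query && item.text.toLowerCase().includes(query);
    if ((byId || byQuery) && item.resolvedAt === undefined) {
      item.resolvedAt = now;
      resolved++;
    }
  }
  return resolved;
}

// Possible recall-side filter: skip anything that carries a resolvedAt timestamp.
function filterUnresolved(items: ReflectionItem[]): ReflectionItem[] {
  return items.filter((item) => item.resolvedAt === undefined);
}

const items: ReflectionItem[] = [
  { id: "r1", itemKind: "derived", text: "Run a contrastive retrieval test for rerank", storedAt: 0 },
  { id: "r2", itemKind: "derived", text: "Prefer small commits", storedAt: 0 },
];
resolveReflectionItems(items, { query: "rerank" });
console.log(filterUnresolved(items).map((i) => i.id)); // only "r2" remains
```

Keeping the timestamp rather than deleting the row leaves an audit trail and would allow a later "unresolve" if the problem comes back.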

### Option B (Medium): Cross-pipeline suppression via memory signals
- When a new memory entry is stored (via `memory_store`) that semantically contradicts a reflection item, automatically discount that reflection item.
- Implementation: During reflection recall scoring, check if any stored memory (from the auto-recall pipeline) has high semantic similarity to a reflection item but with opposite intent (e.g., "rerank is working" vs "rerank needs contrastive test"). If so, apply an additional decay multiplier.
- **Effort**: Medium. Requires cross-referencing between the two pipelines.
- **Tradeoff**: Needs good classification of "contradictory" vs "supporting" signals.
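One way Option B's extra decay multiplier might be computed is sketched below. It assumes each stored memory carries an embedding vector and a `stance` label produced by some classifier; that label, the `MemorySignal` shape, the function names, and the threshold and penalty values are all hypothetical. Deciding "opposite intent" reliably is the hard part and is not solved here.

```typescript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Hypothetical shape: a memory from the auto-recall pipeline with a stance label.
interface MemorySignal {
  vector: number[];
  stance: "problem" | "resolved"; // assumed output of a separate classifier
}

// Returns a multiplier in (0, 1]; 1 means no suppression.
function contradictionMultiplier(
  reflectionVector: number[],
  memories: MemorySignal[],
  similarityThreshold = 0.8, // assumed tuning value
  penalty = 0.2,             // assumed tuning value
): number {
  for (const memory of memories) {
    const sameTopic = cosine(reflectionVector, memory.vector) >= similarityThreshold;
    // Same topic but the memory says the problem is resolved: discount the lesson.
    if (sameTopic && memory.stance === "resolved") {
      return penalty;
    }
  }
  return 1;
}

// Toy vectors: a stale rerank lesson and a later "rerank is working" memory.
const staleLesson = [0.9, 0.1, 0.0];
const memories: MemorySignal[] = [{ vector: [0.88, 0.12, 0.05], stance: "resolved" }];
const baseScore = 0.475; // illustrative decay score for a 7-day-old derived item
console.log((baseScore * contradictionMultiplier(staleLesson, memories)).toFixed(3));
```

A multiplicative penalty keeps the item recoverable: if the classifier is wrong, the item is only down-ranked, not removed.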

### Option C (Full): `superseded` status + lifecycle management
- Add `status` field to reflection items: `active | resolved | superseded`
- Add `self_improvement_reflection_supersede(strictKey, reason)` tool
- New reflection items with the same `strictKey` automatically mark older items as `superseded`
- Reflection recall only returns items with `status === 'active'`
- **Effort**: Medium-High. Touches store, scoring, injection, tools.
- **Tradeoff**: Most robust solution, but adds complexity to the metadata schema.
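The lifecycle in Option C could be sketched roughly as follows, again with an in-memory array standing in for the LanceDB table. The `LifecycleItem` shape, `statusReason` field, and function names are illustrative assumptions, not existing plugin APIs.

```typescript
type ReflectionStatus = "active" | "resolved" | "superseded";

// Hypothetical item shape carrying the proposed `status` field.
interface LifecycleItem {
  id: string;
  strictKey: string;
  storedAt: number; // epoch milliseconds
  status: ReflectionStatus;
  statusReason?: string;
}

// Store a new item and mark older active items with the same strictKey as superseded.
function storeWithSupersede(
  store: LifecycleItem[],
  incoming: Omit<LifecycleItem, "status">,
): void {
  for (const existing of store) {
    const sameKey = existing.strictKey === incoming.strictKey;
    const older = existing.storedAt <= incoming.storedAt;
    if (sameKey && older && existing.status === "active") {
      existing.status = "superseded";
      existing.statusReason = `superseded by ${incoming.id}`;
    }
  }
  store.push({ ...incoming, status: "active" });
}

// Recall-side filter: only active items are eligible for injection.
function recallActive(store: LifecycleItem[]): LifecycleItem[] {
  return store.filter((item) => item.status === "active");
}

const store: LifecycleItem[] = [];
storeWithSupersede(store, { id: "a", strictKey: "rerank-config", storedAt: 1 });
storeWithSupersede(store, { id: "b", strictKey: "rerank-config", storedAt: 2 });
storeWithSupersede(store, { id: "c", strictKey: "commit-style", storedAt: 3 });
console.log(recallActive(store).map((i) => i.id)); // "a" is superseded, so "b" and "c" remain
```

One design question this raises: aggregation currently uses multiple items per `strictKey` as a support signal, so automatic supersession would need to be reconciled with that scoring.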

## Environment

- memory-lancedb-pro version: 1.1.0-beta.6
- OpenClaw version: 2026.3.22+
- sessionStrategy: `memoryReflection`
- memoryReflection.injectMode: `inheritance+derived`
- memoryReflection.recall.mode: `dynamic`

## Source Files Referenced

- `src/reflection-item-store.ts` — item metadata and decay defaults
- `src/reflection-ranking.ts` — logistic decay scoring
- `src/reflection-recall.ts` — dynamic recall ranking
- `src/reflection-aggregation.ts` — group aggregation and scoring
- `src/reflection-selection.ts` — diversity-aware selection
- `src/recall-engine.ts` — repeated-injection guard and age filtering
- `src/adaptive-retrieval.ts` — query skip logic
- `src/noise-filter.ts` — content quality filtering
- `src/final-topk-setwise-selection.ts` — final top-k selection (shared by both paths)
- `index.ts` — `before_prompt_build` hook and injection orchestration
