Bug: validate_all checks evolved_body instead of evolved_full, causing all evolved skills to fail skill_structure #34

@wl-lgtm

Bug Summary

In evolution/skills/evolve_skill.py, the evolved skill is reassembled with frontmatter on line 185, but the validation on line 189 checks the body-only text (evolved_body) instead of the reassembled text (evolved_full). Because evolved_body has no YAML frontmatter, the skill_structure constraint falsely fails every evolved skill.

Reproduction

Run any skill evolution (e.g. plan, github-code-review, systematic-debugging). The optimizer completes successfully, but validation fails with:

✗ skill_structure: Skill missing: YAML frontmatter (---), name field, description field

Meanwhile, the saved evolved_FAILED.md contains valid frontmatter:

---
name: plan
description: Plan mode for Hermes — inspect context, write a markdown plan...
---
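To confirm that a saved file really does carry the fields the error message lists, a quick standalone check along these lines can help. This is only a rough sketch: frontmatter_fields is a hypothetical helper (not the repository's validator), it handles only flat `key: value` pairs rather than full YAML, and the sample text uses a placeholder description instead of the real one.

```python
import re

# Hypothetical helper, NOT the project's skill_structure check.
# It only understands flat "key: value" lines, not full YAML.
def frontmatter_fields(text: str) -> dict[str, str]:
    """Return key/value pairs from a leading --- block, or {} if there is none."""
    match = re.match(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", text, re.DOTALL)
    if match is None:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields

# Sample stand-in for the saved file; the description is a placeholder.
saved = """---
name: plan
description: placeholder description
---

(body omitted)
"""

fields = frontmatter_fields(saved)
missing = [key for key in ("name", "description") if key not in fields]
print(missing)
```

An empty list here means both fields are present, which is consistent with the saved file being structurally valid even though validation reported otherwise.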

Root Cause

Line 185 correctly assembles the full skill:

evolved_full = reassemble_skill(skill["frontmatter"], evolved_body)

Line 189 incorrectly validates the body only:

evolved_constraints = validator.validate_all(evolved_body, "skill", baseline_text=skill["body"])

Fix

Change line 189 from evolved_body to evolved_full:

evolved_constraints = validator.validate_all(evolved_full, "skill", baseline_text=skill["body"])
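The difference between the two arguments can be illustrated in isolation. The sketch below uses hypothetical stand-ins: this reassemble_skill assumes the stored frontmatter is raw YAML without the --- delimiters, and skill_structure_missing is a simplified imitation of the constraint, not the project's actual implementation.

```python
# Hypothetical stand-ins; the real reassemble_skill and validator may differ.
def reassemble_skill(frontmatter: str, body: str) -> str:
    # Assumption: frontmatter is raw YAML without the --- delimiters.
    return f"---\n{frontmatter.strip()}\n---\n\n{body.lstrip()}"

def skill_structure_missing(text: str) -> list[str]:
    """Return the parts a simplified structure check considers missing."""
    missing = []
    parts = text.split("---", 2) if text.startswith("---") else []
    header = parts[1] if len(parts) == 3 else ""
    if not header:
        missing.append("YAML frontmatter (---)")
    for field in ("name", "description"):
        if not any(line.startswith(f"{field}:") for line in header.splitlines()):
            missing.append(f"{field} field")
    return missing

# Sample data for illustration only.
skill = {
    "frontmatter": "name: plan\ndescription: sample description",
    "body": "# Plan\n\nOriginal body.",
}
evolved_body = "# Plan\n\nEvolved body."
evolved_full = reassemble_skill(skill["frontmatter"], evolved_body)

print(skill_structure_missing(evolved_body))  # body only: every part is reported missing
print(skill_structure_missing(evolved_full))  # reassembled text: nothing is reported missing
```

Under these assumptions, the body-only text fails regardless of how good the evolved content is, while the reassembled text passes, which matches the behavior described in this report.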

Evidence

I hit this on 5 consecutive skill evolutions. All had valid frontmatter, all scored well (roughly 58-68%), and all were falsely rejected:

| Skill | Score | Rejected By |
| --- | --- | --- |
| systematic-debugging | 66.47% | skill_structure |
| github-code-review | 58.45% | skill_structure |
| plan | 63.66% | skill_structure |
| llm-wiki | 57.87% | skill_structure + size_limit* |
| test-driven-development | 67.58% | skill_structure |

*llm-wiki also genuinely exceeded the size limit, but skill_structure was the first blocker.

Environment

  • Commit: c763e85 (latest main as of 2026-04-22)
  • Python 3.11
  • DSPy 2.x
