docs: Clarify validator verification #422
✅ Deploy Preview for genlayer-docs ready!
📝 Walkthrough
This PR refines the equivalence principle documentation to clarify consensus validation requirements. It emphasizes that validators must independently verify leader output using non-leader evidence and source-based criteria, clarifies Pattern 4 as "Source-Grounded Non-Comparative Validation," adds explicit security warnings against schema-only checks, and provides concrete examples of both secure and insecure validator patterns with updated API guidance.
Changes: Validator Independence and Consensus Validation Clarity
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pages/developers/intelligent-contracts/equivalence-principle.mdx`:
- Line 556: The sentence currently favors "comparative validation" too narrowly;
update the phrasing to prefer "patterns 1-3" instead and broaden the guidance to
note that for tasks like classification, scoring, extraction, authenticity,
safety, ranking, or settlement decisions the default is often an independent
rerun with deterministic field/tolerance comparison (pattern 1–2) or other
pattern 3 approaches rather than only LLM-comparative validation; specifically
replace the clause mentioning "comparative validation" with wording that
recommends "patterns 1–3" as the preferred default and add a brief note that
independent reruns and deterministic checks are often the safer fit when they
can verify decisions from source data.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 700332d9-51d0-4844-9644-868bea2cfa04
📒 Files selected for processing (1)
pages/developers/intelligent-contracts/equivalence-principle.mdx
 - **`criteria`** — rules the validator's LLM uses to judge the leader's output against the input data

-**Use when:** the task is subjective (NLP, classification, extraction) and you want validators to judge output quality rather than reproduce it.
+**Use when:** the output is open-ended and validity can be judged against the input/source data without producing a second candidate output. Summaries are the clearest example: many different summaries can be valid, but the validator can still check faithfulness, coverage, hallucinations, and constraints. For classification, scoring, extraction, authenticity, safety, ranking, or settlement decisions, prefer comparative validation unless you can clearly explain how the validator independently verifies the decision from source data.
Prefer “patterns 1-3” here, not specifically “comparative validation.”
This sentence is narrower than the rest of the page. For classification, scoring, extraction, and similar decision tasks, the best default is often an independent rerun plus deterministic field/tolerance comparison, not necessarily LLM-comparative validation. As written, this can steer readers away from patterns 1-2 even when they are the safer fit.
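To make "independent rerun plus deterministic field/tolerance comparison" concrete, here is a minimal plain-Python sketch. It does not use the GenLayer SDK: `Decision`, `derive_decision`, and `validator_accepts` are hypothetical names, and the keyword-counting classifier is only a stand-in for whatever the contract actually computes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    label: str    # categorical field, compared exactly
    score: float  # numeric field, compared within a tolerance


def derive_decision(source_text: str) -> Decision:
    """Hypothetical stand-in for the task both leader and validator run."""
    text = source_text.lower()
    hits = sum(text.count(word) for word in ("refund", "broken", "late"))
    score = min(1.0, hits / 5)
    return Decision(label="complaint" if hits >= 2 else "other", score=score)


def validator_accepts(leader: Decision, source_text: str,
                      tolerance: float = 0.05) -> bool:
    """Rerun the task from the source, then compare field by field."""
    own = derive_decision(source_text)  # independent of the leader's output
    labels_match = own.label == leader.label
    scores_close = abs(own.score - leader.score) <= tolerance
    return labels_match and scores_close


source = "The item arrived late and broken, I want a refund."
print(validator_accepts(Decision("complaint", 0.62), source))
print(validator_accepts(Decision("other", 0.60), source))
```

The point of the design is that the validator never trusts the leader's fields: it derives its own values from the source and only then compares, so a leader cannot pass by returning a well-formed but wrong decision.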
✏️ Suggested wording
-**Use when:** the output is open-ended and validity can be judged against the input/source data without producing a second candidate output. Summaries are the clearest example: many different summaries can be valid, but the validator can still check faithfulness, coverage, hallucinations, and constraints. For classification, scoring, extraction, authenticity, safety, ranking, or settlement decisions, prefer comparative validation unless you can clearly explain how the validator independently verifies the decision from source data.
+**Use when:** the output is open-ended and validity can be judged against the input/source data without producing a second candidate output. Summaries are the clearest example: many different summaries can be valid, but the validator can still check faithfulness, coverage, hallucinations, and constraints. For classification, scoring, extraction, authenticity, safety, ranking, or settlement decisions, prefer patterns 1-3 over non-comparative validation. In most cases, validators should independently reproduce or derive the decision, then compare the relevant fields, score buckets, or tolerated ranges.
Description
Clarifies the Equivalence Principle docs so validator functions must verify leader outputs using independent evidence rather than leader-output-only checks.
The update aligns the public docs with the improved skill guidance by reframing non-comparative validation as source-grounded verification, warning against schema-only validators, and steering decision/classification/scoring tasks toward comparative validation.
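As a rough illustration of why schema-only validators are flagged, the sketch below contrasts a check that looks only at the leader's output with one that also consults the source. It is plain Python under stated assumptions, not the GenLayer API: the function names and the `summary`/`quotes` output shape are hypothetical, and the substring test stands in for whatever source-based criteria a real validator would apply.

```python
def schema_only_check(leader_output: dict) -> bool:
    """Insecure pattern: inspects only the leader's output, never the source."""
    return (isinstance(leader_output.get("summary"), str)
            and isinstance(leader_output.get("quotes"), list))


def source_grounded_check(leader_output: dict, source_text: str) -> bool:
    """Safer pattern: every supporting quote must actually appear in the source."""
    if not schema_only_check(leader_output):
        return False
    quotes = leader_output["quotes"]
    if not quotes:
        return False  # no evidence offered, nothing to verify against
    return all(isinstance(q, str) and q != "" and q in source_text
               for q in quotes)


source = "Validators rerun the task. Results are compared field by field."
honest = {"summary": "Validators rerun and compare.",
          "quotes": ["Validators rerun the task."]}
forged = {"summary": "The leader decides alone.",
          "quotes": ["The leader is always right."]}

print(schema_only_check(forged))              # well-formed, so it passes
print(source_grounded_check(forged, source))  # quote is not in the source
print(source_grounded_check(honest, source))
```

The forged output has the right shape, so the schema-only check accepts it; only the check that reads the source can tell that its evidence is fabricated.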
Validation
git diff --check
npx next build