Add Weighted Multi-Constraint Scoring Engine#53
Conversation
This commit introduces a new scoring subsystem to the C++ parallel evaluation framework. The engine calculates a Composite Stability Score (0-100) using configurable `WeightProfile` abstractions, and wraps outputs in a detailed `ScoringReport`. - Created `scoring/WeightedScoringEngine`, `scoring/WeightProfile`, and `scoring/ScoringPipeline`. - Updated `MultiBundleEvaluator` to pass results into the scoring engine via new `.score()` methods. - Added extensive documentation: `docs/scoring_engine.md` and `docs/weight_profiles.md`. - Bumped version from 2.4.1 to 3.0.0 and updated `CHANGELOG.md`. Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
There was a problem hiding this comment.
Pull request overview
Adds a new weighted scoring subsystem that consumes EvaluationReport outputs and produces a composite “stability” score via configurable weight profiles, alongside a major version bump to 3.0.0.
Changes:
- Introduced
WeightProfileabstractions (withDefaultResearchProfileandHighSafetyProfile) plus a newWeightedScoringEngine,ScoringPipeline, andScoringReport. - Integrated scoring into
MultiBundleEvaluatorviascore(...)andscore_with_profile(...). - Bumped project version to
3.0.0and added/updated scoring documentation and changelog entries.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| setup.py | Bumps package version to 3.0.0. |
| cura_frame/init.py | Bumps Python package __version__ to 3.0.0. |
| README.md | Updates citation version to 3.0.0. |
| CITATION.cff | Updates version fields to 3.0.0. |
| docs/CHANGELOG.md | Adds 3.0.0 changelog entry describing the scoring engine and profiles. |
| docs/scoring_engine.md | Documents the scoring engine architecture and usage examples. |
| docs/weight_profiles.md | Documents built-in weight profiles and extension guidance. |
| scoring/WeightProfile.hpp | Defines the weight profile interface and two built-in profiles. |
| scoring/WeightProfile.cpp | Placeholder translation unit for weight profiles. |
| scoring/ScoringReport.hpp | Defines the structured scoring output. |
| scoring/WeightedScoringEngine.hpp | Declares the core scoring engine API. |
| scoring/WeightedScoringEngine.cpp | Implements weighted aggregation, falsification impact, and narrative generation. |
| scoring/ScoringPipeline.hpp | Declares a pipeline wrapper around the scoring engine. |
| scoring/ScoringPipeline.cpp | Implements pipeline execution. |
| constraint_core/MultiBundleEvaluator.hpp | Adds scoring entry points that run the new pipeline on an EvaluationReport. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| WeightedScoringEngine::WeightedScoringEngine(std::shared_ptr<WeightProfile> profile) | ||
| : profile_(std::move(profile)) {} | ||
|
|
||
| ScoringReport WeightedScoringEngine::score(const EvaluationReport& eval_report) const { | ||
| ScoringReport report; | ||
| report.weight_profile_name = profile_->name(); | ||
|
|
There was a problem hiding this comment.
profile_ is dereferenced without validating it is non-null (e.g., profile_->name()), so passing a null WeightProfile will cause an immediate crash. Consider enforcing non-null in the constructor (throw/assert) and/or making the constructor take a WeightProfile& / std::shared_ptr<WeightProfile> that is checked before use.
| std::string WeightedScoringEngine::generate_narrative(const ScoringReport& report, const EvaluationReport& eval_report) const { | ||
| std::ostringstream oss; | ||
|
|
There was a problem hiding this comment.
eval_report is not used in generate_narrative, which will trigger -Wunused-parameter warnings in many build configurations. Either remove the parameter from the declaration/definition, or mark it unused (e.g., omit the name in the signature or use [[maybe_unused]]).
| // Default fallback weight if a bundle or signal is not explicitly defined | ||
| virtual double default_weight() const { return 1.0; } | ||
|
|
||
| // Per-bundle weights | ||
| virtual double bundle_weight(const std::string& bundle_name) const = 0; | ||
|
|
||
| // Per-signal weights (can override bundle weights for specific signals) | ||
| virtual double signal_weight(const std::string& bundle_name, const std::string& signal_name) const = 0; | ||
|
|
There was a problem hiding this comment.
The comment says default_weight() is a fallback when a bundle/signal isn't explicitly defined, but bundle_weight/signal_weight are pure virtual so the base class cannot actually provide that fallback behavior. Consider providing non-pure virtual defaults (e.g., bundle_weight returning default_weight() and signal_weight returning bundle_weight(bundle_name)), or update the docs/comments to clarify that derived profiles must implement their own fallback logic.
| double signal_weight(const std::string& bundle_name, const std::string& signal_name) const override { | ||
| return 1.0; | ||
| } |
There was a problem hiding this comment.
These overrides ignore their parameters, which can produce -Wunused-parameter warnings. Consider omitting unused parameter names in the signature or marking them [[maybe_unused]].
| double signal_weight(const std::string& bundle_name, const std::string& signal_name) const override { | ||
| if (signal_name.find("toxicity") != std::string::npos || signal_name.find("risk") != std::string::npos) { | ||
| return 1.5; | ||
| } | ||
| return 1.0; |
There was a problem hiding this comment.
bundle_name is unused here, which can produce -Wunused-parameter warnings. Consider omitting the parameter name in the signature or marking it [[maybe_unused]].
| // Pass evaluation results to the scoring engine using a specific profile | ||
| ScoringReport score_with_profile(const EvaluationReport& report, std::shared_ptr<WeightProfile> profile) const { | ||
| ScoringPipeline pipeline(profile); | ||
| return pipeline.execute(report); | ||
| } |
There was a problem hiding this comment.
score_with_profile accepts a std::shared_ptr<WeightProfile> that may be null; downstream code dereferences the profile without checks, which can crash. Consider validating profile (throw/assert) and/or falling back to DefaultResearchProfile when profile == nullptr.
Implemented the Weighted Multi-Constraint Scoring Engine to sit atop constraint bundles, aggregating domain-specific penalties, bonuses, and falsifications into a composite stability score without overriding original logic. Added WeightProfiles (DefaultResearchProfile, HighSafetyProfile) to contextualize candidate scrutiny. Includes relevant version bumps to v3.0.0 and extensive documentation additions.
PR created automatically by Jules for task 11354591928326284294 started by @dfeen87