Add model calibration #235
Conversation
ptuan5 left a comment:
The changes in the scripts overall look good to me. I can't comment on the changes to the UI part because it's beyond my expertise.
```python
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    self._classifier = self._fit_xgboost(features, labels, random_seed=random_seed)
    self._classifier.fit(self._clean_features_for_training(features), labels)
```
You use clean_features here for the calibrated branch, but not for the original branch?
```
Args:
    truth: Binary ground truth labels (0 or 1).
    probabilities: Predicted probabilities (2D array where second column is positive class).
```
For brier_score you accept both 1D and 2D arrays, but here you only accept 2D arrays. Is there a reason for this? Should we enforce 2D everywhere?
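One hypothetical way to accept both shapes consistently (a sketch, not code from this PR; the helper name is made up):

```python
import numpy as np


def _positive_class_probabilities(probabilities) -> np.ndarray:
    """Return a 1D array of positive-class probabilities.

    Accepts either a 1D array of positive-class probabilities or a 2D
    array whose second column is the positive class. (Hypothetical helper.)
    """
    probabilities = np.asarray(probabilities)
    if probabilities.ndim == 2:
        return probabilities[:, 1]
    return probabilities
```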
```python
if pos == 0 or neg == 0:
    warnings.warn(
        "plot_reliability: need both positive and negative labels.", stacklevel=2
    )
```
If we need both positive and negative labels for this, should we either raise an error or return early?
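For example, the return-early option could look like this (a sketch of the review suggestion, not the PR's code; the function signature is assumed):

```python
import warnings

import numpy as np


def plot_reliability(truth, probabilities) -> None:
    """Sketch: skip plotting when only one class is present."""
    truth = np.asarray(truth)
    pos = int((truth == 1).sum())
    neg = int((truth == 0).sum())
    if pos == 0 or neg == 0:
        warnings.warn(
            "plot_reliability: need both positive and negative labels.", stacklevel=2
        )
        return  # return early instead of continuing with a degenerate curve
    ...  # plotting logic would follow here
```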
Overview: Probability Calibration via CalibratedClassifierCV
What it is
CalibratedClassifierCV wraps any classifier (RF/GBT/XGBoost, etc.) and learns a mapping from raw model scores to well-calibrated probabilities. It does this with an internal cross-validation loop: in each fold, it fits the base model on the train-fold data and learns a calibration function on that fold's held-out data. Two calibration methods are supported: sigmoid (Platt scaling) and isotonic regression.
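For reference, a minimal standalone sketch of the wrapper (plain scikit-learn on synthetic data, not JABS code; RandomForestClassifier and the toy dataset are placeholders):

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for JABS features/labels.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base = RandomForestClassifier(n_estimators=200, random_state=0)

# Internal 5-fold CV: each fold fits the base model on the fold's training
# split and learns an isotonic calibration map on the held-out split.
calibrated = CalibratedClassifierCV(base, method="isotonic", cv=5)
calibrated.fit(X_train, y_train)

# predict_proba now returns calibrated probabilities; column 1 is the positive class.
p_positive = calibrated.predict_proba(X_test)[:, 1]
```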
Why calibration matters
Tree models often output overconfident probabilities (many values near 0.0 or 1.0). Calibration corrects this so that predicted probabilities reflect observed frequencies (e.g., among samples with p ≈ 0.7, about 70% are positive), which makes the probabilities more useful downstream.
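A quick way to see what "calibrated" means in practice (a sketch using scikit-learn's brier_score_loss and calibration_curve on synthetic probabilities; this is not the brier_score/plot_reliability code added in this PR):

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

# Synthetic probabilities that are perfectly calibrated by construction:
# each label is drawn positive with exactly its predicted probability.
rng = np.random.default_rng(0)
p = rng.uniform(size=5000)
y = (rng.uniform(size=5000) < p).astype(int)

# Brier score: mean squared error between probability and outcome (lower is better).
print("Brier score:", brier_score_loss(y, p))

# Reliability curve: fraction of observed positives vs. mean predicted probability
# per bin; well-calibrated probabilities track the diagonal, e.g. among samples
# with p ~= 0.7, about 70% are positive.
frac_positive, mean_predicted = calibration_curve(y, p, n_bins=10)
print(np.round(mean_predicted, 2))
print(np.round(frac_positive, 2))
```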
How we use it in JABS
Practical guidance
Trade-offs
Net impact for JABS
See Also
User Guide
I'm going to hold off on updating the user guide until I'm sure we're going to merge these changes.
Settings Dialog Screenshots