
test: Include all estimators (with coef_) in test_all_sklearn_estimators #1575

Merged: 11 commits merged into probabl-ai:main on Apr 28, 2025

Conversation

Contributor

@Schefflera-Arboricola Schefflera-Arboricola commented Apr 22, 2025

closes issue #1364

  • incorporated:

    • sklearn.linear_model.GammaRegressor()
    • sklearn.linear_model.PoissonRegressor()
    • sklearn.compose.TransformedTargetRegressor()
    • sklearn.cross_decomposition.CCA()
    • sklearn.cross_decomposition.PLSCanonical()
    • sklearn.linear_model.MultiTaskElasticNet()
    • sklearn.linear_model.MultiTaskElasticNetCV()
    • sklearn.linear_model.MultiTaskLasso()
    • sklearn.linear_model.MultiTaskLassoCV()
    • sklearn.svm.OneClassSVM()
    • sklearn.linear_model.SGDOneClassSVM()
  • other changes:

    • handled estimators that do not support coef_ and/or intercept_ when creating coefficients() in EstimatorReport
    • removed sklearn.linear_model.Lasso() --> duplicate entry
    • updated _check_has_coef and coefficients() to handle meta-estimators (updated tests accordingly)
    • reorganised the commented models
  • need feedback on:

    • is there a better (cleaner) way to get data? Right now I've created separate fixtures for positive_regression_data, multi_regression_data, outlier_data and clustering_data, but each of them is used only once, in this one test (i.e. test_all_sklearn_estimators).
    • for TransformedTargetRegressor, should we modify _check_has_coef in skore/src/skore/utils/_accessor.py so that it checks not only hasattr(estimator, "coef_") but also hasattr(estimator.regressor_, "coef_") (here)? Or should we exclude TransformedTargetRegressor from this test?
    • even though the tests passed locally, I'm not sure about how I've incorporated sklearn.cross_decomposition.CCA() and sklearn.cross_decomposition.PLSCanonical()
    • any other feedback that you would like to give!

Thanks :)
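
For readers following along, the "estimators that do not support coef_ and/or intercept_" item can be pictured with a small sketch. This is only an illustration under assumptions, not skore's implementation: `coefficient_rows` is a hypothetical helper, and the two objects are stand-ins rather than fitted scikit-learn estimators.

```python
from types import SimpleNamespace


def coefficient_rows(estimator, feature_names):
    """Return (label, value) rows, skipping attributes the estimator lacks.

    Hypothetical helper for illustration; not part of skore.
    """
    rows = []
    # Not every fitted estimator exposes intercept_, so treat it as optional.
    intercept = getattr(estimator, "intercept_", None)
    if intercept is not None:
        rows.append(("Intercept", intercept))
    # Likewise, only read coef_ when it is present.
    coef = getattr(estimator, "coef_", None)
    if coef is not None:
        rows.extend(zip(feature_names, coef))
    return rows


# Stand-ins for fitted estimators (assumed shapes, not real scikit-learn objects).
with_intercept = SimpleNamespace(coef_=[0.5, -1.0], intercept_=2.0)
without_intercept = SimpleNamespace(coef_=[0.5, -1.0])

print(coefficient_rows(with_intercept, ["x0", "x1"]))
print(coefficient_rows(without_intercept, ["x0", "x1"]))
```

The point of the design is that a missing attribute yields a shorter table instead of an AttributeError.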

Contributor

github-actions bot commented Apr 23, 2025

Documentation preview @ 9d3af47

@auguste-probabl
Contributor

is there a better (cleaner) way to get data? Right now I've created separate fixtures for positive_regression_data, multi_regression_data, outlier_data and clustering_data, but each of them is used only once, in this one test (i.e. test_all_sklearn_estimators).

I think it's okay!
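
To make the fixture discussion concrete, here is a rough sketch of why different estimators need different data. The fixture names positive_regression_data and multi_regression_data come from the PR, but their actual contents are not shown in this thread, so the arrays below are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.linear_model import GammaRegressor, MultiTaskLasso

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

# Assumed stand-in for positive_regression_data:
# GammaRegressor requires a strictly positive target.
y_positive = np.exp(0.3 * X[:, 0]) + 0.1

# Assumed stand-in for multi_regression_data:
# the MultiTask* estimators expect a 2D target with one column per task.
y_multi = np.column_stack([X[:, 0] + X[:, 1], X[:, 1] - X[:, 2]])

gamma = GammaRegressor().fit(X, y_positive)
multi = MultiTaskLasso(alpha=0.1).fit(X, y_multi)

print(gamma.coef_.shape)  # one coefficient per feature
print(multi.coef_.shape)  # one row of coefficients per target
```

Because each data shape is tied to a specific family of estimators, keeping one small fixture per family, even if it is used only once, seems a reasonable trade-off.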

@Schefflera-Arboricola Schefflera-Arboricola changed the title from "TST: incorporating all estimators (with coef_) while testing EstimatorReport.feature_importance.coefficients()" to "test: incorporating all estimators (with coef_) while testing EstimatorReport.feature_importance.coefficients()" on Apr 23, 2025
@Schefflera-Arboricola Schefflera-Arboricola marked this pull request as ready for review April 23, 2025 13:41
Contributor

Coverage

Coverage Report for backend
File | Stmts | Miss | Cover | Missing
venv/lib/python3.12/site-packages/skore
   __init__.py | 22 | 0 | 100% |
   _config.py | 28 | 0 | 100% |
   exceptions.py | 4 | 4 | 0% | 4–23
venv/lib/python3.12/site-packages/skore/persistence
   __init__.py | 0 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/persistence/item
   __init__.py | 55 | 1 | 98% | 97
   altair_chart_item.py | 19 | 1 | 91% | 14
   item.py | 22 | 1 | 95% | 86
   matplotlib_figure_item.py | 36 | 1 | 95% | 19
   media_item.py | 22 | 0 | 100% |
   numpy_array_item.py | 27 | 1 | 94% | 16
   pandas_dataframe_item.py | 29 | 1 | 94% | 14
   pandas_series_item.py | 29 | 1 | 94% | 14
   pickle_item.py | 22 | 0 | 100% |
   pillow_image_item.py | 25 | 1 | 93% | 15
   plotly_figure_item.py | 20 | 1 | 92% | 14
   polars_dataframe_item.py | 27 | 1 | 94% | 14
   polars_series_item.py | 22 | 1 | 92% | 14
   primitive_item.py | 23 | 2 | 91% | 13–15
   sklearn_base_estimator_item.py | 29 | 1 | 94% | 15
venv/lib/python3.12/site-packages/skore/persistence/repository
   __init__.py | 2 | 0 | 100% |
   item_repository.py | 59 | 5 | 91% | 15–16, 202–203, 226
venv/lib/python3.12/site-packages/skore/persistence/storage
   __init__.py | 4 | 0 | 100% |
   abstract_storage.py | 22 | 0 | 100% |
   disk_cache_storage.py | 33 | 1 | 95% | 44
   in_memory_storage.py | 20 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/project
   __init__.py | 2 | 0 | 100% |
   project.py | 84 | 2 | 98% | 282, 394
venv/lib/python3.12/site-packages/skore/sklearn
   __init__.py | 6 | 0 | 100% |
   _base.py | 171 | 14 | 92% | 45, 58, 126, 129, 182–191, 203–>209, 224, 227–228
   find_ml_task.py | 61 | 0 | 99% | 136–>145
   types.py | 13 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/sklearn/_comparison
   __init__.py | 5 | 0 | 100% |
   metrics_accessor.py | 165 | 2 | 97% | 163, 164–>166, 1278
   report.py | 67 | 1 | 97% | 17, 249–>252
venv/lib/python3.12/site-packages/skore/sklearn/_cross_validation
   __init__.py | 5 | 0 | 100% |
   metrics_accessor.py | 190 | 0 | 99% | 153–>155, 155–>157
   report.py | 110 | 1 | 98% | 23
venv/lib/python3.12/site-packages/skore/sklearn/_estimator
   __init__.py | 7 | 0 | 100% |
   feature_importance_accessor.py | 142 | 1 | 98% | 214, 497–>503, 583–>592
   metrics_accessor.py | 344 | 10 | 96% | 174–183, 211–>220, 219, 249, 260–>262, 290, 317–321, 336, 371, 372–>374
   report.py | 145 | 1 | 98% | 24, 253–>255
venv/lib/python3.12/site-packages/skore/sklearn/_plot
   __init__.py | 2 | 0 | 100% |
   base.py | 6 | 0 | 100% |
   style.py | 28 | 0 | 100% |
   utils.py | 122 | 5 | 95% | 51, 75–77, 81
venv/lib/python3.12/site-packages/skore/sklearn/_plot/metrics
   __init__.py | 4 | 0 | 100% |
   precision_recall_curve.py | 173 | 1 | 99% | 660
   prediction_error.py | 164 | 0 | 100% |
   roc_curve.py | 176 | 1 | 99% | 649
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split
   __init__.py | 0 | 0 | 100% |
   train_test_split.py | 51 | 1 | 96% | 16, 154–>158
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split/warning
   __init__.py | 8 | 0 | 100% |
   high_class_imbalance_too_few_examples_warning.py | 17 | 1 | 90% | 79
   high_class_imbalance_warning.py | 18 | 0 | 100% |
   random_state_unset_warning.py | 12 | 1 | 88% | 15
   shuffle_true_warning.py | 10 | 1 | 83% | 46
   stratify_is_set_warning.py | 12 | 1 | 88% | 15
   time_based_column_warning.py | 23 | 2 | 86% | 17, 73
   train_test_split_warning.py | 4 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/utils
   __init__.py | 6 | 0 | 100% |
   _accessor.py | 46 | 1 | 97% | 102
   _environment.py | 27 | 0 | 97% | 30–>35
   _fixes.py | 8 | 0 | 100% |
   _index.py | 5 | 0 | 100% |
   _logger.py | 22 | 4 | 85% | 15–19
   _measure_time.py | 10 | 0 | 100% |
   _parallel.py | 38 | 3 | 88% | 23–33, 124
   _patch.py | 13 | 5 | 53% | 21–37
   _progress_bar.py | 36 | 0 | 100% |
   _show_versions.py | 33 | 0 | 100% |
TOTAL | 3192 | 83 | 96% |

Tests Skipped Failures Errors Time
823 8 💤 0 ❌ 0 🔥 54.693s ⏱️

@Schefflera-Arboricola Schefflera-Arboricola changed the title from "test: incorporating all estimators (with coef_) while testing EstimatorReport.feature_importance.coefficients()" to "test: include all estimators (with coef_) in test_all_sklearn_estimators" on Apr 23, 2025
@auguste-probabl
Contributor

for TransformedTargetRegressor should we modify _check_has_coef

I'd say yes!
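
A minimal sketch of the check being agreed on here, assuming the fallback simply looks at the fitted inner regressor. `has_coef` is a hypothetical name used for illustration; it is not the actual `_check_has_coef` from skore/src/skore/utils/_accessor.py, and the objects below are stand-ins rather than fitted scikit-learn estimators.

```python
from types import SimpleNamespace


def has_coef(estimator):
    """Return True if the estimator, or its fitted inner regressor, has coef_.

    Hypothetical helper for illustration; not skore's _check_has_coef.
    """
    if hasattr(estimator, "coef_"):
        return True
    # A fitted TransformedTargetRegressor keeps its inner model in
    # regressor_, so look there as a fallback.
    inner = getattr(estimator, "regressor_", None)
    return inner is not None and hasattr(inner, "coef_")


# Stand-ins for fitted estimators (assumed attributes only).
linear = SimpleNamespace(coef_=[1.0])
wrapped = SimpleNamespace(regressor_=SimpleNamespace(coef_=[1.0]))
no_coef = SimpleNamespace(feature_importances_=[1.0])

print(has_coef(linear), has_coef(wrapped), has_coef(no_coef))
```

With this shape of check, a wrapper around a linear model is treated like the linear model itself, while a wrapper around a model without coefficients is still rejected.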

Member

@glemaitre glemaitre left a comment

Thanks @Schefflera-Arboricola

Just a couple of comments and questions.

@glemaitre glemaitre changed the title from "test: include all estimators (with coef_) in test_all_sklearn_estimators" to "test: Include all estimators (with coef_) in test_all_sklearn_estimators" on Apr 23, 2025
Contributor

@auguste-probabl auguste-probabl left a comment

Nice work, thanks!

Contributor

Coverage

Coverage Report for skore
File | Stmts | Miss | Cover | Missing
venv/lib/python3.12/site-packages/skore
   __init__.py | 22 | 0 | 100% |
   _config.py | 28 | 0 | 100% |
   exceptions.py | 4 | 4 | 0% | 4–23
venv/lib/python3.12/site-packages/skore/persistence
   __init__.py | 0 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/persistence/item
   __init__.py | 55 | 1 | 98% | 97
   altair_chart_item.py | 19 | 1 | 91% | 14
   item.py | 22 | 1 | 95% | 86
   matplotlib_figure_item.py | 36 | 1 | 95% | 19
   media_item.py | 22 | 0 | 100% |
   numpy_array_item.py | 27 | 1 | 94% | 16
   pandas_dataframe_item.py | 29 | 1 | 94% | 14
   pandas_series_item.py | 29 | 1 | 94% | 14
   pickle_item.py | 22 | 0 | 100% |
   pillow_image_item.py | 25 | 1 | 93% | 15
   plotly_figure_item.py | 20 | 1 | 92% | 14
   polars_dataframe_item.py | 27 | 1 | 94% | 14
   polars_series_item.py | 22 | 1 | 92% | 14
   primitive_item.py | 23 | 2 | 91% | 13–15
   sklearn_base_estimator_item.py | 29 | 1 | 94% | 15
venv/lib/python3.12/site-packages/skore/persistence/repository
   __init__.py | 2 | 0 | 100% |
   item_repository.py | 59 | 5 | 91% | 15–16, 202–203, 226
venv/lib/python3.12/site-packages/skore/persistence/storage
   __init__.py | 4 | 0 | 100% |
   abstract_storage.py | 22 | 0 | 100% |
   disk_cache_storage.py | 33 | 1 | 95% | 44
   in_memory_storage.py | 20 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/project
   __init__.py | 2 | 0 | 100% |
   project.py | 83 | 2 | 98% | 280, 392
venv/lib/python3.12/site-packages/skore/sklearn
   __init__.py | 6 | 0 | 100% |
   _base.py | 171 | 14 | 92% | 45, 58, 126, 129, 182–191, 203–>209, 224, 227–228
   find_ml_task.py | 61 | 0 | 99% | 136–>145
   types.py | 13 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/sklearn/_comparison
   __init__.py | 5 | 0 | 100% |
   metrics_accessor.py | 165 | 2 | 97% | 163, 164–>166, 1278
   report.py | 67 | 1 | 97% | 17, 249–>252
venv/lib/python3.12/site-packages/skore/sklearn/_cross_validation
   __init__.py | 5 | 0 | 100% |
   metrics_accessor.py | 190 | 0 | 99% | 153–>155, 155–>157
   report.py | 110 | 1 | 98% | 23
venv/lib/python3.12/site-packages/skore/sklearn/_estimator
   __init__.py | 7 | 0 | 100% |
   feature_importance_accessor.py | 143 | 0 | 99% | 497–>503, 583–>592
   metrics_accessor.py | 344 | 10 | 96% | 174–183, 211–>220, 219, 249, 260–>262, 290, 317–321, 336, 371, 372–>374
   report.py | 148 | 1 | 98% | 24, 253–>255
venv/lib/python3.12/site-packages/skore/sklearn/_plot
   __init__.py | 2 | 0 | 100% |
   base.py | 6 | 0 | 100% |
   style.py | 28 | 0 | 100% |
   utils.py | 122 | 5 | 95% | 51, 75–77, 81
venv/lib/python3.12/site-packages/skore/sklearn/_plot/metrics
   __init__.py | 4 | 0 | 100% |
   precision_recall_curve.py | 173 | 1 | 99% | 660
   prediction_error.py | 164 | 0 | 100% |
   roc_curve.py | 176 | 1 | 99% | 649
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split
   __init__.py | 0 | 0 | 100% |
   train_test_split.py | 51 | 1 | 96% | 16, 154–>158
venv/lib/python3.12/site-packages/skore/sklearn/train_test_split/warning
   __init__.py | 8 | 0 | 100% |
   high_class_imbalance_too_few_examples_warning.py | 17 | 1 | 90% | 80
   high_class_imbalance_warning.py | 18 | 0 | 100% |
   random_state_unset_warning.py | 12 | 1 | 88% | 15
   shuffle_true_warning.py | 10 | 1 | 83% | 46
   stratify_is_set_warning.py | 12 | 1 | 88% | 15
   time_based_column_warning.py | 23 | 2 | 86% | 17, 73
   train_test_split_warning.py | 4 | 0 | 100% |
venv/lib/python3.12/site-packages/skore/utils
   __init__.py | 6 | 0 | 100% |
   _accessor.py | 52 | 2 | 93% | 63–>68, 67, 108
   _environment.py | 27 | 0 | 97% | 30–>35
   _fixes.py | 8 | 0 | 100% |
   _index.py | 5 | 0 | 100% |
   _logger.py | 22 | 4 | 85% | 15–19
   _measure_time.py | 10 | 0 | 100% |
   _parallel.py | 38 | 3 | 88% | 23–33, 124
   _patch.py | 13 | 5 | 53% | 21–37
   _progress_bar.py | 36 | 0 | 100% |
   _show_versions.py | 33 | 0 | 100% |
TOTAL | 3201 | 83 | 96% |

Tests Skipped Failures Errors Time
816 8 💤 0 ❌ 0 🔥 53.373s ⏱️

@thomass-dev thomass-dev enabled auto-merge April 28, 2025 13:20
@thomass-dev thomass-dev added this pull request to the merge queue Apr 28, 2025
auto-merge was automatically disabled April 28, 2025 13:27

Pull Request is not mergeable

Merged via the queue into probabl-ai:main with commit e0e3be9 Apr 28, 2025
41 checks passed
github-merge-queue bot pushed a commit that referenced this pull request Apr 30, 2025
Muhammad-Rebaal pushed a commit to Muhammad-Rebaal/skore that referenced this pull request May 5, 2025
…mators` (probabl-ai#1575)

closes issue probabl-ai#1364

- incorporated:
    - [x] sklearn.linear_model.GammaRegressor()
    - [x] sklearn.linear_model.PoissonRegressor()
    - [x] sklearn.compose.TransformedTargetRegressor()
    - [ ] sklearn.cross_decomposition.CCA()
    - [ ] sklearn.cross_decomposition.PLSCanonical()
    - [ ] sklearn.linear_model.MultiTaskElasticNet()
    - [ ] sklearn.linear_model.MultiTaskElasticNetCV()
    - [ ] sklearn.linear_model.MultiTaskLasso()
    - [ ] sklearn.linear_model.MultiTaskLassoCV()
    - [ ] sklearn.svm.OneClassSVM()
    - [ ] sklearn.linear_model.SGDOneClassSVM()

- other changes:
    - [x] handling estimators that do not support `coef_` and/or `intercept_` when creating `coefficients()` in EstimatorReport
    - [x] removed sklearn.linear_model.Lasso() --> duplicate entry
    - [x] updated `_check_has_coef` and `coefficients()` to handle cases of meta estimators (updated tests accordingly)
    - [x] reorganised the commented models

- need feedback on:
    - [x] is there a better (cleaner) way to get data? Right now I've created different fixtures for `positive_regression_data`, `multi_regression_data`, `outlier_data` and `clustering_data`, but all of these are used only once, in this one test (i.e. `test_all_sklearn_estimators`).
    - [x] for `TransformedTargetRegressor`, should we modify `_check_has_coef` in `skore/src/skore/utils/_accessor.py` so that it checks not only `hasattr(estimator, "coef_")` but also `hasattr(estimator.regressor_, "coef_")` ([here](https://github.com/probabl-ai/skore/blob/main/skore/src/skore/utils/_accessor.py#L60))? Or should we exclude `TransformedTargetRegressor` from this test?
    - [x] even though the tests passed locally, I'm not sure about how I've incorporated `sklearn.cross_decomposition.CCA()` and `sklearn.cross_decomposition.PLSCanonical()`
    - [ ] any other feedback that you would like to give!

Thanks :)

---------

Co-authored-by: Guillaume Lemaitre <[email protected]>
Co-authored-by: Auguste Baum <[email protected]>
4 participants