Releases: oegedijk/explainerdashboard

v0.3.5: skorch support, simplified dashboard and feature descriptions

07 May 11:11
5149c95

New Features

  • adds support for PyTorch Neural Networks! (as long as they are wrapped by skorch)
  • adds SimplifiedClassifierComposite and SimplifiedRegressionComposite to explainerdashboard.custom
  • adds flag simple=True to load these simplified one page dashboards: ExplainerDashboard(explainer, simple=True)
  • adds support for visualizing trees of ExtraTreesClassifier and ExtraTreesRegressor
  • adds FeatureDescriptionsComponent to explainerdashboard.custom and the Importances tab
  • adds possibility to dynamically add new dashboards to running ExplainerHub using /add_dashboard route
    with add_dashboard_route=True (will only work if you're running the Hub as a single worker/node though!)

Improvements

  • ExplainerDashboard.to_yaml("dashboards/dashboard.yaml", dump_explainer=True)
    will now dump the explainer in the correct subdirectory (and also default
    to explainer.joblib)
  • Interactions tab automatically excluded for linear models
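
The to_yaml improvement above amounts to resolving the explainer file relative to the directory that holds the yaml file. A minimal sketch of that path logic using only pathlib follows; the helper name resolve_explainer_path and its explainerfile argument are hypothetical and not part of explainerdashboard, and the library's actual implementation may differ.

```python
from pathlib import Path
from typing import Optional


def resolve_explainer_path(yaml_path: str, explainerfile: Optional[str] = None) -> Path:
    """Hypothetical helper: place the explainer dump next to the yaml file."""
    # Default file name taken from the release note above.
    filename = explainerfile or "explainer.joblib"
    return Path(yaml_path).parent / filename


print(resolve_explainer_path("dashboards/dashboard.yaml").as_posix())
# dashboards/explainer.joblib
```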

v0.3.4.1: fixes detailed shap plots bug when cats=None

05 May 08:35

v0.3.4: fixes dtreeviz 1.3 breaking change bug

13 Apr 18:15
479746e

Release Notes

Version 0.3.4:

Bug Fixes

  • Fixes incompatibility bug with dtreeviz >= 1.3
  • Fixes ExplainerHub dbc.Jumbotron style bug

Improvements

  • raises ValueError when passing shap='deep' as it is not yet correctly supported

v0.3.3.1: minor bugfix with outliers and nan

22 Mar 19:11
d46b8fc

Fixes a bug with removing outliers when nan's are present.

v0.3.3: better pipeline support and thread safety

11 Mar 19:15
f5bd4a5

Version 0.3.3:

Highlights:

  • Adding support for cross validated metrics
  • Better support for pipelines by using kernel explainer
  • Making explainer threadsafe by adding locks
  • Remove outliers from shap dependence plots

Breaking Changes

  • parameter permutation_cv has been deprecated and replaced by parameter cv which
    now also works to calculate cross-validated metrics besides cross-validated
    permutation importances.

New Features

  • metrics now get calculated with cross validation over X when you pass the
    cv parameter to the explainer, this is useful when for some reason you
    want to pass the training set to the explainer.
  • adds winsorization to shap dependence and shap interaction plots
  • If shap='guess' fails (unable to guess the right type of shap explainer),
    then default to the model agnostic shap='kernel'.
  • Better support for sklearn Pipelines: if not able to extract transformer+model,
    then default to shap.KernelExplainer to explain the entire pipeline
  • you can now remove outliers from shap dependence/interaction plots with
    remove_outliers=True: filters all outliers beyond 1.5*IQR
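
The 1.5*IQR rule behind remove_outliers=True can be sketched with the standard library alone. This is an illustration of the general technique, not the library's code: the function name remove_outliers_iqr is hypothetical, and the quartile method used by explainerdashboard may differ from the default of statistics.quantiles.

```python
import statistics


def remove_outliers_iqr(values, factor=1.5):
    """Keep only values within [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if lower <= v <= upper]


# Made-up feature values with one extreme point.
feature_values = [0.1, 0.2, 0.15, 0.3, 0.25, 0.2, 5.0]
filtered = remove_outliers_iqr(feature_values)
print(filtered)  # the extreme value 5.0 is dropped
```

Winsorization, by contrast, would clamp such points to a boundary value instead of dropping them.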

Bug Fixes

  • Sets proper threading.Locks before making calls to shap explainer to prevent race
    conditions with dashboards calling for shap values in multiple threads.
    (shap is unfortunately not threadsafe)
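
The locking pattern described in this fix can be sketched as follows. The wrapper class LockedExplainer and the stand-in function fake_shap are hypothetical; they only illustrate how a threading.Lock serializes calls into code that is not threadsafe, and say nothing about how explainerdashboard organizes its locks internally.

```python
import threading


class LockedExplainer:
    """Hypothetical wrapper serializing calls to a non-threadsafe function."""

    def __init__(self, compute):
        self._compute = compute
        self._lock = threading.Lock()

    def shap_row(self, index):
        # Only one thread at a time may enter the underlying computation.
        with self._lock:
            return self._compute(index)


active = 0
max_active = 0


def fake_shap(index):
    """Stand-in for a shap call; records how many threads run it at once."""
    global active, max_active
    active += 1
    max_active = max(max_active, active)
    result = index * 2
    active -= 1
    return result


explainer = LockedExplainer(fake_shap)
results = []


def worker(i):
    results.append(explainer.shap_row(i))


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results), max_active)
```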

Improvements

  • single shap row KernelExplainer calculations now go without tqdm progress bar
  • added cutoff tpr and fpr to roc auc plot
  • added cutoff precision and recall to pr auc plot
  • put a loading spinner on shap contrib table
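
For reference, the cutoff values now shown on the roc auc and pr auc plots follow from the standard definitions below. This is a plain-Python sketch with made-up data; the function name rates_at_cutoff is hypothetical, and treating a probability strictly greater than the cutoff as a positive prediction is an assumption.

```python
def rates_at_cutoff(y_true, pred_probas, cutoff):
    """Return tpr, fpr, precision and recall for a binary classifier at a cutoff."""
    preds = [1 if p > cutoff else 0 for p in pred_probas]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate == recall
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # false positive rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"tpr": tpr, "fpr": fpr, "precision": precision, "recall": tpr}


y_true = [1, 0, 1, 1, 0, 0]
pred_probas = [0.9, 0.4, 0.6, 0.3, 0.7, 0.1]
rates = rates_at_cutoff(y_true, pred_probas, cutoff=0.5)
print(rates)
```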

v0.3.2.2: more bugfixes

03 Mar 19:28
dfc4b5a

Version 0.3.2.2:

index_dropdown=False now works for indexes not listed in set_index_list_func(),
as long as they can be found by set_index_exists_func()

New Features

  • adds set_index_exists_func to add function that checks for index existing
    besides those listed by set_index_list_func()
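
The fallback described here (check the listed indexes first, then ask a user-supplied function) could look roughly like the sketch below. The class IndexLookup is a hypothetical stand-in, not the explainer class itself; only the names set_index_exists_func and index_exists come from these notes, and how the real methods are registered or called is an assumption.

```python
from typing import Callable, Iterable, Optional


class IndexLookup:
    """Hypothetical stand-in for the explainer's index handling."""

    def __init__(self, listed: Iterable[str]):
        self._listed = set(listed)
        self._exists_func: Optional[Callable[[str], bool]] = None

    def set_index_exists_func(self, func: Callable[[str], bool]) -> None:
        # Register a function that can confirm indexes outside the listed set.
        self._exists_func = func

    def index_exists(self, index: str) -> bool:
        if index in self._listed:
            return True
        return self._exists_func is not None and bool(self._exists_func(index))


lookup = IndexLookup(["Alice", "Bob"])
print(lookup.index_exists("customer_42"))  # False: not listed, no fallback yet

# Made-up rule: anything starting with "customer_" exists in some backend.
lookup.set_index_exists_func(lambda idx: idx.startswith("customer_"))
print(lookup.index_exists("customer_42"))  # True: found by the fallback
```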

Bug Fixes

  • bug fix to make shap.KernelExplainer (used with explainer parameter shap='kernel')
    work with RegressionExplainer
  • bug fix when no explicit labels are passed with index selector
  • components now only update if explainer.index_exists(), so no more IndexNotFoundErrors.
  • fixed title for regression index selector labeled 'Custom' bug
  • get_y() now returns .item() when necessary
  • removed ticks from confusion matrix plot when no labels param passed
    (this bug got reintroduced in recent plotly release)

Improvements

  • new helper function get_shap_row(index) to calculate or look up a single
    row of shap values.

v0.3.2.1: add index_dropdown=False to regression dashboard

26 Feb 19:06
05dfa18

Bugfix: new index_dropdown=False feature was not working correctly for regression dashboards

v0.3.2: custom metrics

25 Feb 19:50
3884b8e

Version 0.3.2:

Highlights:

  • Control what metrics to show or use your own custom metrics using show_metrics
  • Set the naming for onehot features with all 0s with cats_notencoded
  • Speed up plots by displaying only a random sample of markers in scatter plots with plot_sample.
  • make index selection a free text field with index_dropdown=False
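
To illustrate what cats_notencoded controls, the sketch below decodes a onehot-encoded feature back to a single category and falls back to a configurable name when every onehot column is 0. The function decode_onehot is hypothetical, and the "Deck_A" style column naming is an assumption; the default 'NOT_ENCODED' and the Deck example come from these notes.

```python
def decode_onehot(row, feature, columns, cats_notencoded=None):
    """Return the category whose onehot column equals 1, or a fallback name."""
    fallback = (cats_notencoded or {}).get(feature, "NOT_ENCODED")
    for col in columns:
        if row.get(col, 0) == 1:
            # Strip the "<feature>_" prefix, assuming columns named like "Deck_A".
            return col[len(feature) + 1:]
    return fallback


deck_cols = ["Deck_A", "Deck_B", "Deck_C"]
encoded = {"Deck_A": 0, "Deck_B": 1, "Deck_C": 0}
all_zero = {"Deck_A": 0, "Deck_B": 0, "Deck_C": 0}

print(decode_onehot(encoded, "Deck", deck_cols))   # B
print(decode_onehot(all_zero, "Deck", deck_cols))  # NOT_ENCODED
print(decode_onehot(all_zero, "Deck", deck_cols,
                    cats_notencoded={"Deck": "Deck not known"}))  # Deck not known
```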

New Features

  • new parameter show_metrics for explainer.metrics(), ClassifierModelSummaryComponent
    and RegressionModelSummaryComponent:
    • pass a list of metrics and only display those metrics in that order
    • you can also pass custom scoring functions as long as they
      are of the form metric_func(y_true, y_pred): show_metrics=[metric_func]
      • For ClassifierExplainer what is passed to the custom metric function
        depends on whether the function takes additional parameters cutoff
        and pos_label. If these are not arguments, then y_true=self.y_binary(pos_label)
        and y_pred=np.where(self.pred_probas(pos_label)>cutoff, 1, 0).
        Else the raw self.y and self.pred_probas are passed for the
        custom metric function to do something with.
      • custom functions are also stored to dashboard.yaml and imported upon
        loading ExplainerDashboard.from_config()
  • new parameter cats_notencoded: a dict to indicate how to name the value
    of a onehotencoded feature when all onehot columns equal 0. Defaults
    to 'NOT_ENCODED', but can be adjusted with this parameter. E.g.
    cats_notencoded=dict(Deck="Deck not known").
  • new parameter plot_sample to only plot a random sample in the various
    scatter plots. When you have a large dataset, this may significantly
    speed up various plots without sacrificing much in expressiveness:
    ExplainerDashboard(explainer, plot_sample=1000).run()
  • new parameter index_dropdown=False will replace the index dropdowns with a
    free text field. This can be useful when you have a lot of potential indexes,
    and the user is expected to know the index string.
    Input will be checked for validity with explainer.index_exists(index),
    and the field indicates when the input index does not exist. An index that does not
    exist will not be forwarded to other components, unless you also set index_check=False.
  • adds mean absolute percentage error to the regression metrics. If it is too
    large a warning will be printed. Can be excluded with the new show_metrics
    parameter.
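
The rule for custom classifier metrics described above can be sketched as a small dispatcher based on the function signature. Everything here is illustrative: call_custom_metric is a hypothetical name, plain lists stand in for numpy arrays, and requiring both cutoff and pos_label to be present before passing raw values is an assumption about how the rule is applied.

```python
import inspect


def call_custom_metric(metric_func, y, pred_probas, cutoff=0.5, pos_label=1):
    """Hypothetical dispatcher mirroring the rule described above."""
    params = inspect.signature(metric_func).parameters
    if "cutoff" in params and "pos_label" in params:
        # The function handles raw labels and probability rows itself.
        return metric_func(y, pred_probas, cutoff=cutoff, pos_label=pos_label)
    # Otherwise binarize labels and predictions before calling it.
    y_true = [1 if label == pos_label else 0 for label in y]
    y_pred = [1 if row[pos_label] > cutoff else 0 for row in pred_probas]
    return metric_func(y_true, y_pred)


def accuracy(y_true, y_pred):
    """Simple metric of the form metric_func(y_true, y_pred)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def positive_rate(y_true, y_pred, cutoff, pos_label):
    """Metric taking cutoff and pos_label; y_pred holds raw probability rows."""
    return sum(row[pos_label] > cutoff for row in y_pred) / len(y_pred)


y = [1, 0, 1, 0]
pred_probas = [[0.2, 0.8], [0.6, 0.4], [0.7, 0.3], [0.9, 0.1]]

print(call_custom_metric(accuracy, y, pred_probas))       # 0.75
print(call_custom_metric(positive_rate, y, pred_probas))  # 0.25
```

With the real library, functions like these would be passed as show_metrics=[metric_func].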

Bug Fixes

  • get_classification_df added to ClassificationComponent dependencies.

Improvements

  • accepting single column pd.DataFrame for y, and automatically converting
    it to a pd.Series
  • if WhatIf FeatureInputComponent detects the presence of missing onehot features
    (i.e. rows where all columns of the onehotencoded feature equal 0), then
    adds 'NOT_ENCODED' or the matching value from cats_notencoded to the
    dropdown options.
  • Generating the name parameter for ExplainerComponents for which no
    name is given is now done with a deterministic process instead of a random
    uuid. This should help with scaling custom dashboards across cluster
    deployments. Also drops shortuuid dependency.
  • ExplainerDashboard now prints out local ip address when starting dashboard.
  • get_index_list() is only called once upon starting dashboard.
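
Why deterministic component names help with cluster deployments: if every worker builds the same layout in the same order, each worker ends up with identical names, whereas random uuids differ per process. The sketch below shows one possible deterministic scheme using a per-class counter; the NameGenerator class and the "ClassName-N" format are hypothetical and not the scheme the library actually uses.

```python
import itertools
from collections import defaultdict


class NameGenerator:
    """Hypothetical deterministic replacement for random uuid names."""

    def __init__(self):
        # One independent counter per component class name, each starting at 0.
        self._counters = defaultdict(itertools.count)

    def __call__(self, component_class_name: str) -> str:
        number = next(self._counters[component_class_name])
        return f"{component_class_name}-{number}"


# Two "workers" building the same components in the same order.
order = ["ShapSummaryComponent", "ShapSummaryComponent", "PdpComponent"]
worker_a = NameGenerator()
worker_b = NameGenerator()
names_a = [worker_a(name) for name in order]
names_b = [worker_b(name) for name in order]
print(names_a)
print(names_a == names_b)  # True: both workers agree on every name
```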

v0.3.1: responsive classifier components

31 Jan 13:46

Version 0.3.1:

This version is mostly about pre-calculating and optimizing the classifier statistics
components. Those components should now be much more responsive with large datasets.

New Features

  • new methods roc_auc_curve(pos_label) and pr_auc_curve(pos_label)
  • new method get_classification_df(...) to get dataframe with number of labels
    above and below a given cutoff.
    • this now gets used by plot_classification(..)
  • new method confusion_matrix(cutoff, binary, pos_label)
  • added parameter sort_features to FeatureInputComponent:
    • defaults to 'shap': order features by mean absolute shap
    • if set to 'alphabet' features are sorted alphabetically
  • added parameter fill_row_first to FeatureInputComponent:
    • defaults to True: fill first row first, then next row, etc
    • if False: fill first column first, then second column, etc
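
The difference between the two fill_row_first settings can be shown with a small grid layout sketch. The function arrange_inputs and the fixed number of columns are hypothetical; the real FeatureInputComponent builds dashboard inputs rather than nested lists.

```python
import math


def arrange_inputs(features, n_cols, fill_row_first=True):
    """Lay out feature names in a grid with n_cols columns; returns a list of rows."""
    n_rows = math.ceil(len(features) / n_cols)
    grid = [[None] * n_cols for _ in range(n_rows)]
    for i, feature in enumerate(features):
        if fill_row_first:
            row, col = divmod(i, n_cols)  # fill left to right, then next row
        else:
            col, row = divmod(i, n_rows)  # fill top to bottom, then next column
        grid[row][col] = feature
    return grid


features = ["a", "b", "c", "d", "e"]
print(arrange_inputs(features, n_cols=2, fill_row_first=True))
# [['a', 'b'], ['c', 'd'], ['e', None]]
print(arrange_inputs(features, n_cols=2, fill_row_first=False))
# [['a', 'd'], ['b', 'e'], ['c', None]]
```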

Bug Fixes

  • categorical mappings now updateable with pandas<=1.2 and python==3.6
  • title now overridable for RegressionRandomIndexComponent
  • added assert check on summary_type for ShapSummaryComponent

Improvements

  • pre-calculating lift_curve_df only once and then storing for each pos_label
    • plus: storing only 100 evenly spaced rows of lift_curve_df
    • dashboard should be more responsive for large datasets
  • pre-calculating roc_auc_curve and pr_auc_curve
    • dashboard should be more responsive for large datasets
  • pre-calculating confusion matrices
    • dashboard should be more responsive for large datasets
  • pre-calculating classification_dfs
    • dashboard should be more responsive for large datasets
  • confusion matrix: added axis title, moved predicted labels to bottom of graph
  • precision plot component: when only adjusting cutoff, simply updating the cutoff
    line, without recalculating the plot.
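
Storing only 100 evenly spaced rows of a long curve can be done by picking row positions as in the sketch below. How the library selects its rows is not stated in these notes, so the function evenly_spaced_indices and its rounding rule are assumptions that only illustrate the idea.

```python
def evenly_spaced_indices(n_rows, n_keep=100):
    """Return up to n_keep row positions spread evenly from first to last row.

    Assumes n_keep >= 2.
    """
    if n_rows <= n_keep:
        return list(range(n_rows))
    step = (n_rows - 1) / (n_keep - 1)
    return [round(i * step) for i in range(n_keep)]


indices = evenly_spaced_indices(1000)
print(len(indices), indices[0], indices[-1])  # 100 0 999
```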

v0.3.0.1: dependency fixes

27 Jan 15:02
304918b

Version 0.3.0.1:

Some of the new features of version 0.3 only work with pandas>=1.2, which is not available for python 3.6.

Breaking Changes

  • new dependency requirements pandas>=1.2 also implies python>=3.7

Bug Fixes

  • updates pandas version to be compatible with categorical feature operations
  • updates dtreeviz version to make xgboost and pyspark dependencies optional