Concept bottleneck models (CBMs) are deep learning models designed to be interpretable and intervenable. Instead of mapping inputs directly to labels, a CBM consists of an encoder module that maps inputs to a set of human-interpretable concepts. These concepts are then fed into a predictor module that outputs the final labels. The concept layer can be used to explain which features of an input led to the final prediction, and it can be intervened on (i.e., corrected by a human) to improve predictions on a particular input.
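The encoder → concepts → predictor pipeline, including a concept intervention, can be sketched minimally as follows. This is an illustrative toy (randomly initialised linear layers, hypothetical dimensions), not the implementation in this repo:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 8 input features, 3 concepts, 2 classes.
N_IN, N_CONCEPTS, N_CLASSES = 8, 3, 2

# Randomly initialised weights stand in for trained encoder/predictor modules.
W_enc = rng.normal(size=(N_IN, N_CONCEPTS))
W_pred = rng.normal(size=(N_CONCEPTS, N_CLASSES))

def encode(x):
    """Encoder: map inputs to concept probabilities in [0, 1]."""
    return 1.0 / (1.0 + np.exp(-x @ W_enc))

def predict(concepts):
    """Predictor: map concept activations to class logits."""
    return concepts @ W_pred

def intervene(concepts, index, value):
    """Overwrite one concept with a human-supplied corrected value."""
    fixed = concepts.copy()
    fixed[..., index] = value
    return fixed

x = rng.normal(size=(1, N_IN))
c = encode(x)                                  # interpretable bottleneck
logits = predict(c)                            # prediction from inferred concepts
logits_fixed = predict(intervene(c, 0, 1.0))   # prediction after correcting concept 0
```

Because the predictor only sees the concept vector, correcting a single concept propagates directly to the final prediction, which is what makes CBMs intervenable.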
This repo contains example CBM(s) and associated evaluation metrics.
Tools used to assess concept alignment:
- Saliency maps
- Masking relevant image locations for selected concepts