#3 [Attribution map across channel] Enhancement toward project completion #7

@HugoHakem

Enhanced masking strategies

Our interpretation method relies on attribution maps to understand which features dominate a classifier's decision when differentiating between two classes.

Right now, attribution maps are averaged across all 5 channels.

When averaging contributions across all 5 channels, we lose granularity: the most important features may well be concentrated in a single channel!
Why is it important to improve on that?

  1. Better interpretation
  2. Novelty over the Discriminative Attribution from Counterfactuals paper
    They had no use case for this because the channels of their images (RGB) are complementary views of the same object. In our case, each of the 5 channels carries different information, so it is entirely possible that one channel dominates the classifier decision.
  3. Will facilitate interpretation: if one channel happens to be clearly predominant, this might be easier to connect with the existing method, which uses CellProfiler and statistical tests to pull up a list of the most important features. Those features are often associated with a specific metric within one channel. Simply being able to say that the mask for a particular channel tends to be larger than for the other channels already gives insight and could corroborate the CellProfiler feature interpretation.

Solution:
Each attribution method already gives a map that differs per channel, with the exception of GradCam:

  • With GradCam, attributions are computed per feature-map channel at a specific block of the convolutional pass (usually the last block), then averaged across all feature-map channels (this could be an average across 512 channels, for example) and then projected onto the input space (5 channels with 128 images). The result is therefore a single spatial map shared by all input channels.
  • With GuidedGradCam the problem disappears.
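
To make the GradCam limitation concrete, here is a minimal NumPy sketch of the collapse step. It does not use the project's code or any attribution library; the 512 feature maps, the 4x4 spatial size, the 128x128 input resolution and the nearest-neighbour upsampling are all assumptions chosen for illustration. Note that standard Grad-CAM uses a gradient-weighted sum over feature maps rather than a plain average.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: 512 feature maps of size 4x4 at the last conv block,
# and their gradients with respect to the class score.
activations = rng.normal(size=(512, 4, 4))
gradients = rng.normal(size=(512, 4, 4))

# Grad-CAM: one weight per feature map (spatial mean of its gradients),
# then a weighted sum over the 512 feature maps followed by a ReLU.
weights = gradients.mean(axis=(1, 2))                       # (512,)
cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)

# Upsampling to the (assumed) input resolution still yields one 2D map.
scale = 32                                                  # assumed 128 / 4
cam_up = np.kron(cam, np.ones((scale, scale)))              # (128, 128)

# Copying it to the 5 input channels adds no per-channel information.
cam_5ch = np.broadcast_to(cam_up, (5, 128, 128))
```

Every slice `cam_5ch[c]` is identical, which is why a GradCam-style map cannot say anything about which input channel matters.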

Alternatively, we could keep the averaging across all 5 channels but, when plotting the mask, plot each channel independently to investigate whether a pattern is already visible (for example by looking at the residual map of the mask image differentiated across all 5 channels).
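
One possible reading of the "residual map" idea, sketched with NumPy: subtract the channel-averaged map from each channel's own map and look at what is left. The function name `channel_residuals` and the synthetic data are hypothetical, not part of the existing code.

```python
import numpy as np

def channel_residuals(attribution):
    """attribution: (C, H, W). Returns each channel minus the channel-mean map."""
    mean_map = attribution.mean(axis=0, keepdims=True)
    return attribution - mean_map

rng = np.random.default_rng(0)
attr = rng.random((5, 8, 8))
attr[2, 2:5, 2:5] += 3.0   # synthetic: channel 2 carries an extra localized signal

res = channel_residuals(attr)

# Channel whose map deviates most from the averaged map.
strongest = int(np.abs(res).sum(axis=(1, 2)).argmax())
```

Plotting `res[c]` for each channel would then show where a channel departs from the aggregated view; a channel with near-zero residuals is already well described by the average.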

How to build a mask when we have a 5-channel attribution map?

  1. Instead of normalizing feature importance in the attribution map per channel, we could normalize across all 5 channels. Then we build the masking scenario by ranking features across every channel.
    Good: seems natural.
    Bad:
    i. It will take longer computationally since we are no longer building a single mask but 5 masks at the same time (originally 200 steps, now 1000 steps).
    ii. What guarantees that the hybrid image resulting from altering the counterfactual image with the real image will be coherent?
    Imagine the DNA channel is the most predominant in the classifier decision, meaning you will basically swap the cells' nuclei between the counterfactual and real images. As you do so, the expected behavior is that the hybrid image is classified more and more as the real perturbation rather than the counterfactual one. Yet the hybrid image might be difficult to classify because it no longer looks like a cell image at all.
  2. We normalize feature importance per channel and build all 5 channel masks at the same time.
    Good: faster
    Bad: the hybrid image might again be incoherent, as different portions of the image might be important in different channels.
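
The two normalization options can be compared side by side with a small NumPy sketch. This is an illustration under assumptions, not the project's masking code: min-max normalization of absolute attributions, a linear threshold schedule, and the helper names (`normalize_global`, `normalize_per_channel`, `masks_from_thresholds`, `hybrid`) are all hypothetical, and smoothing of the masks is left out. It assumes each attribution map is non-constant so the normalization is well defined.

```python
import numpy as np

def normalize_global(attr):
    """Option 1: one min-max normalization over all channels jointly."""
    a = np.abs(attr)
    return (a - a.min()) / (a.max() - a.min())

def normalize_per_channel(attr):
    """Option 2: min-max normalization of each channel separately."""
    a = np.abs(attr)
    lo = a.min(axis=(1, 2), keepdims=True)
    hi = a.max(axis=(1, 2), keepdims=True)
    return (a - lo) / (hi - lo)

def masks_from_thresholds(norm_attr, n_steps):
    """Yield one boolean (C, H, W) mask per threshold, strict to permissive."""
    for t in np.linspace(1.0, 0.0, n_steps):
        yield norm_attr >= t

def hybrid(real, counterfactual, mask):
    """Copy the masked pixels of the real image into the counterfactual image."""
    return np.where(mask, real, counterfactual)

rng = np.random.default_rng(0)
attr = rng.normal(size=(5, 16, 16))       # synthetic attribution map
real = rng.random((5, 16, 16))
counterfactual = rng.random((5, 16, 16))

global_masks = list(masks_from_thresholds(normalize_global(attr), n_steps=10))
channel_masks = list(masks_from_thresholds(normalize_per_channel(attr), n_steps=10))
final = hybrid(real, counterfactual, global_masks[-1])
```

With global normalization the strictest mask may touch only one channel, whereas per-channel normalization always starts with at least one pixel in every channel. That difference is exactly where the coherence concern about the hybrid image comes from.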

How to assess the relevance of this?

  • Either the DAC scores are much more favorable, meaning attribution methods capture the most important features far more accurately when the 5 channels are treated independently.
  • Or, before even thinking about creating custom masks, we implement a quantitative method that tells us how much attribution maps differ across channels.
    i. This is interesting because it would justify why an attribution method differentiated across channels is preferable to an aggregated one.
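
As a starting point for the second option, two simple statistics could be computed on a per-channel attribution map: the share of total absolute attribution carried by each channel, and the pairwise Pearson correlation between channel maps. Both metrics and the function names are suggestions made here for illustration; nothing in the existing code is assumed.

```python
import numpy as np

def channel_mass_share(attr):
    """Fraction of the total absolute attribution carried by each channel."""
    mass = np.abs(attr).sum(axis=(1, 2))
    return mass / mass.sum()

def channel_correlation(attr):
    """Pearson correlation between the flattened maps of each channel pair."""
    flat = attr.reshape(attr.shape[0], -1)
    return np.corrcoef(flat)

rng = np.random.default_rng(0)
attr = rng.normal(size=(5, 16, 16))   # synthetic attribution map
attr[0] *= 10.0                       # synthetic: channel 0 dominates

share = channel_mass_share(attr)      # length-5 vector summing to 1
corr = channel_correlation(attr)      # (5, 5) symmetric matrix
```

A share close to 1/5 for every channel together with high off-diagonal correlations would suggest that aggregation loses little; a skewed share or low correlations would argue for per-channel masks.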

My go-to would be the first option.

  • Add an option to choose whether or not to aggregate attribution maps across all 5 channels.
  • Implement masking scenarios and DAC curves across all 5 channels.

Note:
When creating the masks, we might want to explore different smoothing scenarios (the closing and Gaussian smoothing). Maybe the parameters we are using are too strong and prevent us from discovering any biology because it is smoothed out.

Labels: enhancement, question
