# Dilated Convolution with Learnable Spacing (DCLS): Enhanced Interpretability Evaluation with Grad-CAM
Dilated Convolution with Learnable Spacing (DCLS) is a convolution method that enlarges the receptive field (RF) without increasing the number of parameters, similar to dilated convolution but without enforcing a regular grid: the positions of the kernel elements are learned. DCLS has outperformed standard and dilated convolutions across several computer vision benchmarks. This project demonstrates that DCLS also enhances model interpretability, measured as the alignment between model attention and human visual strategies.
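Conceptually, DCLS places each kernel weight at a learnable real-valued position inside a larger kernel and spreads it over the neighboring grid cells by interpolation, which keeps the positions differentiable. A minimal numpy sketch of that construction, with hypothetical names and bilinear interpolation assumed (the repository code is the authoritative implementation):

```python
import numpy as np

def dcls_kernel(weights, positions, size):
    """Illustrative sketch of DCLS kernel construction.

    Each of the K weights has a continuous, learnable (row, col)
    position inside a `size` x `size` kernel; it is spread over the
    four nearest grid cells by bilinear interpolation so gradients
    can flow back to the positions. The dense kernel has an enlarged
    receptive field while keeping only K weight parameters
    (plus 2K position parameters).
    """
    kernel = np.zeros((size, size))
    for w, (r, c) in zip(weights, positions):
        r0, c0 = int(np.floor(r)), int(np.floor(c))
        fr, fc = r - r0, c - c0          # fractional offsets in [0, 1)
        for dr, wr in ((0, 1.0 - fr), (1, fr)):
            for dc, wc in ((0, 1.0 - fc), (1, fc)):
                rr, cc = r0 + dr, c0 + dc
                if 0 <= rr < size and 0 <= cc < size:
                    kernel[rr, cc] += w * wr * wc
    return kernel
```

A weight at an integer position lands on a single cell; a weight at a fractional position is shared between neighbors, e.g. position `(1.5, 2.0)` splits equally between cells `(1, 2)` and `(2, 2)`.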
## Key Features

- Enlarged Receptive Fields: DCLS increases the RF without additional parameters.
- Enhanced Interpretability: Improves alignment with human visual attention.
- Spearman Correlation Metric: Uses Spearman correlation between models’ Grad-CAM heatmaps and ClickMe dataset heatmaps to quantify interpretability.
- Threshold-Grad-CAM: Introduces a modification to Grad-CAM to address interpretability issues in specific models.
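The interpretability metric above, Spearman rank correlation between a model's Grad-CAM heatmap and the corresponding ClickMe human heatmap, can be sketched as follows (the function name is illustrative, not from the repository):

```python
import numpy as np
from scipy.stats import spearmanr

def heatmap_alignment(model_heatmap, human_heatmap):
    """Spearman rank correlation between two attention heatmaps.

    Both maps are flattened so each pixel is one observation; the
    correlation measures whether the model and humans rank the same
    regions as important, regardless of absolute intensity or any
    monotone rescaling of either map.
    """
    rho, _ = spearmanr(model_heatmap.ravel(), human_heatmap.ravel())
    return rho
```

Because Spearman operates on ranks, a heatmap and any monotone transform of it (e.g. squaring non-negative values) score a perfect 1.0 alignment.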
## Models Evaluated

Eight reference models were evaluated in this study:
- ResNet50
- ConvNeXt (T, S, and B variants)
- CAFormer
- ConvFormer
- FastViT (sa_24 and sa_36 variants)
## Key Findings

- Interpretability Improvement: Seven of the eight models showed improved interpretability with DCLS.
- Grad-CAM Issue: CAFormer and ConvFormer generated essentially random heatmaps, resulting in low interpretability scores.
- Threshold-Grad-CAM Solution: Enhanced interpretability across nearly all models when applied.
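The exact Threshold-Grad-CAM formulation lives in the repository code; a hedged, numpy-only approximation of the idea is to suppress weak activations in the raw Grad-CAM map before normalization, so low-level noise no longer dominates the heatmap:

```python
import numpy as np

def threshold_gradcam(cam, threshold=0.5):
    """Illustrative sketch: suppress low activations in a Grad-CAM map.

    `cam` is a raw 2-D Grad-CAM heatmap. Values below `threshold`
    times the maximum are zeroed, then the map is rescaled to [0, 1].
    NOTE: this is an approximation of the thresholding idea, not the
    repository's exact Threshold-Grad-CAM implementation.
    """
    cam = np.maximum(cam, 0.0)              # ReLU, as in standard Grad-CAM
    cam[cam < threshold * cam.max()] = 0.0  # drop weak, noisy activations
    if cam.max() > 0:
        cam = cam / cam.max()               # rescale to [0, 1]
    return cam
```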
## Repository Contents

- Code: Implementation of DCLS and Threshold-Grad-CAM.
- Checkpoints: Pre-trained model checkpoints to reproduce the study results.
- Scripts: Scripts to evaluate models and compute interpretability scores.
## Requirements

- Python 3.8+
- PyTorch 1.8+
- CUDA 10.2+ (for GPU acceleration)
## Installation

- Clone the repository:

```bash
git clone https://github.com/rabihchamas/DCLS-GradCAM-Eval.git
cd DCLS-GradCAM-Eval
```
- Install the required packages:

```bash
pip install -r requirements.txt
```
## Usage

- Evaluate Models: Run the evaluation script to compute interpretability scores.

```bash
python main.py
```
## Contributing

We welcome contributions to improve the project. Please submit pull requests or report issues on the GitHub repository.
## License

This project is licensed under the MIT License. See the LICENSE file for more details.
## Contact

For any questions or inquiries, please contact Rabih Chamas at [email protected].