Self-Compressing Depthwise Attention – Compression Module

This folder contains the code, datasets, model architectures, and training scripts related to the compression experiments for self-compressing vision models.

Folder Structure


compression/
├── __pycache__/                  # Python bytecode cache
├── after_pe.png                  # Example visualization or positional encoding result
├── assets/                       # Model checkpoints from various experiments
├── data/                         # Dataset loaders and preprocessing scripts
│   ├── country_data.py           # Loader for Country211 dataset
│   └── mnist_data.py             # Loader for MNIST dataset
├── qmodules/                     # Quantized modules and model architectures
│   ├── QConv.py                  # Quantized convolution layer
│   ├── QLinear.py                # Quantized linear layer
│   ├── QEagerLinAttn.py          # Quantized linear attention module
│   ├── QPE.py                    # Quantized positional encoding
│   ├── quant_func.py             # Quantization helper functions
│   ├── qutils.py                 # Utility functions for quantization
│   ├── Models/                   # Model architectures
│   │   ├── ConvModel.py
│   │   ├── PureConvModel.py
│   │   └── eagerDWLin.py
│   ├── fused_kernels/            # Optimized CUDA kernels for attention
│   └── spec/                     # Submodule implementations
├── QTrainer.py                   # Training utilities for quantized models
├── train.py                      # Main training script
└── utils.py                      # General utility functions

Usage

Preparing datasets:
- data/country_data.py and data/mnist_data.py handle loading and preprocessing.
- Ensure required datasets are downloaded and accessible.
Training models:
- Use train.py for running experiments.
- QTrainer.py provides helper functions for training quantized networks.
Model architectures:
- Available under qmodules/Models/.
- Includes ConvModel, PureConvModel, and eagerDWLin.
- Quantized layers (QConv, QLinear, QEagerLinAttn) are modular and can be swapped in different architectures.
Checkpoints:
- Stored in assets/.
- Named according to experiment type and date.
- Can be loaded for evaluation or fine-tuning.
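
To pick a checkpoint for evaluation or fine-tuning, it can help to list what is in assets/ first. The sketch below uses only the standard library and is not part of this repository: list_checkpoints is a hypothetical helper, and because the exact file naming scheme and extension are not documented here, it simply returns every file under the folder, newest first.

```python
from pathlib import Path
from typing import List


def list_checkpoints(assets_dir: str = "assets") -> List[Path]:
    """Return all regular files under assets_dir, most recently modified first.

    Hypothetical helper: the real checkpoint naming scheme is not assumed here,
    so no filtering by extension or name pattern is applied.
    """
    root = Path(assets_dir)
    if not root.is_dir():
        return []  # nothing to list if the folder does not exist
    files = [p for p in root.rglob("*") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in list_checkpoints("assets"):
        print(path)
```

A path chosen from this list could then be passed to torch.load, assuming the checkpoints were written with torch.save.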
Utilities:
- qutils.py and quant_func.py contain helper functions for quantization and mixed-precision operations.
- utils.py contains general-purpose helper functions for training and evaluation.
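
As a rough illustration of what a quantization helper does, here is a minimal sketch of generic uniform "fake" quantization with a bit width and a power-of-two step size. It is written in plain Python on single floats for readability. The function name fake_quantize and its parameters are assumptions made for this example; the actual functions in quant_func.py and qutils.py may use a different scheme.

```python
def fake_quantize(x: float, bits: int, exponent: int) -> float:
    """Snap x to a signed `bits`-bit integer grid with step size 2**exponent.

    Illustrative only; not taken from quant_func.py.
    """
    if bits < 1:
        return 0.0  # with zero bits the value carries no information
    step = 2.0 ** exponent
    q_min = -(2 ** (bits - 1))      # most negative representable integer
    q_max = 2 ** (bits - 1) - 1     # most positive representable integer
    q = round(x / step)             # nearest grid index
    q = max(q_min, min(q_max, q))   # clamp to the representable range
    return q * step                 # map back to the original scale


if __name__ == "__main__":
    print(fake_quantize(0.3, bits=4, exponent=-2))    # 0.25
    print(fake_quantize(10.0, bits=4, exponent=-2))   # 1.75 (clamped)
    print(fake_quantize(-10.0, bits=4, exponent=-2))  # -2.0 (clamped)
```

In a trainable setting the rounding step is usually paired with a straight-through gradient estimator so that the bit width and step size can be learned. That part is omitted here.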

Notes

The project uses PyTorch and custom CUDA kernels for efficient training and inference of compressed models.
__pycache__/ folders contain compiled bytecode and can be safely ignored.
Ensure CUDA-enabled hardware is available for training with fused kernels and attention modules.
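
A quick way to confirm that a CUDA device is visible before starting a run is sketched below. pick_device is a hypothetical helper, not something defined in this repository. It falls back to "cpu" when PyTorch is not installed or no GPU is found, in which case the fused kernels would presumably be unavailable.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed in this environment
    import torch  # imported lazily so the check works without PyTorch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```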
