
πŸš€ feat(model): Add Dinomaly Model #2835


Open · wants to merge 40 commits into base: main

Conversation

@rajeshgangireddy (Contributor) commented Jul 15, 2025

πŸ“ Description

Pending:

  • Documentation Update
  • Benchmark and Comparison
  • Make Ruff/Linters/Semgrep happy

✨ Changes

Select what type of change your PR is:

  • πŸš€ New feature (non-breaking change which adds functionality)
  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • πŸ”„ Refactor (non-breaking change which refactors the code base)
  • ⚑ Performance improvements
  • 🎨 Style changes (code style/formatting)
  • πŸ§ͺ Tests (adding/modifying tests)
  • πŸ“š Documentation update
  • πŸ“¦ Build system changes
  • 🚧 CI/CD configuration
  • πŸ”§ Chore (general maintenance)
  • πŸ”’ Security update
  • πŸ’₯ Breaking change (fix or feature that would cause existing functionality to not work as expected)

βœ… Checklist

Before you submit your pull request, please make sure you have completed the following steps:

  • πŸ“š I have made the necessary updates to the documentation (if applicable).
  • πŸ§ͺ I have written tests that support my changes and prove that my fix is effective or my feature works (if applicable).
  • 🏷️ My PR title follows conventional commit format.

For more information about code review checklists, see the Code Review Checklist.

Anomaly Maps

*(Anomaly map example images omitted.)*

### Metrics

**Note:** The pixel F1 values are slightly off because the paper reports F1_Max while I benchmarked F1Score. I am running some benchmarks to get F1_Max and will update the table.
Overall, it looks like the difference is less than 2%.

| MVTec Category | image_AUROC (Paper) | image_AUROC (Anomalib) | image_F1Score (Paper) | image_F1Score (Anomalib) | pixel_AUROC (Paper) | pixel_AUROC (Anomalib) | pixel_F1Score (Paper) | pixel_F1Score (Anomalib) |
|---|---|---|---|---|---|---|---|---|
| Bottle | 1.000 | 1.000 | 1.000 | 1.000 | 0.992 | 0.990 | 0.842 | 0.820 |
| Cable | 1.000 | 1.000 | 1.000 | 0.995 | 0.986 | 0.981 | 0.743 | 0.715 |
| Capsule | 0.979 | 0.988 | 0.977 | 0.982 | 0.987 | 0.986 | 0.603 | 0.562 |
| Carpet | 0.999 | 0.998 | 0.989 | 0.983 | 0.993 | 0.993 | 0.711 | 0.697 |
| Grid | 1.000 | 0.999 | 0.991 | 0.991 | 0.994 | 0.993 | 0.577 | 0.508 |
| HazelNut | 1.000 | 1.000 | 1.000 | 1.000 | 0.994 | 0.994 | 0.764 | 0.754 |
| Leather | 1.000 | 1.000 | 1.000 | 0.995 | 0.994 | 0.993 | 0.550 | 0.494 |
| Metal Nut | 0.991 | 1.000 | 1.000 | 1.000 | 0.969 | 0.969 | 0.867 | 0.854 |
| Pill | 0.984 | 0.993 | 0.983 | 0.986 | 0.978 | 0.977 | 0.716 | 0.678 |
| Screw | 1.000 | 0.985 | 0.961 | 0.957 | 0.996 | 0.997 | 0.596 | 0.569 |
| Tile | 0.998 | 1.000 | 1.000 | 0.994 | 0.981 | 0.975 | 0.757 | 0.736 |
| ToothBrush | 0.990 | 1.000 | 1.000 | 0.983 | 0.989 | 0.988 | 0.626 | 0.598 |
| Transistor | 1.000 | 0.997 | 0.964 | 0.976 | 0.932 | 0.950 | 0.585 | 0.612 |
| Wood | 0.996 | 0.993 | 0.992 | 0.975 | 0.976 | 0.975 | 0.684 | 0.660 |
| **Mean** | 0.9955 | 0.9966 | 0.9898 | 0.9869 | 0.9829 | 0.9829 | 0.6872 | 0.6612 |
| **Difference % (100\*(P-A)/P)** | | -0.1148 | | 0.2887 | | 0.0000 | | 3.7834 |
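
To make the note above concrete: F1_Max sweeps all thresholds and keeps the best F1, while F1Score evaluates a single fixed threshold, so F1_Max is always at least as high. A minimal sketch of the swept metric (illustrative; not anomalib's actual `F1Max` implementation):

```python
import torch

def f1_max(preds: torch.Tensor, target: torch.Tensor, steps: int = 100) -> torch.Tensor:
    """Best F1 over a threshold sweep of the raw anomaly scores."""
    preds, target = preds.flatten(), target.flatten().bool()
    best = torch.tensor(0.0)
    for t in torch.linspace(float(preds.min()), float(preds.max()), steps):
        pred_bin = preds >= t
        tp = (pred_bin & target).sum()
        precision = tp / pred_bin.sum().clamp(min=1)
        recall = tp / target.sum().clamp(min=1)
        f1 = 2 * precision * recall / (precision + recall).clamp(min=1e-8)
        best = torch.maximum(best, f1)
    return best
```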

…dation steps

- Added detailed docstrings for the Dinomaly class and its methods.
- Improved error handling in training and validation steps.
- Updated pre-processor configuration to include crop size validation.
- Refined output structure in the training step for clarity.
…zer configuration; enhance Gaussian kernel function
… SSLMetaArch implementation

- Deleted `train.py`, `__init__.py`, and `ssl_meta_arch.py` files from the DINOv2 training module.
- Removed unused imports and commented-out code in `vit_encoder.py`.
- Streamlined the model loading process and eliminated unnecessary complexity in the architecture.
- Ensured that the remaining code adheres to the latest standards and practices for clarity and maintainability.
- Rearranged import statements for better organization and consistency.
- Updated type hints to use the new syntax for optional types.
- Simplified conditional checks and improved readability in various functions.
- Enhanced logging messages for clarity during model loading and training.
- Modified the `get_params_groups_with_decay` function to improve parameter handling.
- Updated the `DinoV2Loader` class to streamline model loading and weight management.
- Improved the `ViTill` class by refining feature processing and anomaly map calculations.
- Adjusted the `simple_script.py` to utilize the new export types for model exporting.
- Reduced the number of epochs in the training script for quicker testing.
… clarity and accuracy

style: adjust training configuration in simple_script.py
…integration

refactor: enhance training configuration and streamline model initialization in ViTill
chore: add benchmark configuration and script for Padim model evaluation
fix: update simple script for MVTecAD category and improve timing output
…related utilities

refactor: update attention and drop path layers for improved efficiency and clarity
… timm library equivalents and clean up unused code
… handling

- Added type hints and ClassVar annotations in model_loader.py for better clarity and type checking.
- Enhanced error messages in model_loader.py to provide clearer guidance on model name and architecture issues.
- Updated global_cosine_hm_percent and modify_grad functions in utils.py with type hints and improved gradient modification logic.
- Improved documentation and type hints in vision_transformer.py, including detailed docstrings for methods and parameters.
- Refined training configuration in lightning_model.py with type hints and assertions for better validation of input parameters.
- Enhanced ViTill class in torch_model.py with static methods and type safety checks for architecture configuration.
- General code cleanup and consistency improvements across all modified files.
…er; remove unused max_steps from training config
@rajeshgangireddy changed the title from "[DRAFT| DO NOT REVIEW YET] πŸš€ feat(model): Add Dinomaly Model" to "πŸš€ feat(model): Add Dinomaly Model" on Jul 17, 2025
@rajeshgangireddy marked this pull request as ready for review on July 17, 2025 07:01
Copilot AI left a comment

Pull Request Overview

This PR adds the Dinomaly modelβ€”a Vision Transformer-based anomaly detection approachβ€”into anomalib, including both a raw PyTorch implementation and a Lightning training/inference module.

  • Introduces ViTill PyTorch model with encoder, bottleneck, decoder and map computation
  • Adds Dinomaly LightningModule for training, validation, optimizer/scheduler setup
  • Bundles DINOv2 transformer components, training utilities (loss, optimizer, scheduler), model loader, layer definitions, and example config
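
For orientation, once merged the model should be trainable through anomalib's standard workflow, along these lines (a sketch of the usual Engine-based API; the `Dinomaly` constructor defaults are assumptions based on this PR's file layout):

```python
from anomalib.data import MVTecAD
from anomalib.engine import Engine
from anomalib.models import Dinomaly

# Sketch of the usual anomalib workflow applied to the new model.
datamodule = MVTecAD(category="bottle")
model = Dinomaly()  # assumed to work with defaults, like other image models

engine = Engine(max_epochs=1)
engine.fit(model=model, datamodule=datamodule)
predictions = engine.predict(model=model, datamodule=datamodule)
```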

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| src/anomalib/models/image/dinomaly/torch_model.py | Core ViTill model implementation & anomaly-map computation |
| src/anomalib/models/image/dinomaly/lightning_model.py | LightningModule wrapping ViTill for training/inference |
| src/anomalib/models/image/dinomaly/components/vision_transformer.py | DINOv2 Vision Transformer implementation |
| src/anomalib/models/image/dinomaly/components/training_utils.py | Loss, optimizer, scheduler utilities |
| src/anomalib/models/image/dinomaly/components/model_loader.py | Loader for pretrained DINOv2 weights |
| src/anomalib/models/image/dinomaly/components/layers.py | Attention, MLP, transformer block components |
| src/anomalib/models/image/dinomaly/components/__init__.py | Component exports |
| src/anomalib/models/image/dinomaly/__init__.py | Dinomaly module export |
| src/anomalib/models/image/__init__.py | Registered Dinomaly in image models |
| src/anomalib/models/__init__.py | Registered Dinomaly globally |
| examples/configs/model/dinomaly.yaml | Example YAML config for Dinomaly |
| src/anomalib/models/image/dinomaly/README.md | Model README with usage and benchmarks placeholder |

@samet-akcay (Contributor) left a comment

Thanks @rajeshgangireddy, overall the code is quite clean and easy to follow!

I've added some comments to discuss.

    on increasingly difficult examples as training progresses.
    """
    del args, kwargs  # These variables are not used.
    try:
Contributor:

why would there be a training error?

Contributor:

Also is it possible to move this loss computation to the torch model by any chance?

Contributor (Author):

Nope. It is now removed.
Loss calculation for this model depends on an additional component, p, which gets updated based on the number of trained epochs:

    p (float): Percentage of well-reconstructed points to down-weight (0.0 to 1.0).
        Higher values make training focus on fewer, harder examples. Default is 0.9.

That's why the lightning model (which has access to this) is a good place to calculate the loss.
Let me think about whether we can move the loss to the torch model.
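
(For context, a minimal sketch of the idea behind the p-weighted loss. The PR's actual `global_cosine_hm_percent`/`modify_grad` down-weight gradients of well-reconstructed points; this illustrative version simply drops them, and the names and shapes are assumptions:)

```python
import torch
import torch.nn.functional as F

def cosine_hm_percent(encoder_feats: list[torch.Tensor],
                      decoder_feats: list[torch.Tensor],
                      p: float = 0.9) -> torch.Tensor:
    """Illustrative hard-mining cosine loss: ignore the p fraction of
    best-reconstructed points so training focuses on the hardest ones."""
    loss = torch.tensor(0.0)
    for enc, dec in zip(encoder_feats, decoder_feats):
        # Per-location cosine distance between encoder and decoder features,
        # computed over the channel dimension: (B, C, H, W) -> (B, H*W).
        dist = 1 - F.cosine_similarity(enc.flatten(2), dec.flatten(2), dim=1)
        # Points below this quantile count as "well reconstructed".
        thresh = torch.quantile(dist.detach(), p)
        # Average only over the hardest (1 - p) fraction of points.
        loss = loss + dist[dist >= thresh].mean()
    return loss / len(encoder_feats)
```

In the lightning module, p can then be scheduled from the current epoch, which is exactly the trainer state the torch model cannot see.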

Contributor:

Hmm, tricky. This makes it valid to keep the loss in the lightning module. Let's discuss what the scope of the torch model should be and what should live in the lightning model. I think we need to refactor all the models to ensure that our inferencers work correctly.

    max_epochs * len(self.trainer.datamodule.train_dataloader()),
    )

    optimizer_config = TRAINING_CONFIG["optimizer"]
Contributor:

@ashwinvaidya17, can you confirm whether this dict-based config is API/CLI friendly?
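
(For reference, the pattern under discussion is roughly the following; the keys, values, and optimizer/scheduler choices here are placeholders, not the PR's exact `TRAINING_CONFIG`:)

```python
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder module-level config; the PR's actual keys and values may differ.
TRAINING_CONFIG = {
    "optimizer": {"lr": 2e-3, "betas": (0.9, 0.999), "weight_decay": 1e-4},
}

def configure_optimizers(self):  # method on the LightningModule
    # Total steps derived from the trainer, as in the excerpt above.
    total_steps = self.trainer.max_epochs * len(self.trainer.datamodule.train_dataloader())
    optimizer = AdamW(self.model.parameters(), **TRAINING_CONFIG["optimizer"])
    scheduler = CosineAnnealingLR(optimizer, T_max=total_steps)
    return [optimizer], [{"scheduler": scheduler, "interval": "step"}]
```

The worry behind the question: jsonargparse/LightningCLI only exposes constructor arguments, so values baked into a module-level dict cannot be overridden from the CLI or a YAML config.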

@ashwinvaidya17 (Contributor) left a comment

Great work! A few comments.

Refactored the Dinomaly model as per comments, plus some small improvements:

  • The Attention class was used only for type checking; replaced it with the Attention implementation from timm (see the sketch after this list).
  • Separated the loss and optimiser into different files.
  • Code readability improvements.
  • The image size is now passed explicitly when creating anomaly maps instead of relying on a default.
  • Docstring improvements.
  • Removed unnecessary try/except blocks.
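
(For illustration, the kind of timm replacement meant here; the dimensions are placeholders, not the PR's exact usage:)

```python
from timm.models.vision_transformer import Attention

# timm's stock ViT attention can replace a local copy that was previously
# kept only so isinstance-style type checks would pass.
attn = Attention(dim=768, num_heads=12, qkv_bias=True)
```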
@samet-akcay added this to the v2.1.0 milestone on Jul 21, 2025
Development

Successfully merging this pull request may close these issues.

πŸ“‹ [TASK] Implement Dinomaly - CVPR 2025
3 participants