Feature: Add Imbalanced Data Policy to Active Learning Environment
Context
The active learning environment currently has no explicit handling of class imbalance. When minority classes are rare in the data, the sampling policy may select few of their examples for labeling, which can hurt model performance on those classes.
There is relevant literature that could guide this implementation, such as:
Minority Class Oriented Active Learning for Imbalanced Datasets
https://arxiv.org/pdf/2202.00390
Proposal
Add an imbalance-aware data policy to the Active Learning module.
Possible directions:
- Implement a minority-class-oriented sampling strategy
- Introduce heuristics to prioritize underrepresented classes
- Allow dynamic policy adjustment based on dataset distribution
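To make the first two directions concrete, here is a minimal sketch of one possible heuristic: weight each class by the inverse of its count in the labeled set, then rank unlabeled items by how much predicted probability they place on the under-represented classes. This is not the method from the paper linked above, and the function names (`inverse_frequency_weights`, `select_minority_oriented`) and the input shapes are assumptions made for illustration, not part of the existing Active Learning module.

```python
from collections import Counter
from typing import Dict, List, Sequence


def inverse_frequency_weights(
    labels: Sequence[str], classes: Sequence[str]
) -> Dict[str, float]:
    """Weight each class by the inverse of its smoothed labeled count.

    Weights are normalized to sum to 1. The +1 smoothing keeps classes
    with no labeled examples from causing a division by zero.
    """
    counts = Counter(labels)
    raw = {c: 1.0 / (counts[c] + 1) for c in classes}
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()}


def select_minority_oriented(
    pool_probs: List[Dict[str, float]], labeled: Sequence[str], k: int
) -> List[int]:
    """Return indices of the k pool items with the highest weighted score.

    pool_probs[i] maps class name -> predicted probability for pool item i
    (hypothetical input format; adapt to whatever the model returns).
    """
    classes = sorted({c for probs in pool_probs for c in probs})
    weights = inverse_frequency_weights(labeled, classes)
    scores = [
        sum(weights[c] * p for c, p in probs.items()) for probs in pool_probs
    ]
    ranked = sorted(range(len(pool_probs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Example: class "b" is rare in the labeled set, so items the model
# believes are "b" are ranked first.
labeled = ["a"] * 9 + ["b"]
pool_probs = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.2, "b": 0.8},
    {"a": 0.5, "b": 0.5},
]
selected = select_minority_oriented(pool_probs, labeled, k=2)
```

Because the weights are recomputed from the current labeled set on every call, the ranking shifts as the class distribution changes across AL cycles, which is one simple way to get the dynamic adjustment mentioned in the third direction.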
Open Questions
- Should we support heuristic rules based on RegEx queries?
- Should the active learning policy be dynamically adjusted based on RegEx-defined filters?
- How configurable should the imbalance strategy be (fixed strategy vs user-defined)?
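As a starting point for discussing the RegEx questions, the sketch below shows one way a user-defined rule could adjust sampling scores. It assumes pool items carry a text field and that the policy already produces a numeric score per item; `RegexRule`, `apply_rules`, and the example pattern are hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Pattern


@dataclass(frozen=True)
class RegexRule:
    """Hypothetical rule: boost items whose text matches `pattern`."""
    pattern: Pattern[str]
    boost: float


def apply_rules(
    texts: List[str], base_scores: List[float], rules: List[RegexRule]
) -> List[float]:
    """Add the boost of every matching rule to each item's base score."""
    adjusted = []
    for text, score in zip(texts, base_scores):
        bonus = sum(rule.boost for rule in rules if rule.pattern.search(text))
        adjusted.append(score + bonus)
    return adjusted


# Example: prioritize items that mention terms typical of a rare class.
rules = [RegexRule(re.compile(r"\b(refund|chargeback)\b", re.IGNORECASE), boost=0.5)]
texts = ["Please process my REFUND", "Great service, thanks"]
adjusted_scores = apply_rules(texts, [0.2, 0.3], rules)
```

Keeping the rules as plain data (a pattern plus a boost) would let a fixed default strategy and user-defined rules share the same code path: an empty rule list leaves the base scores unchanged.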
💡 Future Related Ideas
- Model personalization based on feature subsets
- Adaptive strategies that evolve during AL cycles
- Visualization of class distribution over iterations