COVR is a test-bed for visually-grounded compositional generalization with real images. You can use COVR to train Visual Question Answering (VQA) models and evaluate them on various compositional tests. You can also create your own compositional splits.
See our paper for details, and the dataset webpage for the leaderboard and further information.
The dataset is available at this link:
- Download dataset
- 23/3 - 1.0.1 - Added missing `paraphrased` field for validation and test sets
How to reproduce our results:
How to create new splits:
How to generate COVR: