CompareBench is a benchmark for evaluating visual comparison reasoning in vision-language models (VLMs), covering four tasks: quantity, temporal, geometric, and spatial. It is derived from two auxiliary datasets:
- TallyBench (2,000 counting images with QA)
- OmniCaps (513 historical images + 100 celebrity images + 100 landmark images)
The benchmark data and resources are available in this repository:
- Benchmark datasets (links above).
- prompts.yaml: standardized instruction templates for all tasks (CompareTallyBench, CompareGeometryBench, CompareSpatialBench, CompareTemporalBench, and TallyBench); see the loading sketch below.
- Code (to be released).
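As a minimal sketch of how the instruction templates might be consumed, the snippet below loads prompts.yaml and retrieves the template for one task. The exact schema of the file is an assumption here (a flat mapping from task name to template string); the released version may differ.

```python
import yaml  # pip install pyyaml

# Minimal sketch: load the standardized instruction templates.
# ASSUMPTION: prompts.yaml is a flat mapping from task name to
# template string; the actual released schema may differ.
with open("prompts.yaml", "r", encoding="utf-8") as f:
    prompts = yaml.safe_load(f)

# Retrieve the instruction template for one benchmark task.
template = prompts["CompareTallyBench"]
print(template)
```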
📌 Paper: CompareBench: A Benchmark for Visual Comparison Reasoning in Vision–Language Models (WACV 2026 submission)
📂 Code, data, and prompts will be released in this repository.