Flowing from Words to Pixels: A Framework for Cross-Modality Evolution Paper • 2412.15213 • Published 23 days ago • 25
GenEx: Generating an Explorable World Paper • 2412.09624 • Published about 1 month ago • 88 • 2
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark Paper • 2412.07825 • Published Dec 10, 2024 • 12
ViTamin: Designing Scalable Vision Models in the Vision-Language Era Paper • 2404.02132 • Published Apr 2, 2024 • 2