An Empirical Study of Autoregressive Pre-training from Videos Paper • 2501.05453 • Published 2 days ago • 25
MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting Paper • 2501.03714 • Published 4 days ago • 7
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published 4 days ago • 16
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published 9 days ago • 10
PERSE: Personalized 3D Generative Avatars from A Single Portrait Paper • 2412.21206 • Published 12 days ago • 15
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published 27 days ago • 52
DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation Paper • 2412.15200 • Published 23 days ago • 9
Autoregressive Video Generation without Vector Quantization Paper • 2412.14169 • Published 24 days ago • 14
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints Paper • 2412.07760 • Published Dec 10, 2024 • 50
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes Paper • 2412.11457 • Published 27 days ago • 5
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion Paper • 2412.09626 • Published 30 days ago • 20
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction Paper • 2412.09573 • Published 30 days ago • 7
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper • 2412.08443 • Published Dec 11, 2024 • 38
ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer Paper • 2412.07720 • Published Dec 10, 2024 • 30
MotionShop: Zero-Shot Motion Transfer in Video Diffusion Models with Mixture of Score Guidance Paper • 2412.05355 • Published Dec 6, 2024 • 7
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 124
Improved Distribution Matching Distillation for Fast Image Synthesis Paper • 2405.14867 • Published May 23, 2024 • 12
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment Paper • 2412.04814 • Published Dec 6, 2024 • 45
AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models Paper • 2412.04146 • Published Dec 5, 2024 • 22