Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published 27 days ago • 52
UltraEdit: Instruction-based Fine-Grained Image Editing at Scale Paper • 2407.05282 • Published Jul 7, 2024 • 13
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models Paper • 2309.14509 • Published Sep 25, 2023 • 17
DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention Paper • 2309.14327 • Published Sep 25, 2023 • 21
DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention Paper • 2309.14327 • Published Sep 25, 2023 • 21