Priority Sampling of Large Language Models for Compilers. arXiv:2402.18734, published Feb 28, 2024.
Accelerating Large Language Model Decoding with Speculative Sampling. arXiv:2302.01318, published Feb 2, 2023.
Fast Inference from Transformers via Speculative Decoding. arXiv:2211.17192, published Nov 30, 2022.
AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling. arXiv:2011.09011, published Nov 18, 2020.