Pull requests: NVIDIA/Megatron-LM

Pull requests list

Add fused swiglu for MLP
#1536 opened Apr 15, 2025 by michal2409
CUDA graph fixes for Llama3.1
#1534 opened Apr 14, 2025 by vasunvidia
Fix AttributeError in MultiTokenPredictionLayer
#1529 opened Apr 12, 2025 by shenyunhang
Fix typo on distrib_optimizer.py
#1505 opened Mar 26, 2025 by wplf
fix: MultiLatentAttention cp_comm_type
#1499 opened Mar 24, 2025 by RandMist
Fix llama_mistral loader by using args.true_vocab_size
#1491 opened Mar 20, 2025 by zhuzilin
vscode/cursor devcontainer
#1483 opened Mar 14, 2025 by yzhang123
Set hashlib.md5 usedforsecurity=False, #1471
#1472 opened Mar 12, 2025 by jsta
Draft: Youngeun/a2a hiding
#1460 opened Mar 10, 2025 by lhb8125
[ENHANCEMENT] add z-loss (improved version)
#1442 opened Feb 28, 2025 by wdevazelhes
fix seq_aux_loss for DeepSeek-V3
#1439 opened Feb 27, 2025 by yzlnew
a proof of concept for Distributed Muon
#1428 opened Feb 24, 2025 by toothacher17
Fix document regarding GQA (--group-query-attention) argument (label: stale)
#1401 opened Feb 12, 2025 by eagle705