- Length Extrapolation of Transformers: A Survey from the Perspective of Position Encoding, arXiv 2312.17044 (arxiv, pdf), citations: -1. Liang Zhao, Xiaocheng Feng, Xiachong Feng, Bing Qin, Ting Liu · (jiqizhixin)
- Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey, arXiv 2311.12351 (arxiv, pdf), citations: -1. Yunpeng Huang, Jingwei Xu, Zixu Jiang, Junyu Lai, Zenan Li, Yuan Yao, Taolue Chen, Lijuan Yang, Zhou Xin, Xiaoxing Ma · (long-llms-learning - Strivin0311)
- KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization, arXiv 2401.18079 (arxiv, pdf), citations: -1. Coleman Hooper, Sehoon Kim, Hiva Mohammadzadeh, Michael W. Mahoney, Yakun Sophia Shao, Kurt Keutzer, Amir Gholami
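  To make the idea concrete, here is a minimal sketch of low-bit KV-cache quantization in the spirit of KVQuant: keys quantized per-channel, values per-token. The paper's actual method additionally quantizes keys before RoPE, uses non-uniform codebooks, and isolates outliers; none of that is reproduced here, and all shapes and bit-widths are illustrative.

  ```python
  import torch

  def quantize(x, dim, bits=4):
      """Asymmetric uniform quantization of x along `dim`."""
      qmax = 2 ** bits - 1
      xmin = x.amin(dim=dim, keepdim=True)
      scale = (x.amax(dim=dim, keepdim=True) - xmin).clamp(min=1e-8) / qmax
      q = ((x - xmin) / scale).round().clamp(0, qmax).to(torch.uint8)
      return q, scale, xmin

  def dequantize(q, scale, xmin):
      return q.float() * scale + xmin

  K, V = torch.randn(1024, 128), torch.randn(1024, 128)  # (seq, head_dim)
  qK, sK, mK = quantize(K, dim=0)    # keys: per-channel (reduce over sequence)
  qV, sV, mV = quantize(V, dim=-1)   # values: per-token (reduce over features)
  err = (dequantize(qK, sK, mK) - K).abs().mean()
  ```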
- Long-Context-Data-Engineering - FranxYao: implementation of the paper "Data Engineering for Scaling Language Models to 128K Context".
- LongAlign: A Recipe for Long Context Alignment of Large Language Models, arXiv 2401.18058 (arxiv, pdf), citations: -1. Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, Juanzi Li · (LongAlign - THUDM)
- With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation, arXiv 2401.11504 (arxiv, pdf), citations: -1. Y. Wang, D. Ma, D. Cai · (zhuanlan.zhihu)
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models, arXiv 2401.06951 (arxiv, pdf), citations: -1. Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su
- Extending LLMs' Context Window with 100 Samples, arXiv 2401.07004 (arxiv, pdf), citations: -1. Yikai Zhang, Junlong Li, Pengfei Liu · (Entropy-ABF - GAIR-NLP)
- Transformers are Multi-State RNNs, arXiv 2401.06104 (arxiv, pdf), citations: -1. Matanel Oren, Michael Hassid, Yossi Adi, Roy Schwartz · (TOVA - schwartz-lab-NLP)
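  A minimal sketch of the TOVA cache-compression policy this paper proposes: keep the KV cache at a fixed size and, at each decoding step, evict the cached token that received the lowest attention weight. The single-head, unbatched setting and the `cache_size` value are simplifications for illustration.

  ```python
  import torch

  def tova_step(q, k_cache, v_cache, k_new, v_new, cache_size=512):
      """One decoding step with attention-based eviction of the weakest token."""
      k_cache = torch.cat([k_cache, k_new[None]], dim=0)
      v_cache = torch.cat([v_cache, v_new[None]], dim=0)
      attn = torch.softmax(q @ k_cache.T / k_cache.shape[-1] ** 0.5, dim=-1)
      out = attn @ v_cache
      if k_cache.shape[0] > cache_size:
          evict = attn[:-1].argmin()  # lowest-weight token (never the newest)
          keep = torch.arange(k_cache.shape[0]) != evict
          k_cache, v_cache = k_cache[keep], v_cache[keep]
      return out, k_cache, v_cache

  k, v = torch.randn(600, 64), torch.randn(600, 64)   # over-full cache
  out, k, v = tova_step(torch.randn(64), k, v, torch.randn(64), torch.randn(64))
  ```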
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models, arXiv 2401.04658 (arxiv, pdf), citations: -1. Zhen Qin, Weigao Sun, Dong Li, Xuyang Shen, Weixuan Sun, Yiran Zhong · (lightning-attention - OpenNLPLab)
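  For intuition, a sketch of the causal linear-attention recurrence that Lightning Attention-2 computes in tiled form: a running d×d state accumulates k·vᵀ, so each step costs O(d²) regardless of sequence length. The paper's contribution is the blocked kernel splitting intra-block (quadratic within a tile) and inter-block (this recurrence) work; only the recurrence is shown here, without any feature map or normalization.

  ```python
  import torch

  def linear_attention(Q, K, V):
      """Causal linear attention via a running d x d state (no softmax)."""
      d = Q.shape[-1]
      S = torch.zeros(d, d)
      outputs = []
      for q, k, v in zip(Q, K, V):
          S = S + torch.outer(k, v)   # the "inter-block" recurrent state
          outputs.append(q @ S)       # read out with the current query
      return torch.stack(outputs)

  out = linear_attention(torch.randn(16, 8), torch.randn(16, 8), torch.randn(16, 8))
  ```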
- Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache, arXiv 2401.02669 (arxiv, pdf), citations: -1. Bin Lin, Tao Peng, Chen Zhang, Minmin Sun, Lanbo Li, Hanyu Zhao, Wencong Xiao, Qi Xu, Xiafei Qiu, Shen Li
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning, arXiv 2401.01325 (arxiv, pdf), citations: -1. Hongye Jin, Xiaotian Han, Jingfeng Yang, Zhimeng Jiang, Zirui Liu, Chia-Yuan Chang, Huiyuan Chen, Xia Hu · (qbitai)
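  A sketch of Self-Extend's grouped relative positions: distances inside a neighbor window stay exact, while longer distances are floor-divided by a group size and shifted so the two regimes meet at the boundary, keeping every position within the trained range. Note this simplification floor-divides the relative distance directly, whereas the paper applies the floor to the absolute query/key positions; the `window` and `group` values are illustrative.

  ```python
  def self_extend_rel_pos(rel_pos: int, window: int = 512, group: int = 8) -> int:
      """Remap a relative position so it never exceeds the trained range."""
      if rel_pos < window:
          return rel_pos                    # neighbor attention: exact positions
      # Grouped attention: compress distant positions, shifted so the two
      # regimes are continuous at the window boundary.
      return rel_pos // group + window - window // group

  assert self_extend_rel_pos(511) == 511
  assert self_extend_rel_pos(4096, window=512, group=8) == 960  # 512 + 448
  ```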
- Cached Transformers: Improving Transformers with Differentiable Memory Cache, arXiv 2312.12742 (arxiv, pdf), citations: -1. Zhaoyang Zhang, Wenqi Shao, Yixiao Ge, Xiaogang Wang, Jinwei Gu, Ping Luo
- Extending Context Window of Large Language Models via Semantic Compression, arXiv 2312.09571 (arxiv, pdf), citations: -1. Weizhi Fei, Xueyan Niu, Pingyi Zhou, Lu Hou, Bo Bai, Lei Deng, Wei Han
- Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention, arXiv 2312.08618 (arxiv, pdf), citations: -1. Kaiqiang Song, Xiaoyang Wang, Sangwoo Cho, Xiaoman Pan, Dong Yu
- Ultra-Long Sequence Distributed Transformer, arXiv 2311.02382 (arxiv, pdf), citations: -1. Xiao Wang, Isaac Lyngaas, Aristeidis Tsaris, Peng Chen, Sajal Dash, Mayanka Chandra Shekar, Tao Luo, Hong-Jun Yoon, Mohamed Wahib, John Gounley
- HyperAttention: Long-context Attention in Near-Linear Time, arXiv 2310.05869 (arxiv, pdf), citations: 2. Insu Han, Rajesh Jayaram, Amin Karbasi, Vahab Mirrokni, David P. Woodruff, Amir Zandieh
- CLEX: Continuous Length Extrapolation for Large Language Models, arXiv 2310.16450 (arxiv, pdf), citations: -1. Guanzheng Chen, Xin Li, Zaiqiao Meng, Shangsong Liang, Lidong Bing
- TRAMS: Training-free Memory Selection for Long-range Language Modeling, arXiv 2310.15494 (arxiv, pdf), citations: -1. Haofei Yu, Cunxiang Wang, Yue Zhang, Wei Bi
- Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading, arXiv 2310.05029 (arxiv, pdf), citations: -1. Howard Chen, Ramakanth Pasunuru, Jason Weston, Asli Celikyilmaz · (mp.weixin.qq)
- Scaling Laws of RoPE-based Extrapolation, arXiv 2310.05209 (arxiv, pdf), citations: -1. Xiaoran Liu, Hang Yan, Shuo Zhang, Chenxin An, Xipeng Qiu, Dahua Lin · (qbitai)
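  The knob this paper studies is RoPE's base θ: changing it rescales every dimension's rotation frequency, and with it the context length at which extrapolation degrades. A minimal sketch of the standard frequency computation with an adjustable base (values illustrative):

  ```python
  import torch

  def rope_inv_freq(head_dim: int = 128, base: float = 10000.0) -> torch.Tensor:
      """Per-dimension RoPE rotation frequencies for a given base."""
      return base ** (-torch.arange(0, head_dim, 2).float() / head_dim)

  default = rope_inv_freq()                 # the common 10000 base
  slower = rope_inv_freq(base=1_000_000.0)  # larger base -> slower rotations
  ```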
- Ring Attention with Blockwise Transformers for Near-Infinite Context, arXiv 2310.01889 (arxiv, pdf), citations: -1. Hao Liu, Matei Zaharia, Pieter Abbeel
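  A single-process sketch of the blockwise accumulation at the heart of Ring Attention: each device holds one query block while key/value blocks circulate around a ring, and partial results are merged with a numerically stable online softmax. The actual system overlaps this loop with device-to-device transfers, which is not modeled here.

  ```python
  import torch

  def blockwise_attention(q, kv_blocks):
      """Merge attention over KV blocks with a stable online softmax."""
      d = q.shape[-1]
      m = torch.full((q.shape[0], 1), float("-inf"))  # running row max
      l = torch.zeros(q.shape[0], 1)                  # running normalizer
      o = torch.zeros_like(q)                         # running output
      for k, v in kv_blocks:                          # one hop per ring step
          s = q @ k.T / d ** 0.5
          m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
          p = torch.exp(s - m_new)
          corr = torch.exp(m - m_new)                 # rescale earlier blocks
          l = l * corr + p.sum(dim=-1, keepdim=True)
          o = o * corr + p @ v
          m = m_new
      return o / l

  q = torch.randn(64, 32)                             # this device's query block
  kv = [(torch.randn(64, 32), torch.randn(64, 32)) for _ in range(4)]
  out = blockwise_attention(q, kv)
  ```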
- EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation, arXiv 2310.08185 (arxiv, pdf), citations: -1. Wang You, Wenshan Wu, Yaobo Liang, Shaoguang Mao, Chenfei Wu, Maosong Cao, Yuzhe Cai, Yiduo Guo, Yan Xia, Furu Wei
- CoCA: Fusing position embedding with Collinear Constrained Attention for fine-tuning free context window extending, arXiv 2309.08646 (arxiv, pdf), citations: -1. Shiyi Zhu, Jing Ye, Wei Jiang, Qi Zhang, Yifan Wu, Jianguo Li · (Collinear-Constrained-Attention - codefuse-ai) · (jiqizhixin)
- Effective Long-Context Scaling of Foundation Models, arXiv 2309.16039 (arxiv, pdf), citations: 1. Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz · (qbitai)
- LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models, arXiv 2308.16137 (arxiv, pdf), citations: 3. Chi Han, Qifan Wang, Wenhan Xiong, Yu Chen, Heng Ji, Sinong Wang
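  A sketch of the Λ-shaped attention mask behind LM-Infinite: every query attends to a few leading tokens plus a sliding local window, keeping relative distances within the trained range. The paper additionally bounds the effective relative distance; that detail is omitted, and the window sizes are illustrative.

  ```python
  import torch

  def lambda_mask(seq_len: int, n_global: int = 10, n_local: int = 2048):
      """Boolean (query, key) mask: causal AND (leading tokens OR local window)."""
      i = torch.arange(seq_len)[:, None]  # query positions
      j = torch.arange(seq_len)[None, :]  # key positions
      return (j <= i) & ((j < n_global) | (i - j < n_local))

  mask = lambda_mask(4096)  # True where attention is allowed
  ```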
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models, arXiv 2309.14509 (arxiv, pdf), citations: -1. Sam Ade Jacobs, Masahiro Tanaka, Chengming Zhang, Minjia Zhang, Shuaiwen Leon Song, Samyam Rajbhandari, Yuxiong He
- YaRN: Efficient Context Window Extension of Large Language Models, arXiv 2309.00071 (arxiv, pdf), citations: 9. Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole · (yarn - jquesnelle) · (jiqizhixin)
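  A hedged sketch of YaRN's "NTK-by-parts" interpolation: RoPE dimensions whose wavelength is short relative to the training length are left untouched, long-wavelength dimensions are fully interpolated (divided by the scale factor s), and a linear ramp blends the two regimes; a length-dependent temperature additionally rescales attention. The α/β defaults follow the paper's Llama settings; treat everything else as a simplification.

  ```python
  import math
  import torch

  def yarn_inv_freq(head_dim: int = 128, base: float = 10000.0, s: float = 16.0,
                    train_len: int = 4096, alpha: float = 1.0, beta: float = 32.0):
      inv_freq = base ** (-torch.arange(0, head_dim, 2).float() / head_dim)
      # r = how many full rotations each dimension completes over train_len.
      r = train_len * inv_freq / (2 * math.pi)
      # Ramp: 0 -> fully interpolate (divide by s), 1 -> leave unchanged.
      gamma = ((r - alpha) / (beta - alpha)).clamp(0, 1)
      return inv_freq * ((1 - gamma) / s + gamma)

  inv_freq = yarn_inv_freq()
  # YaRN also scales q and k by sqrt(1/t), with sqrt(1/t) = 0.1 * ln(s) + 1:
  qk_scale = 0.1 * math.log(16.0) + 1.0
  ```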
- In-context Autoencoder for Context Compression in a Large Language Model, arXiv 2307.06945 (arxiv, pdf), citations: 4. Tao Ge, Jing Hu, Lei Wang, Xun Wang, Si-Qing Chen, Furu Wei
- Focused Transformer: Contrastive Training for Context Scaling, arXiv 2307.03170 (arxiv, pdf), citations: 12. Szymon Tworkowski, Konrad Staniszewski, Mikołaj Pacek, Yuhuai Wu, Henryk Michalewski, Piotr Miłoś
- Lost in the Middle: How Language Models Use Long Contexts, arXiv 2307.03172 (arxiv, pdf), citations: 64. Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang
- LongNet: Scaling Transformers to 1,000,000,000 Tokens, arXiv 2307.02486 (arxiv, pdf), citations: 15. Jiayu Ding, Shuming Ma, Li Dong, Xingxing Zhang, Shaohan Huang, Wenhui Wang, Nanning Zheng, Furu Wei
- Extending Context Window of Large Language Models via Positional Interpolation, arXiv 2306.15595 (arxiv, pdf), citations: 36. Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian · (qbitai)
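  A minimal sketch of the positional-interpolation trick: rather than extrapolating RoPE to unseen positions, rescale position indices by L_train/L_target so the extended context maps back into the trained range (a short fine-tune then recovers quality).

  ```python
  import torch

  def interpolated_positions(seq_len: int, train_len: int = 2048) -> torch.Tensor:
      """Rescale position ids so they stay within the trained range."""
      scale = min(1.0, train_len / seq_len)
      return torch.arange(seq_len).float() * scale

  pos = interpolated_positions(8192)  # all positions now lie in [0, 2048)
  ```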
- The Impact of Positional Encoding on Length Generalization in Transformers, arXiv 2305.19466 (arxiv, pdf), citations: 5. Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Long-range Language Modeling with Self-retrieval, arXiv 2306.13421 (arxiv, pdf), citations: 3. Ohad Rubin, Jonathan Berant
- Block-State Transformers, arXiv 2306.09539 (arxiv, pdf), citations: 2. Mahan Fathi, Jonathan Pilault, Orhan Firat, Christopher Pal, Pierre-Luc Bacon, Ross Goroshin
- LeanDojo: Theorem Proving with Retrieval-Augmented Language Models, arXiv 2306.15626 (arxiv, pdf), citations: 14. Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar
- GLIMMER: generalized late-interaction memory reranker, arXiv 2306.10231 (arxiv, pdf), citations: 1. Michiel de Jong, Yury Zemlyanskiy, Nicholas FitzGerald, Sumit Sanghai, William W. Cohen, Joshua Ainslie
- Augmenting Language Models with Long-Term Memory, arXiv 2306.07174 (arxiv, pdf), citations: 7. Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei · (aka)
- Sequence Parallelism: Long Sequence Training from System Perspective, arXiv 2105.13120 (arxiv, pdf), citations: 2. Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You
- LLMLingua - microsoft: compresses the prompt and KV cache to speed up LLM inference and sharpen the model's perception of key information, achieving up to 20x compression with minimal performance loss.
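  A usage sketch recalled from the project README; the `PromptCompressor` entry point and `compress_prompt` arguments below are from memory and should be checked against the current repository before use. The context/instruction/question strings are placeholders.

  ```python
  # Hedged usage sketch for LLMLingua: a small scoring LM ranks and drops
  # low-information tokens until the prompt fits a target budget. Argument
  # names are recalled from the README and may have changed; verify them
  # against the repo before relying on this.
  from llmlingua import PromptCompressor

  compressor = PromptCompressor()  # loads a small scoring model by default
  result = compressor.compress_prompt(
      ["(long document text goes here)"],          # context to compress
      instruction="Answer using the document.",
      question="What does the document conclude?",
      target_token=200,                            # rough compressed budget
  )
  print(result["compressed_prompt"])
  ```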
- long-context - abacusai: code and tooling for the Abacus.AI LLM Context Expansion project, including evaluation scripts and benchmark tasks that measure a model's information-retrieval capabilities under context expansion, along with key experimental results and instructions for reproducing and building on them.
- long_llama - cstankonrad: LongLLaMA is a large language model capable of handling long contexts, based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.