Awesome llm misc

Survey

  • From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape, arXiv, 2312.10868, arxiv, pdf, citation: -1

    Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge

  • A Survey of Large Language Models Attribution, arXiv, 2311.03731, arxiv, pdf, citation: -1

    Dongfang Li, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Ziyang Chen, Baotian Hu, Aiguo Wu, Min Zhang · (awesome-llm-attributions - HITsz-TMG) Star

  • On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models, arXiv, 2307.09793, arxiv, pdf, citation: 1

    Sarah Gao, Andrew Kean Gao · (constellation.sites.stanford)

  • A Survey of Large Language Models, arXiv, 2303.18223, arxiv, pdf, citation: 285

    Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong · (LLMSurvey - RUCAIBox) Star

Blogs

Toolkits

  • amazing-openai-api - soulteary Star

    Convert different model APIs into the OpenAI API format out of the box.

  • jan - janhq Star

    Jan is an open source alternative to ChatGPT that runs 100% offline on your computer

  • GPT_API_free - chatanywhere Star

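    Free ChatGPT API key. A free ChatGPT API with GPT-4 API support (free): a free forwarding API for ChatGPT that is usable within mainland China, with a direct connection and no proxy required. It works with software and plugins such as ChatBox, greatly reducing API usage costs, and allows unrestricted chat from within China.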

  • BricksLLM - bricks-cloud Star

    Simplifying LLM ops in production

  • skypilot - skypilot-org Star

    SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.

  • vllm - vllm-project Star

    A high-throughput and memory-efficient inference and serving engine for LLMs

  • langflow - logspace-ai Star

    ⛓️ LangFlow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows.

  • torchscale - microsoft Star

    Foundation Architecture for (M)LLMs

  • LLM-As-Chatbot - deep-diver Star

    LLM as a Chatbot Service

  • Llama-2-Open-Source-LLM-CPU-Inference - kennethleungty Star

    Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A

  • ollama - jmorganca Star

    Get up and running with large language models locally

  • OpenLLM - bentoml Star

    An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease.

  • litellm - BerriAI Star

    Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

  • gpu_poor - RahulSChand Star

    Calculate GPU memory requirement & breakdown for training/inference of LLM models. Supports ggml/bnb quantization

  • leptonai - leptonai Star

    A Pythonic framework to simplify AI service building

  • exllamav2 - turboderp Star

    A fast inference library for running LLMs locally on modern consumer-class GPUs

  • outlines - normal-computing Star

    Generative Model Programming

  • one-api - songquanpeng Star

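    OpenAI key management & redistribution system, using a single API for all LLMs. Supports Azure, Anthropic Claude, Google PaLM 2, Zhipu ChatGLM, Baidu ERNIE Bot, iFlytek Spark, and Alibaba Tongyi Qianwen. It can be used to manage keys for redistribution, ships as a single executable with a prebuilt Docker image for one-click deployment, works out of the box, and features an English UI.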

  • LLaMA2-Accessory - Alpha-VLLM Star

    An Open-source Toolkit for LLM Development

  • Flowise - FlowiseAI Star

    Drag & drop UI to build your customized LLM flow

  • simpleaichat - minimaxir Star

    Python package for easily interfacing with chat apps, with robust features and minimal code complexity.

  • TypeChat - Microsoft Star

    TypeChat is a library that makes it easy to build natural language interfaces using types.

  • petals - bigscience-workshop Star

    🌸 Run large language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

  • chatbox - Bin-Huang Star

    Chatbox is a desktop app for GPT/LLM that supports Windows, Mac, Linux & Web Online

  • h2o-llmstudio - h2oai Star

    H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs

  • LMFlow - OptimalScale Star

    An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Model for All.

  • FlagAI - FlagAI-Open Star

    FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale model.

Unlearning

  • TOFU: A Task of Fictitious Unlearning for LLMs, arXiv, 2401.06121, arxiv, pdf, citation: -1

    Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter

  • Large Language Model Unlearning, arXiv, 2310.10683, arxiv, pdf, citation: -1

    Yuanshun Yao, Xiaojun Xu, Yang Liu

    · (jiqizhixin) · (llm_unlearn - kevinyaobytedance) Star

  • Improving Language Plasticity via Pretraining with Active Forgetting, arXiv, 2307.01163, arxiv, pdf, citation: -1

    Yihong Chen, Kelly Marchisio, Roberta Raileanu, David Ifeoluwa Adelani, Pontus Stenetorp, Sebastian Riedel, Mikel Artetxe

  • Announcing the first Machine Unlearning Challenge – Google Research Blog

Personality

  • Large Language Models Understand and Can be Enhanced by Emotional Stimuli, arXiv, 2307.11760, arxiv, pdf, citation: 6

    Cheng Li, Jindong Wang, Yixuan Zhang, Kaijie Zhu, Wenxin Hou, Jianxun Lian, Fang Luo, Qiang Yang, Xing Xie

  • When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities, arXiv, 2307.16376, arxiv, pdf, citation: 7

    Jin Chen, Zheng Liu, Xu Huang, Chenwang Wu, Qi Liu, Gangwei Jiang, Yuanhao Pu, Yuxuan Lei, Xiaolong Chen, Xingmei Wang

  • Personality Traits in Large Language Models, arXiv, 2307.00184, arxiv, pdf, citation: 17

    Greg Serapio-García, Mustafa Safdari, Clément Crepy, Luning Sun, Stephen Fitz, Peter Romero, Marwa Abdulhai, Aleksandra Faust, Maja Matarić

World Model

Red teaming (safety)

  • MART: Improving LLM Safety with Multi-round Automatic Red-Teaming, arXiv, 2311.07689, arxiv, pdf, citation: -1

    Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao

  • Moral Foundations of Large Language Models, arXiv, 2310.15337, arxiv, pdf, citation: 7

    Marwa Abdulhai, Gregory Serapio-Garcia, Clément Crepy, Daria Valter, John Canny, Natasha Jaques

  • FLIRT: Feedback Loop In-context Red Teaming, arXiv, 2308.04265, arxiv, pdf, citation: 3

    Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta

  • Explore, Establish, Exploit: Red Teaming Language Models from Scratch, arXiv, 2306.09442, arxiv, pdf, citation: 16

    Stephen Casper, Jason Lin, Joe Kwon, Gatlen Culp, Dylan Hadfield-Menell

Forecasting

  • Time-LLM: Time Series Forecasting by Reprogramming Large Language Models, arXiv, 2310.01728, arxiv, pdf, citation: 17

    Ming Jin, Shiyu Wang, Lintao Ma, Zhixuan Chu, James Y. Zhang, Xiaoming Shi, Pin-Yu Chen, Yuxuan Liang, Yuan-Fang Li, Shirui Pan · (time-llm - kimmeen) Star

Chat arena

  • GodMode - smol-ai Star

    AI Chat Browser: Fast, Full webapp access to ChatGPT / Claude / Bard / Bing / Llama2! I use this 20 times a day.

  • ChatALL - sunner Star

    Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, 讯飞星火, 文心一言 and more, discover the best answers

New model

  • BlackMamba — Zyphra

    · (static1.squarespace)

  • Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens, arXiv, 2401.17377, arxiv, pdf, citation: -1

    Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi

  • Transfer Learning for Text Diffusion Models, arXiv, 2401.17181, arxiv, pdf, citation: -1

    Kehang Han, Kathleen Kenealy, Aditya Barua, Noah Fiedel, Noah Constant

  • 🦅 Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages (RWKV-v5)

  • MambaByte: Token-free Selective State Space Model, arXiv, 2401.13660, arxiv, pdf, citation: -1

    Junxiong Wang, Tushaar Gangavarapu, Jing Nathan Yan, Alexander M Rush

  • MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts, arXiv, 2401.04081, arxiv, pdf, citation: -1

    Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Sebastian Jaszczur

  • The Annotated S4

  • Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers

  • Mamba: Linear-Time Sequence Modeling with Selective State Spaces, arXiv, 2312.00752, arxiv, pdf, citation: -1

    Albert Gu, Tri Dao · (mamba - state-spaces) Star

  • TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing, arXiv, 2312.05605, arxiv, pdf, citation: -1

    Aleksandar Terzic, Michael Hersche, Geethan Karunaratne, Luca Benini, Abu Sebastian, Abbas Rahimi

  • GIVT: Generative Infinite-Vocabulary Transformers, arXiv, 2312.02116, arxiv, pdf, citation: -1

    Michael Tschannen, Cian Eastwood, Fabian Mentzer

  • Text Rendering Strategies for Pixel Language Models, arXiv, 2311.00522, arxiv, pdf, citation: -1

    Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott

  • Retentive Network: A Successor to Transformer for Large Language Models, arXiv, 2307.08621, arxiv, pdf, citation: 14

    Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei

  • Copy Is All You Need, arXiv, 2307.06962, arxiv, pdf, citation: 217

    Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao

  • BiPhone: Modeling Inter Language Phonetic Influences in Text, arXiv, 2307.03322, arxiv, pdf, citation: -1

    Abhirut Gupta, Ananya B. Sai, Richard Sproat, Yuri Vasilevski, James S. Ren, Ambarish Jash, Sukhdeep S. Sodhi, Aravindan Raghuveer

  • Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference, arXiv, 2306.12509, arxiv, pdf, citation: 4

    Alessandro Sordoni, Xingdi Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux

  • Backpack Language Models, arXiv, 2305.16765, arxiv, pdf, citation: 4

    John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang · (jiqizhixin) · (mp.weixin.qq)

LLM detection

  • HuRef: HUman-REadable Fingerprint for Large Language Models, arXiv, 2312.04828, arxiv, pdf, citation: -1

    Boyi Zeng, Chenghu Zhou, Xinbing Wang, Zhouhan Lin · (jiqizhixin)

  • LLM-generated-text-detection - thunlp Star

  • Adaptive Text Watermark for Large Language Models, arXiv, 2401.13927, arxiv, pdf, citation: -1

    Yepeng Liu, Yuheng Bu

  • Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text, arXiv, 2401.12070, arxiv, pdf, citation: -1

    Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

  • LLM-as-a-Coauthor: The Challenges of Detecting LLM-Human Mixcase, arXiv, 2401.05952, arxiv, pdf, citation: -1

    Chujie Gao, Dongping Chen, Qihui Zhang, Yue Huang, Yao Wan, Lichao Sun · (MixSet - Dongping-Chen) Star

  • A Survey of Text Watermarking in the Era of Large Language Models, arXiv, 2312.07913, arxiv, pdf, citation: -1

    Aiwei Liu, Leyi Pan, Yijian Lu, Jingjing Li, Xuming Hu, Xi Zhang, Lijie Wen, Irwin King, Hui Xiong, Philip S. Yu · (jiqizhixin)

  • Ghostbuster: Detecting Text Ghostwritten by Large Language Models, arXiv, 2305.15047, arxiv, pdf, citation: 6

    Vivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein · (bair.berkeley)

  • ‘ChatGPT detector’ catches AI-generated papers with unprecedented accuracy

  • GPT detectors are biased against non-native English writers, arXiv, 2304.02819, arxiv, pdf, citation: 42

    Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou

  • Three Bricks to Consolidate Watermarks for Large Language Models, arXiv, 2308.00113, arxiv, pdf, citation: 3

    Pierre Fernandez, Antoine Chaffin, Karim Tit, Vivien Chappelier, Teddy Furon

  • Robust Distortion-free Watermarks for Language Models, arXiv, 2307.15593, arxiv, pdf, citation: 9

    Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, Percy Liang

  • Can AI-Generated Text be Reliably Detected?, arXiv, 2303.11156, arxiv, pdf, citation: 93

    Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi · (mp.weixin.qq)

  • Digital tool spots academic text spawned by ChatGPT with 99% accuracy | The University of Kansas

    · (mp.weixin.qq)

Interpretability

  • Can Large Language Models Understand Context?, arXiv, 2402.00858, arxiv, pdf, citation: -1

    Yilun Zhu, Joel Ruben Antony Moniz, Shruti Bhargava, Jiarui Lu, Dhivya Piraviperumal, Site Li, Yuan Zhang, Hong Yu, Bo-Hsiang Tseng

  • Circuits Updates - January 2024

  • Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models, arXiv, 2401.06102, arxiv, pdf, citation: -1

    Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva

  • Vayu Robotics Blog - Interpretable End-to-End Robot Navigation

  • Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1) — AI Alignment Forum

  • deep learning does approximate Solomonoff induction

  • awesome-llm-interpretability - JShollaj Star

    A curated list of Large Language Model (LLM) Interpretability resources.

  • Challenges with unsupervised LLM knowledge discovery, arXiv, 2312.10029, arxiv, pdf, citation: -1

    Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah

  • Using Captum to Explain Generative Language Models, arXiv, 2312.05491, arxiv, pdf, citation: -1

    Vivek Miglani, Aobo Yang, Aram H. Markosyan, Diego Garcia-Olano, Narine Kokhlikyan

  • Beyond Surface: Probing LLaMA Across Scales and Layers, arXiv, 2312.04333, arxiv, pdf, citation: -1

    Nuo Chen, Ning Wu, Shining Liang, Ming Gong, Linjun Shou, Dongmei Zhang, Jia Li

  • llm-viz - bbycroft Star

    3D Visualization of a GPT-style LLM

  • White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?, arXiv, 2311.13110, arxiv, pdf, citation: -1

    Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, Yi Ma · (mp.weixin.qq)

  • Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models, arXiv, 2311.00871, arxiv, pdf, citation: -1

    Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni · (jiqizhixin)

  • The Generative AI Paradox: "What It Can Create, It May Not Understand", arXiv, 2311.00059, arxiv, pdf, citation: -1

    Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu

  • The Impact of Depth and Width on Transformer Language Model Generalization, arXiv, 2310.19956, arxiv, pdf, citation: -1

    Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen

  • Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations, arXiv, 2310.11207, arxiv, pdf, citation: -1

    Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin

  • Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

    · (qbitai)

  • Representation Engineering: A Top-Down Approach to AI Transparency, arXiv, 2310.01405, arxiv, pdf, citation: 5

    Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski · (representation-engineering - andyzoujm) Star · (mp.weixin.qq)

  • Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models, arXiv, 2309.15098, arxiv, pdf, citation: -1

    Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi

  • Language Modeling Is Compression, arXiv, 2309.10668, arxiv, pdf, citation: 7

    Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau

  • Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT), arXiv, 2309.08968, arxiv, pdf, citation: -1

    Parsa Kavehzadeh, Mojtaba Valipour, Marzieh Tahaei, Ali Ghodsi, Boxing Chen, Mehdi Rezagholizadeh

  • Sparse Autoencoders Find Highly Interpretable Features in Language Models, arXiv, 2309.08600, arxiv, pdf, citation: 5

    Hoagy Cunningham, Aidan Ewart, Logan Riggs, Robert Huben, Lee Sharkey

  • Human Language Understanding & Reasoning

    · (mp.weixin.qq)

  • Do Machine Learning Models Memorize or Generalize?

    · (mp.weixin.qq) · (qbitai)

  • CIMI - Daftstone Star

    · (jiqizhixin)

  • Studying Large Language Model Generalization with Influence Functions, arXiv, 2308.03296, arxiv, pdf, citation: 12

    Roger Grosse, Juhan Bae, Cem Anil, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez

  • Can foundation models label data like humans?

  • Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer, arXiv, 2305.16380, arxiv, pdf, citation: 6

    Yuandong Tian, Yiping Wang, Beidi Chen, Simon Du · (mp.weixin.qq)

  • Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning, arXiv, 2305.14160, arxiv, pdf, citation: -1

    Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun · (qbitai)

Generalization

  • Time is Encoded in the Weights of Finetuned Language Models, arXiv, 2312.13401, arxiv, pdf, citation: -1

    Kai Nylund, Suchin Gururangan, Noah A. Smith

  • Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models, arXiv, 2311.00871, arxiv, pdf, citation: -1

    Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni

LLM editing

  • A Comprehensive Study of Knowledge Editing for Large Language Models, arXiv, 2401.01286, arxiv, pdf, citation: -1

    Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni

  • Evaluating the Ripple Effects of Knowledge Editing in Language Models, arXiv, 2307.12976, arxiv, pdf, citation: 5

    Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, Mor Geva

  • Editing Large Language Models: Problems, Methods, and Opportunities, arXiv, 2305.13172, arxiv, pdf, citation: 12

    Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang · (easyedit - zjunlp) Star

  • ModelEditingPapers - zjunlp Star

    Must-read Papers on Model Editing.

AGI insights

Calibration

  • Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation, arXiv, 2311.08877, arxiv, pdf, citation: -1

    Vaishnavi Shrivastava, Percy Liang, Ananya Kumar

  • Do Large Language Models Know What They Don't Know?, arXiv, 2305.18153, arxiv, pdf, citation: 16

    Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang

Books

Privacy

Misc

Impacts

Course & Tutorial

CUDA

Extra reference