Awesome LLM misc

- Survey
- Toolkits
- Unlearning
- Personality
- World Model
- Red teaming
- Chat arena
- New model
- LLM detection
- Interpretability
- Generalization
- LLM editing
- AGI insights
- Calibration
- Books
- Privacy
- Misc
- Extra reference

Survey

From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape, arXiv, 2312.10868, arxiv, pdf, citation: -1

Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge
A Survey of Large Language Models Attribution, arXiv, 2311.03731, arxiv, pdf, citation: -1

Dongfang Li, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Ziyang Chen, Baotian Hu, Aiguo Wu, Min Zhang · (awesome-llm-attributions - HITsz-TMG)
On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models, arXiv, 2307.09793, arxiv, pdf, citation: 1

Sarah Gao, Andrew Kean Gao · (constellation.sites.stanford)
A Survey of Large Language Models, arXiv, 2303.18223, arxiv, pdf, citation: 285

Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong · (LLMSurvey - RUCAIBox)

Blogs

2023, year of open LLMs
Research Papers in November 2023
AI and Open Source in 2023 - by Sebastian Raschka, PhD
The History of Open-Source LLMs: Imitation and Alignment (Part Three)
Research Papers (October 2023) - by Sebastian Raschka, PhD
A Survey of Techniques for Maximizing LLM Performance
Transformer Taxonomy (the last lit review) | kipply's blog

· (jiqizhixin)
Catching up on the weird world of LLMs

Toolkits

amazing-openai-api - soulteary

Convert different model APIs into the OpenAI API format out of the box.
jan - janhq

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer
GPT_API_free - chatanywhere

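Free ChatGPT API key. A free ChatGPT API with GPT-4 API support (free); a free forwarding API for ChatGPT that is usable within mainland China through a direct connection, with no proxy required. It works with software and plugins such as ChatBox, greatly reducing API costs, and allows unrestricted chat from within China.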
BricksLLM - bricks-cloud

Simplifying LLM ops in production
skypilot - skypilot-org

SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
vllm - vllm-project

A high-throughput and memory-efficient inference and serving engine for LLMs
langflow - logspace-ai

⛓️ LangFlow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows.
torchscale - microsoft

Foundation Architecture for (M)LLMs
LLM-As-Chatbot - deep-diver

LLM as a Chatbot Service
Llama-2-Open-Source-LLM-CPU-Inference - kennethleungty

Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
ollama - jmorganca

Get up and running with large language models locally
OpenLLM - bentoml

An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease.
litellm - BerriAI

Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
gpu_poor - RahulSChand

Calculate GPU memory requirement & breakdown for training/inference of LLM models. Supports ggml/bnb quantization
leptonai - leptonai

A Pythonic framework to simplify AI service building
exllamav2 - turboderp

A fast inference library for running LLMs locally on modern consumer-class GPUs
outlines - normal-computing

Generative Model Programming
one-api - songquanpeng

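OpenAI API management and redistribution system supporting Azure, Anthropic Claude, Google PaLM 2, Zhipu ChatGLM, Baidu ERNIE Bot, iFlytek Spark, and Alibaba Tongyi Qianwen. It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image, and deploys in one click, ready to use out of the box. It uses a single API for all LLMs and features an English UI.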
LLaMA2-Accessory - Alpha-VLLM

An Open-source Toolkit for LLM Development
Flowise - FlowiseAI

Drag & drop UI to build your customized LLM flow
simpleaichat - minimaxir

Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
TypeChat - Microsoft

TypeChat is a library that makes it easy to build natural language interfaces using types.
petals - bigscience-workshop

🌸 Run large language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
chatbox - Bin-Huang

Chatbox is a desktop app for GPT/LLM that supports Windows, Mac, Linux & Web Online
h2o-llmstudio - h2oai

H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs
LMFlow - OptimalScale

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Model for All.
FlagAI - FlagAI-Open

FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale models.

Unlearning

TOFU: A Task of Fictitious Unlearning for LLMs, arXiv, 2401.06121, arxiv, pdf, citation: -1

Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter
Large Language Model Unlearning, arXiv, 2310.10683, arxiv, pdf, citation: -1

Yuanshun Yao, Xiaojun Xu, Yang Liu

· (jiqizhixin) · (llm_unlearn - kevinyaobytedance)
Improving Language Plasticity via Pretraining with Active Forgetting, arXiv, 2307.01163, arxiv, pdf, citation: -1

Yihong Chen, Kelly Marchisio, Roberta Raileanu, David Ifeoluwa Adelani, Pontus Stenetorp, Sebastian Riedel, Mikel Artetxe
Announcing the first Machine Unlearning Challenge – Google Research Blog

Personality

Large Language Models Understand and Can be Enhanced by Emotional Stimuli, arXiv, 2307.11760, arxiv, pdf, citation: 6

Cheng Li, Jindong Wang, Yixuan Zhang, Kaijie Zhu, Wenxin Hou, Jianxun Lian, Fang Luo, Qiang Yang, Xing Xie
When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities, arXiv, 2307.16376, arxiv, pdf, citation: 7

Jin Chen, Zheng Liu, Xu Huang, Chenwang Wu, Qi Liu, Gangwei Jiang, Yuanhao Pu, Yuxuan Lei, Xiaolong Chen, Xingmei Wang
Personality Traits in Large Language Models, arXiv, 2307.00184, arxiv, pdf, citation: 17

Greg Serapio-García, Mustafa Safdari, Clément Crepy, Luning Sun, Stephen Fitz, Peter Romero, Marwa Abdulhai, Aleksandra Faust, Maja Matarić

World Model

The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets, arXiv, 2310.06824, arxiv, pdf, citation: -1

Samuel Marks, Max Tegmark · (mp.weixin.qq)
Language Models Represent Space and Time, arXiv, 2310.02207, arxiv, pdf, citation: 2

Wes Gurnee, Max Tegmark · (world-models - wesg52)
How far are we from AGI?

· (mp.weixin.qq)
OpenAI's "moonshot" takes aim at super AI! LeCun proposes seven stages on the road to AGI, with building a world model coming first https://mp.weixin.qq.com/s?__biz=MzI3MTA0MTk1MA==&mid=2652419855&idx=3&sn=a5f20e0a0061c01e1ec87d4a81c23e68

Red teaming (safety)

MART: Improving LLM Safety with Multi-round Automatic Red-Teaming, arXiv, 2311.07689, arxiv, pdf, citation: -1

Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao
Moral Foundations of Large Language Models, arXiv, 2310.15337, arxiv, pdf, citation: 7

Marwa Abdulhai, Gregory Serapio-Garcia, Clément Crepy, Daria Valter, John Canny, Natasha Jaques
FLIRT: Feedback Loop In-context Red Teaming, arXiv, 2308.04265, arxiv, pdf, citation: 3

Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
Explore, Establish, Exploit: Red Teaming Language Models from Scratch, arXiv, 2306.09442, arxiv, pdf, citation: 16

Stephen Casper, Jason Lin, Joe Kwon, Gatlen Culp, Dylan Hadfield-Menell

Forecasting

Time-LLM: Time Series Forecasting by Reprogramming Large Language Models, arXiv, 2310.01728, arxiv, pdf, citation: 17

Ming Jin, Shiyu Wang, Lintao Ma, Zhixuan Chu, James Y. Zhang, Xiaoming Shi, Pin-Yu Chen, Yuxuan Liang, Yuan-Fang Li, Shirui Pan · (time-llm - kimmeen)

Chat arena

GodMode - smol-ai

AI Chat Browser: Fast, Full webapp access to ChatGPT / Claude / Bard / Bing / Llama2! I use this 20 times a day.
ChatALL - sunner

Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFlytek Spark, ERNIE Bot and more, discover the best answers

New model

BlackMamba — Zyphra

· (static1.squarespace)
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens, arXiv, 2401.17377, arxiv, pdf, citation: -1

Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi
Transfer Learning for Text Diffusion Models, arXiv, 2401.17181, arxiv, pdf, citation: -1

Kehang Han, Kathleen Kenealy, Aditya Barua, Noah Fiedel, Noah Constant
🦅 Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages (RWKV-v5)
MambaByte: Token-free Selective State Space Model, arXiv, 2401.13660, arxiv, pdf, citation: -1

Junxiong Wang, Tushaar Gangavarapu, Jing Nathan Yan, Alexander M Rush
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts, arXiv, 2401.04081, arxiv, pdf, citation: -1

Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Sebastian Jaszczur
The Annotated S4
Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers
Mamba: Linear-Time Sequence Modeling with Selective State Spaces, arXiv, 2312.00752, arxiv, pdf, citation: -1

Albert Gu, Tri Dao · (mamba - state-spaces)
TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing, arXiv, 2312.05605, arxiv, pdf, citation: -1

Aleksandar Terzic, Michael Hersche, Geethan Karunaratne, Luca Benini, Abu Sebastian, Abbas Rahimi
GIVT: Generative Infinite-Vocabulary Transformers, arXiv, 2312.02116, arxiv, pdf, citation: -1

Michael Tschannen, Cian Eastwood, Fabian Mentzer
Text Rendering Strategies for Pixel Language Models, arXiv, 2311.00522, arxiv, pdf, citation: -1

Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott
Retentive Network: A Successor to Transformer for Large Language Models, arXiv, 2307.08621, arxiv, pdf, citation: 14

Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei
Copy Is All You Need, arXiv, 2307.06962, arxiv, pdf, citation: 217

Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao
BiPhone: Modeling Inter Language Phonetic Influences in Text, arXiv, 2307.03322, arxiv, pdf, citation: -1

Abhirut Gupta, Ananya B. Sai, Richard Sproat, Yuri Vasilevski, James S. Ren, Ambarish Jash, Sukhdeep S. Sodhi, Aravindan Raghuveer
Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference, arXiv, 2306.12509, arxiv, pdf, citation: 4

Alessandro Sordoni, Xingdi Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux
Backpack Language Models, arXiv, 2305.16765, arxiv, pdf, citation: 4

John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang · (jiqizhixin) · (mp.weixin.qq)

LLM detection

HuRef: HUman-REadable Fingerprint for Large Language Models, arXiv, 2312.04828, arxiv, pdf, citation: -1

Boyi Zeng, Chenghu Zhou, Xinbing Wang, Zhouhan Lin · (jiqizhixin)
LLM-generated-text-detection - thunlp
Adaptive Text Watermark for Large Language Models, arXiv, 2401.13927, arxiv, pdf, citation: -1

Yepeng Liu, Yuheng Bu
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text, arXiv, 2401.12070, arxiv, pdf, citation: -1

Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
LLM-as-a-Coauthor: The Challenges of Detecting LLM-Human Mixcase, arXiv, 2401.05952, arxiv, pdf, citation: -1

Chujie Gao, Dongping Chen, Qihui Zhang, Yue Huang, Yao Wan, Lichao Sun · (MixSet - Dongping-Chen)
A Survey of Text Watermarking in the Era of Large Language Models, arXiv, 2312.07913, arxiv, pdf, citation: -1

Aiwei Liu, Leyi Pan, Yijian Lu, Jingjing Li, Xuming Hu, Xi Zhang, Lijie Wen, Irwin King, Hui Xiong, Philip S. Yu · (jiqizhixin)
Ghostbuster: Detecting Text Ghostwritten by Large Language Models, arXiv, 2305.15047, arxiv, pdf, citation: 6

Vivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein · (bair.berkeley)
‘ChatGPT detector’ catches AI-generated papers with unprecedented accuracy
GPT detectors are biased against non-native English writers, arXiv, 2304.02819, arxiv, pdf, citation: 42

Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
Three Bricks to Consolidate Watermarks for Large Language Models, arXiv, 2308.00113, arxiv, pdf, citation: 3

Pierre Fernandez, Antoine Chaffin, Karim Tit, Vivien Chappelier, Teddy Furon
Robust Distortion-free Watermarks for Language Models, arXiv, 2307.15593, arxiv, pdf, citation: 9

Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, Percy Liang
Can AI-Generated Text be Reliably Detected?, arXiv, 2303.11156, arxiv, pdf, citation: 93

Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi · (mp.weixin.qq)
Digital tool spots academic text spawned by ChatGPT with 99% accuracy | The University of Kansas

· (mp.weixin.qq)

Interpretability

Can Large Language Models Understand Context?, arXiv, 2402.00858, arxiv, pdf, citation: -1

Yilun Zhu, Joel Ruben Antony Moniz, Shruti Bhargava, Jiarui Lu, Dhivya Piraviperumal, Site Li, Yuan Zhang, Hong Yu, Bo-Hsiang Tseng
Circuits Updates - January 2024
Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models, arXiv, 2401.06102, arxiv, pdf, citation: -1

Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva
Vayu Robotics Blog - Interpretable End-to-End Robot Navigation
Fact Finding: Attempting to Reverse-Engineer Factual Recall on the Neuron Level (Post 1) — AI Alignment Forum
deep learning does approximate Solomonoff induction
awesome-llm-interpretability - JShollaj

A curated list of Large Language Model (LLM) Interpretability resources.
Challenges with unsupervised LLM knowledge discovery, arXiv, 2312.10029, arxiv, pdf, citation: -1

Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah
Using Captum to Explain Generative Language Models, arXiv, 2312.05491, arxiv, pdf, citation: -1

Vivek Miglani, Aobo Yang, Aram H. Markosyan, Diego Garcia-Olano, Narine Kokhlikyan
Beyond Surface: Probing LLaMA Across Scales and Layers, arXiv, 2312.04333, arxiv, pdf, citation: -1

Nuo Chen, Ning Wu, Shining Liang, Ming Gong, Linjun Shou, Dongmei Zhang, Jia Li
llm-viz - bbycroft

3D Visualization of a GPT-style LLM
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?, arXiv, 2311.13110, arxiv, pdf, citation: -1

Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, Yi Ma · (mp.weixin.qq)
Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models, arXiv, 2311.00871, arxiv, pdf, citation: -1

Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni · (jiqizhixin)
The Generative AI Paradox: "What It Can Create, It May Not Understand", arXiv, 2311.00059, arxiv, pdf, citation: -1

Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu
The Impact of Depth and Width on Transformer Language Model Generalization, arXiv, 2310.19956, arxiv, pdf, citation: -1

Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations, arXiv, 2310.11207, arxiv, pdf, citation: -1

Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

· (qbitai)
Representation Engineering: A Top-Down Approach to AI Transparency, arXiv, 2310.01405, arxiv, pdf, citation: 5

Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski · (representation-engineering - andyzoujm) · (mp.weixin.qq)
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models, arXiv, 2309.15098, arxiv, pdf, citation: -1

Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi
Language Modeling Is Compression, arXiv, 2309.10668, arxiv, pdf, citation: 7

Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT), arXiv, 2309.08968, arxiv, pdf, citation: -1

Parsa Kavehzadeh, Mojtaba Valipour, Marzieh Tahaei, Ali Ghodsi, Boxing Chen, Mehdi Rezagholizadeh
Sparse Autoencoders Find Highly Interpretable Features in Language Models, arXiv, 2309.08600, arxiv, pdf, citation: 5

Hoagy Cunningham, Aidan Ewart, Logan Riggs, Robert Huben, Lee Sharkey
Human Language Understanding & Reasoning

· (mp.weixin.qq)
Do Machine Learning Models Memorize or Generalize?

· (mp.weixin.qq) · (qbitai)
CIMI - Daftstone

· (jiqizhixin)
Studying Large Language Model Generalization with Influence Functions, arXiv, 2308.03296, arxiv, pdf, citation: 12

Roger Grosse, Juhan Bae, Cem Anil, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez
Can foundation models label data like humans?
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer, arXiv, 2305.16380, arxiv, pdf, citation: 6

Yuandong Tian, Yiping Wang, Beidi Chen, Simon Du · (mp.weixin.qq)
Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning, arXiv, 2305.14160, arxiv, pdf, citation: -1

Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun · (qbitai)

Generalization

Time is Encoded in the Weights of Finetuned Language Models, arXiv, 2312.13401, arxiv, pdf, citation: -1

Kai Nylund, Suchin Gururangan, Noah A. Smith
Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models, arXiv, 2311.00871, arxiv, pdf, citation: -1

Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni

LLM editing

A Comprehensive Study of Knowledge Editing for Large Language Models, arXiv, 2401.01286, arxiv, pdf, citation: -1

Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni
Evaluating the Ripple Effects of Knowledge Editing in Language Models, arXiv, 2307.12976, arxiv, pdf, citation: 5

Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, Mor Geva
Editing Large Language Models: Problems, Methods, and Opportunities, arXiv, 2305.13172, arxiv, pdf, citation: 12

Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang · (easyedit - zjunlp)
ModelEditingPapers - zjunlp

Must-read Papers on Model Editing.

AGI insights

Self-driving as a case study for AGI
Perspectives on the State and Future of Deep Learning -- 2023, arXiv, 2312.09323, arxiv, pdf, citation: -1

Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson
AI and Open Source in 2023 - by Sebastian Raschka, PhD

· (mp.weixin.qq)
Some intuitions about large language models
Role play with large language models | Nature

· (qbitai)
Levels of AGI: Operationalizing Progress on the Path to AGI, arXiv, 2311.02462, arxiv, pdf, citation: -1

Meredith Ringel Morris, Jascha Sohl-dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, Shane Legg
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness, arXiv, 2308.08708, arxiv, pdf, citation: 15

Patrick Butlin, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji · (jiqizhixin)
Collective Intelligence for Deep Learning: A Survey of Recent Developments | 大トロ
Good questions matter more than good answers | Harry Shum's five questions on large models

Calibration

Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation, arXiv, 2311.08877, arxiv, pdf, citation: -1

Vaishnavi Shrivastava, Percy Liang, Ananya Kumar
Do Large Language Models Know What They Don't Know?, arXiv, 2305.18153, arxiv, pdf, citation: 16

Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang

Books

Large Language Models: From Theory to Practice (大规模语言模型：从理论到实践)

Privacy

PIISA

Misc

Prompt2Model: Generating Deployable Models from Natural Language Instructions, arXiv, 2308.12261, arxiv, pdf, citation: -1

Vijay Viswanathan, Chenyang Zhao, Amanda Bertsch, Tongshuang Wu, Graham Neubig · (mp.weixin.qq)
xVal: A Continuous Number Encoding for Large Language Models, arXiv, 2310.02989, arxiv, pdf, citation: -1

Siavash Golkar, Mariel Pettee, Michael Eickenberg, Alberto Bietti, Miles Cranmer, Geraud Krawezik, Francois Lanusse, Michael McCabe, Ruben Ohana, Liam Parker · (mp.weixin.qq)
GraphGPT: Graph Instruction Tuning for Large Language Models, arXiv, 2310.13023, arxiv, pdf, citation: 2

Jiabin Tang, Yuhao Yang, Wei Wei, Lei Shi, Lixin Su, Suqi Cheng, Dawei Yin, Chao Huang · (mp.weixin.qq)
A taxonomy and review of generalization research in NLP | Nature Machine Intelligence
Neurons in Large Language Models: Dead, N-gram, Positional, arXiv, 2309.04827, arxiv, pdf, citation: -1

Elena Voita, Javier Ferrando, Christoforos Nalmpantis
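ACL 2023 best papers announced! CMU, Xi'an Jiaotong University and others take top honors, and Chinese scholars account for half of the outstanding paper awards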

Impacts

New MIT study: workers need not worry about being replaced by AI! The cost is very high, and only 23% of vision work is replaceable

Course & Tutorial

LLMs-from-scratch - rasbt

Implementing a ChatGPT-like LLM from scratch, step by step
MachineLearning-QandAI-book - rasbt

Machine Learning Q and AI book
ML-YouTube-Courses - dair-ai

📺 Discover the latest machine learning / AI courses on YouTube.
llm-course - mlabonne

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
[1hr Talk] Intro to Large Language Models - YouTube

· (drive.google) · (drive.google)

· (mp.weixin.qq)
ML 2023 Spring
A quick 80-minute introduction to large language models (Jujutsu Kaisen spoiler at 5:30) - YouTube
Stanford CS224N: Natural Language Processing with Deep Learning | 2023 - YouTube

CUDA

Introduce CUDA in a way that will be accessible to Python folks

· (youtu)

Extra reference

how-to-optim-algorithm-in-cuda - BBuf

How to optimize some algorithms in CUDA.
MLC-LLM support for RWKV-5 inference, and some thoughts on RWKV-5
