-
From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape,
arXiv, 2312.10868
, arxiv, pdf, citation: -1
Timothy R. McIntosh, Teo Susnjak, Tong Liu, Paul Watters, Malka N. Halgamuge
-
A Survey of Large Language Models Attribution,
arXiv, 2311.03731
, arxiv, pdf, citation: -1
Dongfang Li, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Ziyang Chen, Baotian Hu, Aiguo Wu, Min Zhang · (awesome-llm-attributions - HITsz-TMG)
-
On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models,
arXiv, 2307.09793
, arxiv, pdf, citation: 1
Sarah Gao, Andrew Kean Gao · (constellation.sites.stanford)
-
A Survey of Large Language Models,
arXiv, 2303.18223
, arxiv, pdf, citation: 285
Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong · (LLMSurvey - RUCAIBox)
-
The History of Open-Source LLMs: Imitation and Alignment (Part Three)
-
Transformer Taxonomy (the last lit review) | kipply's blog
· (jiqizhixin)
-
amazing-openai-api - soulteary
Convert different model APIs into the OpenAI API format out of the box.
-
jan - janhq
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer
-
GPT_API_free - chatanywhere
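Free ChatGPT API key. Free ChatGPT API with support for the GPT-4 API (free); a free forwarding API for ChatGPT that is usable in mainland China over a direct connection, with no proxy required. It can be paired with software and plugins such as ChatBox, greatly reducing API usage costs, and allows unrestricted chatting from within China.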
-
BricksLLM - bricks-cloud
Simplifying LLM ops in production
-
skypilot - skypilot-org
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
-
vllm - vllm-project
A high-throughput and memory-efficient inference and serving engine for LLMs
-
langflow - logspace-ai
⛓️ LangFlow is a UI for LangChain, designed with react-flow to provide an effortless way to experiment and prototype flows.
-
torchscale - microsoft
Foundation Architecture for (M)LLMs
-
LLM-As-Chatbot - deep-diver
LLM as a Chatbot Service
-
Llama-2-Open-Source-LLM-CPU-Inference - kennethleungty
Running Llama 2 and other Open-Source LLMs on CPU Inference Locally for Document Q&A
-
ollama - jmorganca
Get up and running with large language models locally
-
OpenLLM - bentoml
An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease.
-
litellm - BerriAI
Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
-
gpu_poor - RahulSChand
Calculate GPU memory requirement & breakdown for training/inference of LLM models. Supports ggml/bnb quantization
-
leptonai - leptonai
A Pythonic framework to simplify AI service building
-
exllamav2 - turboderp
A fast inference library for running LLMs locally on modern consumer-class GPUs
-
outlines - normal-computing
Generative Model Programming
-
one-api - songquanpeng
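OpenAI API key management & redistribution system, using a single API for all LLMs. Supports Azure, Anthropic Claude, Google PaLM 2, Zhipu ChatGLM, Baidu ERNIE Bot (文心一言), iFlytek Spark (讯飞星火), and Alibaba Tongyi Qianwen (通义千问). It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image for one-click deployment that works out of the box, and features an English UI.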
-
LLaMA2-Accessory - Alpha-VLLM
An Open-source Toolkit for LLM Development
-
Flowise - FlowiseAI
Drag & drop UI to build your customized LLM flow
-
simpleaichat - minimaxir
Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
-
TypeChat - Microsoft
TypeChat is a library that makes it easy to build natural language interfaces using types.
-
petals - bigscience-workshop
🌸 Run large language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
-
chatbox - Bin-Huang
Chatbox is a desktop app for GPT/LLM that supports Windows, Mac, Linux & Web Online
-
h2o-llmstudio - h2oai
H2O LLM Studio - a framework and no-code GUI for fine-tuning LLMs
-
LMFlow - OptimalScale
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Model for All.
-
FlagAI - FlagAI-Open
FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensible toolkit for large-scale model.
-
TOFU: A Task of Fictitious Unlearning for LLMs,
arXiv, 2401.06121
, arxiv, pdf, citation: -1
Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter
-
Large Language Model Unlearning,
arXiv, 2310.10683
, arxiv, pdf, citation: -1
Yuanshun Yao, Xiaojun Xu, Yang Liu
· (jiqizhixin) · (llm_unlearn - kevinyaobytedance)
-
Improving Language Plasticity via Pretraining with Active Forgetting,
arXiv, 2307.01163
, arxiv, pdf, citation: -1
Yihong Chen, Kelly Marchisio, Roberta Raileanu, David Ifeoluwa Adelani, Pontus Stenetorp, Sebastian Riedel, Mikel Artetxe
-
Announcing the first Machine Unlearning Challenge – Google Research Blog
-
Large Language Models Understand and Can be Enhanced by Emotional Stimuli,
arXiv, 2307.11760
, arxiv, pdf, citation: 6
Cheng Li, Jindong Wang, Yixuan Zhang, Kaijie Zhu, Wenxin Hou, Jianxun Lian, Fang Luo, Qiang Yang, Xing Xie
-
When Large Language Models Meet Personalization: Perspectives of Challenges and Opportunities,
arXiv, 2307.16376
, arxiv, pdf, citation: 7
Jin Chen, Zheng Liu, Xu Huang, Chenwang Wu, Qi Liu, Gangwei Jiang, Yuanhao Pu, Yuxuan Lei, Xiaolong Chen, Xingmei Wang
-
Personality Traits in Large Language Models,
arXiv, 2307.00184
, arxiv, pdf, citation: 17
Greg Serapio-García, Mustafa Safdari, Clément Crepy, Luning Sun, Stephen Fitz, Peter Romero, Marwa Abdulhai, Aleksandra Faust, Maja Matarić
-
The Geometry of Truth: Emergent Linear Structure in Large Language Model Representations of True/False Datasets,
arXiv, 2310.06824
, arxiv, pdf, citation: -1
Samuel Marks, Max Tegmark · (mp.weixin.qq)
-
Language Models Represent Space and Time,
arXiv, 2310.02207
, arxiv, pdf, citation: 2
Wes Gurnee, Max Tegmark · (world-models - wesg52)
-
OpenAI's "moonshot" takes aim at super AI! LeCun proposes seven stages on the road to AGI, with building a world model as the first
https://mp.weixin.qq.com/s?__biz=MzI3MTA0MTk1MA==&mid=2652419855&idx=3&sn=a5f20e0a0061c01e1ec87d4a81c23e68
-
MART: Improving LLM Safety with Multi-round Automatic Red-Teaming,
arXiv, 2311.07689
, arxiv, pdf, citation: -1
Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao
-
Moral Foundations of Large Language Models,
arXiv, 2310.15337
, arxiv, pdf, citation: 7
Marwa Abdulhai, Gregory Serapio-Garcia, Clément Crepy, Daria Valter, John Canny, Natasha Jaques
-
FLIRT: Feedback Loop In-context Red Teaming,
arXiv, 2308.04265
, arxiv, pdf, citation: 3
Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta
-
Explore, Establish, Exploit: Red Teaming Language Models from Scratch,
arXiv, 2306.09442
, arxiv, pdf, citation: 16
Stephen Casper, Jason Lin, Joe Kwon, Gatlen Culp, Dylan Hadfield-Menell
-
Time-LLM: Time Series Forecasting by Reprogramming Large Language Models,
arXiv, 2310.01728
, arxiv, pdf, citation: 17
Ming Jin, Shiyu Wang, Lintao Ma, Zhixuan Chu, James Y. Zhang, Xiaoming Shi, Pin-Yu Chen, Yuxuan Liang, Yuan-Fang Li, Shirui Pan · (time-llm - kimmeen)
-
GodMode - smol-ai
AI Chat Browser: Fast, Full webapp access to ChatGPT / Claude / Bard / Bing / Llama2! I use this 20 times a day.
-
ChatALL - sunner
Concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), and more to discover the best answers
-
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens,
arXiv, 2401.17377
, arxiv, pdf, citation: -1
Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi
-
Transfer Learning for Text Diffusion Models,
arXiv, 2401.17181
, arxiv, pdf, citation: -1
Kehang Han, Kathleen Kenealy, Aditya Barua, Noah Fiedel, Noah Constant
-
🦅 Eagle 7B : Soaring past Transformers with 1 Trillion Tokens Across 100+ Languages (RWKV-v5)
-
MambaByte: Token-free Selective State Space Model,
arXiv, 2401.13660
, arxiv, pdf, citation: -1
Junxiong Wang, Tushaar Gangavarapu, Jing Nathan Yan, Alexander M Rush
-
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts,
arXiv, 2401.04081
, arxiv, pdf, citation: -1
Maciej Pióro, Kamil Ciebiera, Krystian Król, Jan Ludziejewski, Sebastian Jaszczur
-
Mamba: Linear-Time Sequence Modeling with Selective State Spaces,
arXiv, 2312.00752
, arxiv, pdf, citation: -1
Albert Gu, Tri Dao · (mamba - state-spaces)
-
TCNCA: Temporal Convolution Network with Chunked Attention for Scalable Sequence Processing,
arXiv, 2312.05605
, arxiv, pdf, citation: -1
Aleksandar Terzic, Michael Hersche, Geethan Karunaratne, Luca Benini, Abu Sebastian, Abbas Rahimi
-
GIVT: Generative Infinite-Vocabulary Transformers,
arXiv, 2312.02116
, arxiv, pdf, citation: -1
Michael Tschannen, Cian Eastwood, Fabian Mentzer
-
Text Rendering Strategies for Pixel Language Models,
arXiv, 2311.00522
, arxiv, pdf, citation: -1
Jonas F. Lotz, Elizabeth Salesky, Phillip Rust, Desmond Elliott
-
Retentive Network: A Successor to Transformer for Large Language Models,
arXiv, 2307.08621
, arxiv, pdf, citation: 14
Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei
-
Copy Is All You Need,
arXiv, 2307.06962
, arxiv, pdf, citation: 217
Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao
-
BiPhone: Modeling Inter Language Phonetic Influences in Text,
arXiv, 2307.03322
, arxiv, pdf, citation: -1
Abhirut Gupta, Ananya B. Sai, Richard Sproat, Yuri Vasilevski, James S. Ren, Ambarish Jash, Sukhdeep S. Sodhi, Aravindan Raghuveer
-
Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference,
arXiv, 2306.12509
, arxiv, pdf, citation: 4
Alessandro Sordoni, Xingdi Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux
-
Backpack Language Models,
arXiv, 2305.16765
, arxiv, pdf, citation: 4
John Hewitt, John Thickstun, Christopher D. Manning, Percy Liang · (jiqizhixin) · (mp.weixin.qq)
-
HuRef: HUman-REadable Fingerprint for Large Language Models,
arXiv, 2312.04828
, arxiv, pdf, citation: -1
Boyi Zeng, Chenghu Zhou, Xinbing Wang, Zhouhan Lin · (jiqizhixin)
-
LLM-generated-text-detection - thunlp
-
Adaptive Text Watermark for Large Language Models,
arXiv, 2401.13927
, arxiv, pdf, citation: -1
Yepeng Liu, Yuheng Bu
-
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text,
arXiv, 2401.12070
, arxiv, pdf, citation: -1
Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein
-
LLM-as-a-Coauthor: The Challenges of Detecting LLM-Human Mixcase,
arXiv, 2401.05952
, arxiv, pdf, citation: -1
Chujie Gao, Dongping Chen, Qihui Zhang, Yue Huang, Yao Wan, Lichao Sun · (MixSet - Dongping-Chen)
-
A Survey of Text Watermarking in the Era of Large Language Models,
arXiv, 2312.07913
, arxiv, pdf, citation: -1
Aiwei Liu, Leyi Pan, Yijian Lu, Jingjing Li, Xuming Hu, Xi Zhang, Lijie Wen, Irwin King, Hui Xiong, Philip S. Yu · (jiqizhixin)
-
Ghostbuster: Detecting Text Ghostwritten by Large Language Models,
arXiv, 2305.15047
, arxiv, pdf, citation: 6
Vivek Verma, Eve Fleisig, Nicholas Tomlin, Dan Klein · (bair.berkeley)
-
‘ChatGPT detector’ catches AI-generated papers with unprecedented accuracy
-
GPT detectors are biased against non-native English writers,
arXiv, 2304.02819
, arxiv, pdf, citation: 42
Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
-
Three Bricks to Consolidate Watermarks for Large Language Models,
arXiv, 2308.00113
, arxiv, pdf, citation: 3
Pierre Fernandez, Antoine Chaffin, Karim Tit, Vivien Chappelier, Teddy Furon
-
Robust Distortion-free Watermarks for Language Models,
arXiv, 2307.15593
, arxiv, pdf, citation: 9
Rohith Kuditipudi, John Thickstun, Tatsunori Hashimoto, Percy Liang
-
Can AI-Generated Text be Reliably Detected?,
arXiv, 2303.11156
, arxiv, pdf, citation: 93
Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi · (mp.weixin.qq)
-
Digital tool spots academic text spawned by ChatGPT with 99% accuracy | The University of Kansas
· (mp.weixin.qq)
-
Can Large Language Models Understand Context?,
arXiv, 2402.00858
, arxiv, pdf, citation: -1
Yilun Zhu, Joel Ruben Antony Moniz, Shruti Bhargava, Jiarui Lu, Dhivya Piraviperumal, Site Li, Yuan Zhang, Hong Yu, Bo-Hsiang Tseng
-
Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models,
arXiv, 2401.06102
, arxiv, pdf, citation: -1
Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva
-
Vayu Robotics Blog - Interpretable End-to-End Robot Navigation
-
awesome-llm-interpretability - JShollaj
A curated list of Large Language Model (LLM) Interpretability resources.
-
Challenges with unsupervised LLM knowledge discovery,
arXiv, 2312.10029
, arxiv, pdf, citation: -1
Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah
-
Using Captum to Explain Generative Language Models,
arXiv, 2312.05491
, arxiv, pdf, citation: -1
Vivek Miglani, Aobo Yang, Aram H. Markosyan, Diego Garcia-Olano, Narine Kokhlikyan
-
Beyond Surface: Probing LLaMA Across Scales and Layers,
arXiv, 2312.04333
, arxiv, pdf, citation: -1
Nuo Chen, Ning Wu, Shining Liang, Ming Gong, Linjun Shou, Dongmei Zhang, Jia Li
-
llm-viz - bbycroft
3D Visualization of a GPT-style LLM
-
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is?,
arXiv, 2311.13110
, arxiv, pdf, citation: -1
Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, Yi Ma · (mp.weixin.qq)
-
Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models,
arXiv, 2311.00871
, arxiv, pdf, citation: -1
Steve Yadlowsky, Lyric Doshi, Nilesh Tripuraneni · (jiqizhixin)
-
The Generative AI Paradox: "What It Can Create, It May Not Understand",
arXiv, 2311.00059
, arxiv, pdf, citation: -1
Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu
-
The Impact of Depth and Width on Transformer Language Model Generalization,
arXiv, 2310.19956
, arxiv, pdf, citation: -1
Jackson Petty, Sjoerd van Steenkiste, Ishita Dasgupta, Fei Sha, Dan Garrette, Tal Linzen
-
Can Large Language Models Explain Themselves? A Study of LLM-Generated Self-Explanations,
arXiv, 2310.11207
, arxiv, pdf, citation: -1
Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin
-
Towards Monosemanticity: Decomposing Language Models With Dictionary Learning
· (qbitai)
-
Representation Engineering: A Top-Down Approach to AI Transparency,
arXiv, 2310.01405
, arxiv, pdf, citation: 5
Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski · (representation-engineering - andyzoujm)
· (mp.weixin.qq)
-
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models,
arXiv, 2309.15098
, arxiv, pdf, citation: -1
Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi
-
Language Modeling Is Compression,
arXiv, 2309.10668
, arxiv, pdf, citation: 7
Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau
-
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT),
arXiv, 2309.08968
, arxiv, pdf, citation: -1
Parsa Kavehzadeh, Mojtaba Valipour, Marzieh Tahaei, Ali Ghodsi, Boxing Chen, Mehdi Rezagholizadeh
-
Sparse Autoencoders Find Highly Interpretable Features in Language Models,
arXiv, 2309.08600
, arxiv, pdf, citation: 5
Hoagy Cunningham, Aidan Ewart, Logan Riggs, Robert Huben, Lee Sharkey
-
Human Language Understanding & Reasoning
· (mp.weixin.qq)
-
Do Machine Learning Models Memorize or Generalize?
· (mp.weixin.qq) · (qbitai)
-
CIMI - Daftstone
· (jiqizhixin)
-
Studying Large Language Model Generalization with Influence Functions,
arXiv, 2308.03296
, arxiv, pdf, citation: 12
Roger Grosse, Juhan Bae, Cem Anil, Nelson Elhage, Alex Tamkin, Amirhossein Tajdini, Benoit Steiner, Dustin Li, Esin Durmus, Ethan Perez
-
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer,
arXiv, 2305.16380
, arxiv, pdf, citation: 6
Yuandong Tian, Yiping Wang, Beidi Chen, Simon Du · (mp.weixin.qq)
-
Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning,
arXiv, 2305.14160
, arxiv, pdf, citation: -1
Lean Wang, Lei Li, Damai Dai, Deli Chen, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun · (qbitai)
-
Time is Encoded in the Weights of Finetuned Language Models,
arXiv, 2312.13401
, arxiv, pdf, citation: -1
Kai Nylund, Suchin Gururangan, Noah A. Smith
-
A Comprehensive Study of Knowledge Editing for Large Language Models,
arXiv, 2401.01286
, arxiv, pdf, citation: -1
Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni
-
Evaluating the Ripple Effects of Knowledge Editing in Language Models,
arXiv, 2307.12976
, arxiv, pdf, citation: 5
Roi Cohen, Eden Biran, Ori Yoran, Amir Globerson, Mor Geva
-
Editing Large Language Models: Problems, Methods, and Opportunities,
arXiv, 2305.13172
, arxiv, pdf, citation: 12
Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang · (easyedit - zjunlp)
-
ModelEditingPapers - zjunlp
Must-read Papers on Model Editing.
-
Perspectives on the State and Future of Deep Learning -- 2023,
arXiv, 2312.09323
, arxiv, pdf, citation: -1
Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson
-
AI and Open Source in 2023 - by Sebastian Raschka, PhD
· (mp.weixin.qq)
-
Role play with large language models | Nature
· (qbitai)
-
Levels of AGI: Operationalizing Progress on the Path to AGI,
arXiv, 2311.02462
, arxiv, pdf, citation: -1
Meredith Ringel Morris, Jascha Sohl-dickstein, Noah Fiedel, Tris Warkentin, Allan Dafoe, Aleksandra Faust, Clement Farabet, Shane Legg
-
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness,
arXiv, 2308.08708
, arxiv, pdf, citation: 15
Patrick Butlin, Robert Long, Eric Elmoznino, Yoshua Bengio, Jonathan Birch, Axel Constant, George Deane, Stephen M. Fleming, Chris Frith, Xu Ji · (jiqizhixin)
-
Collective Intelligence for Deep Learning: A Survey of Recent Developments | 大トロ
-
Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation,
arXiv, 2311.08877
, arxiv, pdf, citation: -1
Vaishnavi Shrivastava, Percy Liang, Ananya Kumar
-
Do Large Language Models Know What They Don't Know?,
arXiv, 2305.18153
, arxiv, pdf, citation: 16
Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang
-
Prompt2Model: Generating Deployable Models from Natural Language Instructions,
arXiv, 2308.12261
, arxiv, pdf, citation: -1
Vijay Viswanathan, Chenyang Zhao, Amanda Bertsch, Tongshuang Wu, Graham Neubig · (mp.weixin.qq)
-
xVal: A Continuous Number Encoding for Large Language Models,
arXiv, 2310.02989
, arxiv, pdf, citation: -1
Siavash Golkar, Mariel Pettee, Michael Eickenberg, Alberto Bietti, Miles Cranmer, Geraud Krawezik, Francois Lanusse, Michael McCabe, Ruben Ohana, Liam Parker · (mp.weixin.qq)
-
GraphGPT: Graph Instruction Tuning for Large Language Models,
arXiv, 2310.13023
, arxiv, pdf, citation: 2
Jiabin Tang, Yuhao Yang, Wei Wei, Lei Shi, Lixin Su, Suqi Cheng, Dawei Yin, Chao Huang · (mp.weixin.qq)
-
A taxonomy and review of generalization research in NLP | Nature Machine Intelligence
-
Neurons in Large Language Models: Dead, N-gram, Positional,
arXiv, 2309.04827
, arxiv, pdf, citation: -1
Elena Voita, Javier Ferrando, Christoforos Nalmpantis
-
LLMs-from-scratch - rasbt
Implementing a ChatGPT-like LLM from scratch, step by step
-
MachineLearning-QandAI-book - rasbt
Machine Learning Q and AI book
-
ML-YouTube-Courses - dair-ai
📺 Discover the latest machine learning / AI courses on YouTube.
-
llm-course - mlabonne
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
-
[1hr Talk] Intro to Large Language Models - YouTube
· (drive.google) · (drive.google)
· (mp.weixin.qq)
-
Stanford CS224N: Natural Language Processing with Deep Learning | 2023 - YouTube
-
how-to-optim-algorithm-in-cuda - BBuf
How to optimize some algorithms in CUDA.