Awesome LLM reasoning

Survey

  • A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook, arXiv, 2312.11562, arxiv, pdf, cication: -1

    Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, Hongyang Li, Mengzhe Geng · (Awesome-Reasoning-Foundation-Models - reasoning-survey) Star · (mp.weixin.qq)

Reasoning

  • CogGPT: Unleashing the Power of Cognitive Dynamics on Large Language Models, arXiv, 2401.08438, arxiv, pdf, cication: -1

    Yaojia Lv, Haojie Pan, Ruiji Fu, Ming Liu, Zhongyuan Wang, Bing Qin · (CogGPT - KwaiKEG) Star

  • The Impact of Reasoning Step Length on Large Language Models, arXiv, 2401.04925, arxiv, pdf, citation: -1

    Mingyu Jin, Qinkai Yu, Dong Shu, Haiyan Zhao, Wenyue Hua, Yanda Meng, Yongfeng Zhang, Mengnan Du

    · (jiqizhixin)

  • Reasons to Reject? Aligning Language Models with Judgments, arXiv, 2312.14591, arxiv, pdf, cication: -1

    Weiwen Xu, Deng Cai, Zhisong Zhang, Wai Lam, Shuming Shi

  • The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction, arXiv, 2312.13558, arxiv, pdf, cication: -1

    Pratyusha Sharma, Jordan T. Ash, Dipendra Misra

    · (pratyushasharma.github) · (jiqizhixin)

  • Self-Evaluation Improves Selective Generation in Large Language Models, arXiv, 2312.09300, arxiv, pdf, cication: -1

    Jie Ren, Yao Zhao, Tu Vu, Peter J. Liu, Balaji Lakshminarayanan

  • Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation, arXiv, 2312.02439, arxiv, pdf, cication: -1

    Shanshan Zhong, Zhongzhan Huang, Shanghua Gao, Wushao Wen, Liang Lin, Marinka Zitnik, Pan Zhou · (clot - sail-sg) Star

  • PathFinder: Guided Search over Multi-Step Reasoning Paths, arXiv, 2312.05180, arxiv, pdf, cication: -1

    Olga Golovneva, Sean O'Brien, Ramakanth Pasunuru, Tianlu Wang, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

  • Training Chain-of-Thought via Latent-Variable Inference, arXiv, 2312.02179, arxiv, pdf, cication: -1

    Du Phan, Matthew D. Hoffman, David Dohan, Sholto Douglas, Tuan Anh Le, Aaron Parisi, Pavel Sountsov, Charles Sutton, Sharad Vikram, Rif A. Saurous

  • LLMs cannot find reasoning errors, but can correct them!, arXiv, 2311.08516, arxiv, pdf, cication: -1

    Gladys Tyen, Hassan Mansoor, Peter Chen, Tony Mak, Victor Cărbune · (BIG-Bench-Mistake - WHGTyen) Star · (jiqizhixin)

  • System 2 Attention (is something you might need too), arXiv, 2311.11829, arxiv, pdf, cication: -1

    Jason Weston, Sainbayar Sukhbaatar

  • Orca 2: Teaching Small Language Models How to Reason, arXiv, 2311.11045, arxiv, pdf, cication: -1

    Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agrawal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal

  • Thread of Thought Unraveling Chaotic Contexts, arXiv, 2311.08734, arxiv, pdf, cication: -1

    Yucheng Zhou, Xiubo Geng, Tao Shen, Chongyang Tao, Guodong Long, Jian-Guang Lou, Jianbing Shen

  • UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations, arXiv, 2311.08469, arxiv, pdf, cication: -1

    Wenting Zhao, Justin T Chiu, Jena D. Hwang, Faeze Brahman, Jack Hessel, Sanjiban Choudhury, Yejin Choi, Xiang Lorraine Li, Alane Suhr

  • The ART of LLM Refinement: Ask, Refine, and Trust, arXiv, 2311.07961, arxiv, pdf, cication: -1

    Kumar Shridhar, Koustuv Sinha, Andrew Cohen, Tianlu Wang, Ping Yu, Ram Pasunuru, Mrinmaya Sachan, Jason Weston, Asli Celikyilmaz

  • ADaPT: As-Needed Decomposition and Planning with Language Models, arXiv, 2311.05772, arxiv, pdf, cication: -1

    Archiki Prasad, Alexander Koller, Mareike Hartmann, Peter Clark, Ashish Sabharwal, Mohit Bansal, Tushar Khot

  • Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation, arXiv, 2311.04254, arxiv, pdf, cication: -1

    Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Wei Zhang, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

  • Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves, arXiv, 2311.04205, arxiv, pdf, cication: -1

    Yihe Deng, Weitong Zhang, Zixiang Chen, Quanquan Gu · (Rephrase-and-Respond - uclaml) Star

  • Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models, arXiv, 2310.06117, arxiv, pdf, cication: -1

    Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen, Heng-Tze Cheng, Ed H. Chi, Quoc V Le, Denny Zhou

  • Learning From Mistakes Makes LLM Better Reasoner, arXiv, 2310.20689, arxiv, pdf, cication: -1

    Shengnan An, Zexiong Ma, Zeqi Lin, Nanning Zheng, Jian-Guang Lou, Weizhu Chen

  • Branch-Solve-Merge Improves Large Language Model Evaluation and Generation, arXiv, 2310.15123, arxiv, pdf, cication: -1

    Swarnadeep Saha, Omer Levy, Asli Celikyilmaz, Mohit Bansal, Jason Weston, Xian Li

  • Democratizing Reasoning Ability: Tailored Learning from Large Language Model, arXiv, 2310.13332, arxiv, pdf, cication: -1

    Zhaoyang Wang, Shaohan Huang, Yuxuan Liu, Jiahai Wang, Minghui Song, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun

  • Teaching Language Models to Self-Improve through Interactive Demonstrations, arXiv, 2310.13522, arxiv, pdf, cication: -1

    Xiao Yu, Baolin Peng, Michel Galley, Jianfeng Gao, Zhou Yu

  • automix - automix-llm Star

    Mixing Language Models with Self-Verification and Meta-Verification

  • The Consensus Game: Language Model Generation via Equilibrium Search, arXiv, 2310.09139, arxiv, pdf, cication: -1

    Athul Paul Jacob, Yikang Shen, Gabriele Farina, Jacob Andreas · (qbitai)

  • When can transformers reason with abstract symbols?, arXiv, 2310.09753, arxiv, pdf, cication: -1

    Enric Boix-Adsera, Omid Saremi, Emmanuel Abbe, Samy Bengio, Etai Littwin, Joshua Susskind

  • Avalon's Game of Thoughts: Battle Against Deception through Recursive Contemplation, arXiv, 2310.01320, arxiv, pdf, cication: 2

    Shenzhi Wang, Chang Liu, Zilong Zheng, Siyuan Qi, Shuo Chen, Qisen Yang, Andrew Zhao, Chaofei Wang, Shiji Song, Gao Huang · (jiqizhixin)

  • Large Language Models can Learn Rules, arXiv, 2310.07064, arxiv, pdf, cication: -1

    Zhaocheng Zhu, Yuan Xue, Xinyun Chen, Denny Zhou, Jian Tang, Dale Schuurmans, Hanjun Dai · (mp.weixin.qq)

  • Large Language Models as Analogical Reasoners, arXiv, 2310.01714, arxiv, pdf, cication: -1

    Michihiro Yasunaga, Xinyun Chen, Yujia Li, Panupong Pasupat, Jure Leskovec, Percy Liang, Ed H. Chi, Denny Zhou

  • Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models, arXiv, 2310.04406, arxiv, pdf, cication: -1

    Andy Zhou, Kai Yan, Michal Shlapentokh-Rothman, Haohan Wang, Yu-Xiong Wang · (andyz245.github) · (LanguageAgentTreeSearch - andyz245) Star

  • Thought Propagation: An Analogical Approach to Complex Reasoning with Large Language Models, arXiv, 2310.03965, arxiv, pdf, cication: -1

    Junchi Yu, Ran He, Rex Ying · (mp.weixin.qq)

  • Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning, arXiv, 2310.03094, arxiv, pdf, cication: 1

    Murong Yue, Jie Zhao, Min Zhang, Liang Du, Ziyu Yao

  • Large Language Models Cannot Self-Correct Reasoning Yet, arXiv, 2310.01798, arxiv, pdf, cication: 5

    Jie Huang, Xinyun Chen, Swaroop Mishra, Huaixiu Steven Zheng, Adams Wei Yu, Xinying Song, Denny Zhou

  • Enable Language Models to Implicitly Learn Self-Improvement From Data, arXiv, 2310.00898, arxiv, pdf, cication: -1

    Ziqi Wang, Le Hou, Tianjian Lu, Yuexin Wu, Yunxuan Li, Hongkun Yu, Heng Ji

  • Evaluating Cognitive Maps and Planning in Large Language Models with CogEval, arXiv, 2309.15129, arxiv, pdf, cication: -1

    Ida Momennejad, Hosein Hasanbeig, Felipe Vieira, Hiteshi Sharma, Robert Osazuwa Ness, Nebojsa Jojic, Hamid Palangi, Jonathan Larson

  • reconcile - dinobby Star

  • DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines, arXiv, 2310.03714, arxiv, pdf, cication: 1

    Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam · (dspy - stanfordnlp) Star

  • Physics of Language Models: Part 3.2, Knowledge Manipulation, arXiv, 2309.14402, arxiv, pdf, cication: -1

    Zeyuan Allen-Zhu, Yuanzhi Li · (jiqizhixin) · (mp.weixin.qq)

  • Cumulative Reasoning with Large Language Models, arXiv, 2308.04371, arxiv, pdf, cication: 10

    Yifan Zhang, Jingqin Yang, Yang Yuan, Andrew Chi-Chih Yao · (qbitai)

  • SCREWS: A Modular Framework for Reasoning with Revisions, arXiv, 2309.13075, arxiv, pdf, cication: -1

    Kumar Shridhar, Harsh Jhamtani, Hao Fang, Benjamin Van Durme, Jason Eisner, Patrick Xia

  • Contrastive Decoding Improves Reasoning in Large Language Models, arXiv, 2309.09117, arxiv, pdf, cication: -1

    Sean O'Brien, Mike Lewis

  • Compositional Foundation Models for Hierarchical Planning, arXiv, 2309.08587, arxiv, pdf, cication: 1

    Anurag Ajay, Seungwook Han, Yilun Du, Shuang Li, Abhi Gupta, Tommi Jaakkola, Josh Tenenbaum, Leslie Kaelbling, Akash Srivastava, Pulkit Agrawal

  • Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration, arXiv, 2307.05300, arxiv, pdf, cication: 25

    Zhenhailong Wang, Shaoguang Mao, Wenshan Wu, Tao Ge, Furu Wei, Heng Ji · (solo-performance-prompting - mikewangwzhl) Star

  • SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning, arXiv, 2308.00436, arxiv, pdf, cication: 6

    Ning Miao, Yee Whye Teh, Tom Rainforth

  • Measuring Faithfulness in Chain-of-Thought Reasoning, arXiv, 2307.13702, arxiv, pdf, cication: 5

    Tamera Lanham, Anna Chen, Ansh Radhakrishnan, Benoit Steiner, Carson Denison, Danny Hernandez, Dustin Li, Esin Durmus, Evan Hubinger, Jackson Kernion

  • Enhancing Document-level Event Argument Extraction with Contextual Clues and Role Relevance - ACL Anthology

    · (SCPRG-master - LWL-cpu) Star

  • Question Decomposition Improves the Faithfulness of Model-Generated Reasoning, arXiv, 2307.11768, arxiv, pdf, cication: 6

    Ansh Radhakrishnan, Karina Nguyen, Anna Chen, Carol Chen, Carson Denison, Danny Hernandez, Esin Durmus, Evan Hubinger, Jackson Kernion, Kamilė Lukošiūtė

  • Promoting Exploration in Memory-Augmented Adam using Critical Momenta, arXiv, 2307.09638, arxiv, pdf, cication: -1

    Pranshu Malviya, Gonçalo Mordido, Aristide Baratin, Reza Babanezhad Harikandeh, Jerry Huang, Simon Lacoste-Julien, Razvan Pascanu, Sarath Chandar

  • Does Visual Pretraining Help End-to-End Reasoning?, arXiv, 2307.08506, arxiv, pdf, cication: -1

    Chen Sun, Calvin Luo, Xingyi Zhou, Anurag Arnab, Cordelia Schmid

  • Self-consistency for open-ended generations, arXiv, 2307.06857, arxiv, pdf, cication: 3

    Siddhartha Jain, Xiaofei Ma, Anoop Deoras, Bing Xiang

  • Curious Replay for Model-based Adaptation, arXiv, 2306.15934, arxiv, pdf, cication: -1

    Isaac Kauvar, Chris Doyle, Linqi Zhou, Nick Haber · (curiousreplay - AutonomousAgentsLab) Star · (mp.weixin.qq)

  • PokemonChat: Auditing ChatGPT for Pokémon Universe Knowledge, arXiv, 2306.03024, arxiv, pdf, cication: 1

    Laura Cabello, Jiaang Li, Ilias Chalkidis

  • Orca: Progressive Learning from Complex Explanation Traces of GPT-4, arXiv, 2306.02707, arxiv, pdf, cication: 32

    Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah · (mp.weixin.qq)

  • REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction, arXiv, 2306.15724, arxiv, pdf, cication: 13

    Zeyi Liu, Arpit Bahety, Shuran Song

  • From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought, arXiv, 2306.12672, arxiv, pdf, cication: 9

    Lionel Wong, Gabriel Grand, Alexander K. Lew, Noah D. Goodman, Vikash K. Mansinghka, Jacob Andreas, Joshua B. Tenenbaum

  • DERA: Enhancing Large Language Model Completions with Dialog-Enabled Resolving Agents, arXiv, 2303.17071, arxiv, pdf, cication: 20

    Varun Nair, Elliot Schumacher, Geoffrey Tso, Anitha Kannan

  • Toward Grounded Social Reasoning, arXiv, 2306.08651, arxiv, pdf, cication: 2

    Minae Kwon, Hengyuan Hu, Vivek Myers, Siddharth Karamcheti, Anca Dragan, Dorsa Sadigh

  • tart - hazyresearch Star

    TART: A plug-and-play Transformer module for task-agnostic reasoning

  • SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks, arXiv, 2305.17390, arxiv, pdf, cication: 7

    Bill Yuchen Lin, Yicheng Fu, Karina Yang, Prithviraj Ammanabrolu, Faeze Brahman, Shiyu Huang, Chandra Bhagavatula, Yejin Choi, Xiang Ren · (yuchenlin) · (jiqizhixin)

  • TART: A plug-and-play Transformer module for task-agnostic reasoning, arXiv, 2306.07536, arxiv, pdf, cication: -1

    Kush Bhatia, Avanika Narayan, Christopher De Sa, Christopher Ré

  • Can Large Language Models Infer Causation from Correlation?, arXiv, 2306.05836, arxiv, pdf, cication: 12

    Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona Diab, Bernhard Schölkopf

  • Certified Deductive Reasoning with Language Models, arXiv, 2306.04031, arxiv, pdf, cication: 2

    Gabriel Poesia, Kanishk Gandhi, Eric Zelikman, Noah D. Goodman

  • Triggering Multi-Hop Reasoning for Question Answering in Language Models using Soft Prompts and Random Walks, arXiv, 2306.04009, arxiv, pdf, cication: -1

    Kanishka Misra, Cicero Nogueira dos Santos, Siamak Shakeri

  • Thought-Cloning - ShengranHu Star

    Thought Cloning: Learning to Think while Acting by Imitating Human Thinking

  • PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning, arXiv, 2305.19472, arxiv, pdf, cication: 1

    Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Jena D. Hwang, Xiang Lorraine Li, Hirona J. Arai, Soumya Sanyal, Keisuke Sakaguchi, Xiang Ren, Yejin Choi

  • Think Before You Act: Decision Transformers with Internal Working Memory, arXiv, 2305.16338, arxiv, pdf, cication: 2

    Jikun Kang, Romain Laroche, Xindi Yuan, Adam Trischler, Xue Liu, Jie Fu

  • Improving Factuality and Reasoning in Language Models through Multiagent Debate, arXiv, 2305.14325, arxiv, pdf, cication: 52

    Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch · (composable-models.github)

  • LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond, arXiv, 2305.14540, arxiv, pdf, cication: 8

    Philippe Laban, Wojciech Kryściński, Divyansh Agarwal, Alexander R. Fabbri, Caiming Xiong, Shafiq Joty, Chien-Sheng Wu

  • Language Models of Code are Few-Shot Commonsense Learners, arXiv, 2210.07128, arxiv, pdf, cication: 54

    Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig

  • OPT-R: Exploring the Role of Explanations in Finetuning and Prompting for Reasoning Skills of Large Language Models, arXiv, 2305.12001, arxiv, pdf, cication: 2

    Badr AlKhamissi, Siddharth Verma, Ping Yu, Zhijing Jin, Asli Celikyilmaz, Mona Diab

  • Introspective Tips: Large Language Model for In-Context Decision Making, arXiv, 2305.11598, arxiv, pdf, cication: 3

    Liting Chen, Lu Wang, Hang Dong, Yali Du, Jie Yan, Fangkai Yang, Shuang Li, Pu Zhao, Si Qin, Saravan Rajmohan

  • CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing, arXiv, 2305.11738, arxiv, pdf, cication: 19

    Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Nan Duan, Weizhu Chen

  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models, arXiv, 2305.10601, arxiv, pdf, cication: 188

    Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan

  • Imitation versus Innovation: What children can do that large language and language-and-vision models cannot (yet)?, arXiv, 2305.07666, arxiv, pdf, cication: -1

    Eunice Yiu, Eliza Kosoy, Alison Gopnik · (mp.weixin.qq)

  • chameleon-llm - lupantech Star

  • Boosting Theory-of-Mind Performance in Large Language Models via Prompting, arXiv, 2304.11490, arxiv, pdf, cication: 9

    Shima Rahimi Moghaddam, Christopher J. Honey

  • Large Language Models Are Reasoning Teachers, arXiv, 2212.10071, arxiv, pdf, cication: 59

    Namgyu Ho, Laura Schmid, Se-Young Yun · (reasoning-teacher - itsnamgyu) Star

Other

Math reasoning

  • Large Language Models for Mathematical Reasoning: Progresses and Challenges, arXiv, 2402.00157, arxiv, pdf, cication: -1

    Janice Ahn, Rishu Verma, Renze Lou, Di Liu, Rui Zhang, Wenpeng Yin

  • Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math, arXiv, 2312.17120, arxiv, pdf, cication: -1

    Zengzhi Wang, Rui Xia, Pengfei Liu · (huggingface)

    · (jiqizhixin)

  • Mathematical discoveries from program search with large language models | Nature

    · (funsearch - google-deepmind) Star

  • Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent, arXiv, 2312.08926, arxiv, pdf, cication: -1

    Haoran Liao, Qinyi Du, Shaohua Hu, Hao He, Yanyan Xu, Jidong Tian, Yaohui Jin

  • TinyGSM: achieving >80% on GSM8k with small language models, arXiv, 2312.09241, arxiv, pdf, cication: -1

    Bingbin Liu, Sebastien Bubeck, Ronen Eldan, Janardhan Kulkarni, Yuanzhi Li, Anh Nguyen, Rachel Ward, Yi Zhang

  • FunSearch: Making new discoveries in mathematical sciences using Large Language Models - Google DeepMind

  • Mathematical Language Models: A Survey, arXiv, 2312.07622, arxiv, pdf, cication: -1

    Wentao Liu, Hanglei Hu, Jie Zhou, Yuyang Ding, Junsong Li, Jiayi Zeng, Mengliang He, Qin Chen, Bo Jiang, Aimin Zhou

  • LeanCopilot - lean-dojo Star

    LLMs as Copilots for Theorem Proving in Lean

  • Large Language Models for Mathematicians, arXiv, 2312.04556, arxiv, pdf, cication: -1

    Simon Frieder, Julius Berner, Philipp Petersen, Thomas Lukasiewicz

  • LEGO-Prover: Neural Theorem Proving with Growing Libraries, arXiv, 2310.00656, arxiv, pdf, cication: 1

    Haiming Wang, Huajian Xin, Chuanyang Zheng, Lin Li, Zhengying Liu, Qingxing Cao, Yinya Huang, Jing Xiong, Han Shi, Enze Xie · (jiqizhixin)

  • ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, arXiv, 2309.17452, arxiv, pdf, cication: 1

    Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen · (mp.weixin.qq)

  • Improving Large Language Model Fine-tuning for Solving Math Problems, arXiv, 2310.10047, arxiv, pdf, cication: -1

    Yixin Liu, Avi Singh, C. Daniel Freeman, John D. Co-Reyes, Peter J. Liu

  • Llemma: An Open Language Model For Mathematics, arXiv, 2310.10631, arxiv, pdf, cication: 2

    Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck

  • Query and Response Augmentation Cannot Help Out-of-domain Math Reasoning Generalization, arXiv, 2310.05506, arxiv, pdf, cication: 1

    Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou

  • MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning, arXiv, 2310.03731, arxiv, pdf, cication: -1

    Ke Wang, Houxing Ren, Aojun Zhou, Zimu Lu, Sichun Luo, Weikang Shi, Renrui Zhang, Linqi Song, Mingjie Zhan, Hongsheng Li · (qbitai)

  • ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving, arXiv, 2309.17452, arxiv, pdf, cication: 1

    Zhibin Gou, Zhihong Shao, Yeyun Gong, Yelong Shen, Yujiu Yang, Minlie Huang, Nan Duan, Weizhu Chen · (tora - microsoft) Star

  • MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models, arXiv, 2309.12284, arxiv, pdf, cication: 10

    Longhui Yu, Weisen Jiang, Han Shi, Jincheng Yu, Zhengying Liu, Yu Zhang, James T. Kwok, Zhenguo Li, Adrian Weller, Weiyang Liu · (jiqizhixin)

  • MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning, arXiv, 2309.05653, arxiv, pdf, cication: 4

    Xiang Yue, Xingwei Qu, Ge Zhang, Yao Fu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen · (jiqizhixin) · (mp.weixin.qq)

  • abel - GAIR-NLP Star

    SOTA open-source math LLM · (qbitai)

  • Large Language Model for Science: A Study on P vs. NP, arXiv, 2309.05689, arxiv, pdf, cication: 2

    Qingxiu Dong, Li Dong, Ke Xu, Guangyan Zhou, Yaru Hao, Zhifang Sui, Furu Wei · (jiqizhixin)

  • Large Language Models as Optimizers, arXiv, 2309.03409, arxiv, pdf, cication: 32

    Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen · (qbitai)

  • GPT Can Solve Mathematical Problems Without a Calculator, arXiv, 2309.03241, arxiv, pdf, cication: -1

    Zhen Yang, Ming Ding, Qingsong Lv, Zhihuan Jiang, Zehai He, Yuyi Guo, Jinfeng Bai, Jie Tang · (mathglm - thudm) Star

  • When Do Program-of-Thoughts Work for Reasoning?, arXiv, 2308.15452, arxiv, pdf, cication: 2

    Zhen Bi, Ningyu Zhang, Yinuo Jiang, Shumin Deng, Guozhou Zheng, Huajun Chen · (mp.weixin.qq)

  • Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification, arXiv, 2308.07921, arxiv, pdf, cication: 11

    Aojun Zhou, Ke Wang, Zimu Lu, Weikang Shi, Sichun Luo, Zipeng Qin, Shaoqing Lu, Anya Jia, Linqi Song, Mingjie Zhan · (qbitai) · (mp.weixin.qq)

  • Scaling Relationship on Learning Mathematical Reasoning with Large Language Models, arXiv, 2308.01825, arxiv, pdf, cication: 17

    Zheng Yuan, Hongyi Yuan, Chengpeng Li, Guanting Dong, Keming Lu, Chuanqi Tan, Chang Zhou, Jingren Zhou · (gsm8k-screl - ofa-sys) Star

  • Self-Supervised Learning with Lie Symmetries for Partial Differential Equations, arXiv, 2307.05432, arxiv, pdf, cication: 2

    Grégoire Mialon, Quentin Garrido, Hannah Lawrence, Danyal Rehman, Yann LeCun, Bobak T. Kiani

  • Teaching Arithmetic to Small Transformers, arXiv, 2307.03381, arxiv, pdf, cication: 6

    Nayoung Lee, Kartik Sreenivasan, Jason D. Lee, Kangwook Lee, Dimitris Papailiopoulos

  • Length Generalization in Arithmetic Transformers, arXiv, 2306.15400, arxiv, pdf, cication: 6

    Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton

  • Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks, arXiv, 2305.14201

    · (mp.weixin.qq) · (mp.weixin.qq)

  • An Empirical Study on Challenging Math Problem Solving with GPT-4, arXiv, 2306.01337, arxiv, pdf, cication: 7

    Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang

  • Evaluating Language Models for Mathematics through Interactions, arXiv, 2306.01694, arxiv, pdf, cication: 7

    Katherine M. Collins, Albert Q. Jiang, Simon Frieder, Lionel Wong, Miri Zilka, Umang Bhatt, Thomas Lukasiewicz, Yuhuai Wu, Joshua B. Tenenbaum, William Hart

  • Let's Verify Step by Step, arXiv, 2305.20050, arxiv, pdf, cication: 65

    Hunter Lightman, Vineet Kosaraju, Yura Burda, Harri Edwards, Bowen Baker, Teddy Lee, Jan Leike, John Schulman, Ilya Sutskever, Karl Cobbe · (openai) · (mp.weixin.qq)

  • GitHub - zwq2018/Multi-view-Consistency-for-MWP: EMNLP22: Multi-View Reasoning: Consistent Contrastive Learning for Math Word Problem

Benchmarks

  • MathVista: Evaluating Math Reasoning in Visual Contexts with GPT-4V, Bard, and Other Large Multimodal Models, arXiv, 2310.02255, arxiv, pdf, cication: 3

    Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao · (huggingface) · (mathvista.github)

Other

Self correction

  • Can Large Language Models Really Improve by Self-critiquing Their Own Plans?, arXiv, 2310.08118, arxiv, pdf, cication: 1

    Karthik Valmeekam, Matthew Marquez, Subbarao Kambhampati · (jiqizhixin)

  • Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection, arXiv, 2310.11511, arxiv, pdf, cication: -1

    Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi

  • GPT-4 Doesn't Know It's Wrong: An Analysis of Iterative Prompting for Reasoning Problems, arXiv, 2310.12397, arxiv, pdf, cication: 2

    Kaya Stechly, Matthew Marquez, Subbarao Kambhampati · (mp.weixin.qq)

  • Shepherd: A Critic for Language Model Generation, arXiv, 2308.04592, arxiv, pdf, cication: 6

    Tianlu Wang, Ping Yu, Xiaoqing Ellen Tan, Sean O'Brien, Ramakanth Pasunuru, Jane Dwivedi-Yu, Olga Golovneva, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz · (shepherd - facebookresearch) Star · (mp.weixin.qq)

Prompting

  • LangGPT - EmbraceAGI Star

    LangGPT: Empowering everyone to become a prompt expert! 🚀 Structured Prompt, Language of GPT

  • Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding, arXiv, 2401.12954, arxiv, pdf, cication: -1

    Mirac Suzgun, Adam Tauman Kalai

    · (mp.weixin.qq)

  • Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4, arXiv, 2312.16171, arxiv, pdf, cication: -1

    Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen

  • promptbase - microsoft Star

    All things prompt engineering

  • plum - research4pan Star

    Prompt Learning using Metaheuristics

  • Contrastive Chain-of-Thought Prompting, arXiv, 2311.09277, arxiv, pdf, cication: -1

    Yew Ken Chia, Guizhen Chen, Luu Anh Tuan, Soujanya Poria, Lidong Bing

  • Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster, arXiv, 2311.08263, arxiv, pdf, cication: -1

    Hongxuan Zhang, Zhining Liu, Jiaqi Zheng, Chenyi Zhuang, Jinjie Gu, Guihai Chen

  • Prompt Engineering a Prompt Engineer, arXiv, 2311.05661, arxiv, pdf, cication: -1

    Qinyuan Ye, Maxamed Axmed, Reid Pryzant, Fereshte Khani

  • TopicGPT: A Prompt-based Topic Modeling Framework, arXiv, 2311.01449, arxiv, pdf, cication: -1

    Chau Minh Pham, Alexander Hoyle, Simeng Sun, Mohit Iyyer · (topicgpt - chtmp223) Star

  • Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting, arXiv, 2310.11324, arxiv, pdf, cication: -1

    Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr

  • Eliciting Human Preferences with Language Models, arXiv, 2310.11589, arxiv, pdf, cication: -1

    Belinda Z. Li, Alex Tamkin, Noah Goodman, Jacob Andreas · (qbitai)

  • Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models, arXiv, 2310.06692, arxiv, pdf, cication: -1

    Anni Zou, Zhuosheng Zhang, Hai Zhao, Xiangru Tang

  • ChatGPT-AutoExpert - spdustin Star

    🚀🧠💬 Supercharged Custom Instructions for ChatGPT (non-coding) and ChatGPT Advanced Data Analysis (coding).

  • VPA: Fully Test-Time Visual Prompt Adaptation, Proceedings of the 31st ACM International Conference on Multimedia, 2023, arxiv, pdf, cication: -1

    Jiachen Sun, Mark Ibrahim, Melissa Hall, Ivan Evtimov, Z. Morley Mao, Cristian Canton Ferrer, Caner Hazirbas

  • Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers, arXiv, 2309.08532, arxiv, pdf, cication: 8

    Qingyan Guo, Rui Wang, Junliang Guo, Bei Li, Kaitao Song, Xu Tan, Guoqing Liu, Jiang Bian, Yujiu Yang · (mp.weixin.qq)

  • From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting, arXiv, 2309.04269, arxiv, pdf, cication: 1

    Griffin Adams, Alexander Fabbri, Faisal Ladhak, Eric Lehman, Noémie Elhadad · (huggingface)

  • Structured Chain-of-Thought Prompting for Code Generation, arXiv, 2305.06599, arxiv, pdf, cication: 6

    Jia Li, Ge Li, Yongmin Li, Zhi Jin

  • Large Language Models Understand and Can be Enhanced by Emotional Stimuli, arXiv, 2307.11760, arxiv, pdf, cication: 6

    Cheng Li, Jindong Wang, Yixuan Zhang, Kaijie Zhu, Wenxin Hou, Jianxun Lian, Fang Luo, Qiang Yang, Xing Xie

  • thor-isa - scofield7419 Star

    Codes for ACL 2023 paper: Reasoning Implicit Sentiment with Chain-of-Thought Prompting

  • awesome-chatgpt-prompts-zh - PlexPt Star

    A Chinese-language guide to tuning ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you ask.

  • InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models, arXiv, 2306.03082, arxiv, pdf, cication: 8

    Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou · (instructzero - lichang-chen) Star

  • PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts, arXiv, 2306.04528, arxiv, pdf, cication: 32

    Kaijie Zhu, Jindong Wang, Jiaheng Zhou, Zichen Wang, Hao Chen, Yidong Wang, Linyi Yang, Wei Ye, Yue Zhang, Neil Zhenqiang Gong

  • Deductive Verification of Chain-of-Thought Reasoning, arXiv, 2306.03872, arxiv, pdf, cication: 10

    Zhan Ling, Yunhao Fang, Xuanlin Li, Zhiao Huang, Mingu Lee, Roland Memisevic, Hao Su

  • Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance, arXiv, 2305.17306, arxiv, pdf, cication: 3

    Yao Fu, Litu Ou, Mingyu Chen, Yuhao Wan, Hao Peng, Tushar Khot · (mp.weixin.qq)

  • Graph of Thoughts: Solving Elaborate Problems with Large Language Models, arXiv, 2308.09687, arxiv, pdf, cication: 32

    Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Michal Podstawski, Hubert Niewiadomski, Piotr Nyczyk · (graph-of-thoughts - spcl) Star · (jiqizhixin)

Other

In context learning

  • Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers, arXiv, 2401.01974, arxiv, pdf, cication: -1

    Aleksandar Stanić, Sergi Caelles, Michael Tschannen

  • Supervised Knowledge Makes Large Language Models Better In-context Learners, arXiv, 2312.15918, arxiv, pdf, cication: -1

    Linyi Yang, Shuibai Zhang, Zhuohao Yu, Guangsheng Bao, Yidong Wang, Jindong Wang, Ruochen Xu, Wei Ye, Xing Xie, Weizhu Chen

  • In-Context Learning with Iterative Demonstration Selection, arXiv, 2310.09881, arxiv, pdf, cication: -1

    Chengwei Qin, Aston Zhang, Anirudh Dagar, Wenming Ye · (mp.weixin.qq)

  • In-Context Learning Creates Task Vectors, arXiv, 2310.15916, arxiv, pdf, cication: 1

    Roee Hendel, Mor Geva, Amir Globerson

  • Ambiguity-Aware In-Context Learning with Large Language Models, arXiv, 2309.07900, arxiv, pdf, cication: -1

    Lingyu Gao, Aditi Chaudhary, Krishna Srinivasan, Kazuma Hashimoto, Karthik Raman, Michael Bendersky

  • FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning, arXiv, 2309.04663, arxiv, pdf, cication: -1

    Xinyi Wang, John Wieting, Jonathan H. Clark

  • RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models, arXiv, 2308.07922, arxiv, pdf, cication: 1

    Jie Huang, Wei Ping, Peng Xu, Mohammad Shoeybi, Kevin Chen-Chuan Chang, Bryan Catanzaro

  • CausalLM is not optimal for in-context learning, arXiv, 2308.06912, arxiv, pdf, cication: 2

    Nan Ding, Tomer Levinboim, Jialin Wu, Sebastian Goodman, Radu Soricut

  • FLIRT: Feedback Loop In-context Red Teaming, arXiv, 2308.04265, arxiv, pdf, cication: 3

    Ninareh Mehrabi, Palash Goyal, Christophe Dupuy, Qian Hu, Shalini Ghosh, Richard Zemel, Kai-Wei Chang, Aram Galstyan, Rahul Gupta

  • Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models, arXiv, 2308.00304, arxiv, pdf, cication: 1

    Jiaao Chen, Xiaoman Pan, Dian Yu, Kaiqiang Song, Xiaoyang Wang, Dong Yu, Jianshu Chen

  • Large Language Models as General Pattern Machines, arXiv, 2307.04721, arxiv, pdf, cication: 23

    Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng

  • Trained Transformers Learn Linear Models In-Context, arXiv, 2306.09927, arxiv, pdf, cication: 26

    Ruiqi Zhang, Spencer Frei, Peter L. Bartlett · (jiqizhixin)

  • Learning to Retrieve In-Context Examples for Large Language Models, arXiv, 2307.07164, arxiv, pdf, cication: 1

    Liang Wang, Nan Yang, Furu Wei

  • Understanding In-Context Learning via Supportive Pretraining Data, arXiv, 2306.15091, arxiv, pdf, cication: 5

    Xiaochuang Han, Daniel Simig, Todor Mihaylov, Yulia Tsvetkov, Asli Celikyilmaz, Tianlu Wang

Other

Knowledge graph

  • Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph, arXiv, 2307.07697, arxiv, pdf, cication: 9

    Jiashuo Sun, Chengjin Xu, Lumingyuan Tang, Saizhuo Wang, Chen Lin, Yeyun Gong, Lionel M. Ni, Heung-Yeung Shum, Jian Guo · (ToG - IDEA-FinAI) Star · (mp.weixin.qq)

  • Unifying Large Language Models and Knowledge Graphs: A Roadmap, arXiv, 2306.08302, arxiv, pdf, cication: 53

    Shirui Pan, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, Xindong Wu

    · (jiqizhixin)

  • KoLA: Carefully Benchmarking World Knowledge of Large Language Models, arXiv, 2306.09296, arxiv, pdf, cication: 14

    Jifan Yu, Xiaozhi Wang, Shangqing Tu, Shulin Cao, Daniel Zhang-Li, Xin Lv, Hao Peng, Zijun Yao, Xiaohan Zhang, Hanming Li

Other

Tutorials

Extra reference