
Awesome-robotics-llm

Papers

  • Generative Expressive Robot Behaviors using Large Language Models, arXiv, 2401.14673, arxiv, pdf, citation: -1

    Karthik Mahadevan, Jonathan Chien, Noah Brown, Zhuo Xu, Carolina Parada, Fei Xia, Andy Zeng, Leila Takayama, Dorsa Sadigh · (generative-expressive-motion.github)

  • Adaptive Mobile Manipulation for Articulated Objects In the Open World, arXiv, 2401.14403, arxiv, pdf, citation: -1

    Haoyu Xiong, Russell Mendonca, Kenneth Shaw, Deepak Pathak · (open-world-mobilemanip.github)

  • OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics, arXiv, 2401.12202, arxiv, pdf, citation: -1

    Peiqi Liu, Yaswanth Orru, Chris Paxton, Nur Muhammad Mahi Shafiullah, Lerrel Pinto · (ok-robot.github)

  • AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

    · (auto-rt.github)

  • Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis, arXiv, 2312.08782, arxiv, pdf, citation: -1

    Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao

  • ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent, arXiv, 2312.10003, arxiv, pdf, citation: -1

    Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan

  • Vision-Language Models as a Source of Rewards, arXiv, 2312.09187, arxiv, pdf, citation: -1

    Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin

  • Foundation Models in Robotics: Applications, Challenges, and the Future, arXiv, 2312.07843, arxiv, pdf, citation: -1

    Roya Firoozi, Johnathan Tucker, Stephen Tian, Anirudha Majumdar, Jiankai Sun, Weiyu Liu, Yuke Zhu, Shuran Song, Ashish Kapoor, Karol Hausman

    · (Awesome-Robotics-Foundation-Models - robotics-survey)

  • SAGE: Bridging Semantic and Actionable Parts for GEneralizable Articulated-Object Manipulation under Language Instructions, arXiv, 2312.01307, arxiv, pdf, citation: -1

    Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, Leonidas Guibas · (geometry.stanford) · (SAGE - geng-haoran)

  • Agent as Cerebrum, Controller as Cerebellum: Implementing an Embodied LMM-based Agent on Drones, arXiv, 2311.15033, arxiv, pdf, citation: -1

    Haoran Zhao, Fengxing Pan, Huqiuyue Ping, Yaoming Zhou · (qbitai)

  • From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3", arXiv, 2312.06571, arxiv, pdf, citation: -1

    Takahide Yoshida, Atsushi Masumori, Takashi Ikegami · (tnoinkwms.github)

  • Controllable Human-Object Interaction Synthesis, arXiv, 2312.03913, arxiv, pdf, citation: -1

    Jiaman Li, Alexander Clegg, Roozbeh Mottaghi, Jiajun Wu, Xavier Puig, C. Karen Liu

  • Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia, arXiv, 2312.03664, arxiv, pdf, citation: -1

    Alexander Sasha Vezhnevets, John P. Agapiou, Avia Aharon, Ron Ziv, Jayd Matyas, Edgar A. Duéñez-Guzmán, William A. Cunningham, Simon Osindero, Danny Karmon, Joel Z. Leibo

    · (concordia - google-deepmind)

  • Vision-Language Foundation Models as Effective Robot Imitators, arXiv, 2311.01378, arxiv, pdf, citation: -1

    Xinghang Li, Minghuan Liu, Hanbo Zhang, Cunjun Yu, Jie Xu, Hongtao Wu, Chilam Cheang, Ya Jing, Weinan Zhang, Huaping Liu · (RoboFlamingo - RoboFlamingo) · (jiqizhixin)

  • GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration, arXiv, 2311.12015, arxiv, pdf, citation: -1

    Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi · (microsoft.github)

    · (jiqizhixin)

  • Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections, arXiv, 2311.10678, arxiv, pdf, citation: -1

    Lihan Zha, Yuchen Cui, Li-Heng Lin, Minae Kwon, Montserrat Gonzalez Arenas, Andy Zeng, Fei Xia, Dorsa Sadigh

  • GOAT: GO to Any Thing, arXiv, 2311.06430, arxiv, pdf, citation: -1

    Matthew Chang, Theophile Gervet, Mukul Khanna, Sriram Yenamandra, Dhruv Shah, So Yeon Min, Kavit Shah, Chris Paxton, Saurabh Gupta, Dhruv Batra

  • LLaMA Rider: Spurring Large Language Models to Explore the Open World, arXiv, 2310.08922, arxiv, pdf, citation: -1

    Yicheng Feng, Yuxuan Wang, Jiazheng Liu, Sipeng Zheng, Zongqing Lu · (jiqizhixin)

  • RoboVQA: Multimodal Long-Horizon Reasoning for Robotics, arXiv, 2311.00899, arxiv, pdf, citation: -1

    Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi

  • Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning, arXiv, 2310.20587, arxiv, pdf, citation: -1

    Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu

  • Large Language Models as Generalizable Policies for Embodied Tasks, arXiv, 2310.17722, arxiv, pdf, citation: -1

    Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev

  • Creative Robot Tool Use with Large Language Models, arXiv, 2310.13065, arxiv, pdf, citation: -1

    Mengdi Xu, Peide Huang, Wenhao Yu, Shiqi Liu, Xilun Zhang, Yaru Niu, Tingnan Zhang, Fei Xia, Jie Tan, Ding Zhao

  • Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning, arXiv, 2310.12921, arxiv, pdf, citation: -1

    Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, David Lindner

  • Eureka: Human-Level Reward Design via Coding Large Language Models, arXiv, 2310.12931, arxiv, pdf, citation: 1

    Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar · (Eureka - eureka-research) · (qbitai)

  • Interactive Task Planning with Language Models, arXiv, 2310.10645, arxiv, pdf, citation: -1

    Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik

  • Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control, arXiv, 2307.00117, arxiv, pdf, citation: 3

    Vivek Myers, Andre He, Kuan Fang, Homer Walke, Philippe Hansen-Estruch, Ching-An Cheng, Mihai Jalobeanu, Andrey Kolobov, Anca Dragan, Sergey Levine · (bair.berkeley)

  • Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency, arXiv, 2309.17382, arxiv, pdf, citation: -1

    Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang · (RAFA_code - agentification)

  • Video Language Planning, arXiv, 2310.10625, arxiv, pdf, citation: -1

    Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum

  • FireAct: Toward Language Agent Fine-tuning, arXiv, 2310.05915, arxiv, pdf, citation: -1

    Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik Narasimhan, Shunyu Yao

  • LangNav: Language as a Perceptual Representation for Navigation, arXiv, 2310.07889, arxiv, pdf, citation: -1

    Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim

  • Learning Interactive Real-World Simulators, arXiv, 2310.06114, arxiv, pdf, citation: -1

    Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Dale Schuurmans, Pieter Abbeel · (universal-simulator.github)

  • GenSim: Generating Robotic Simulation Tasks via Large Language Models, arXiv, 2310.01361, arxiv, pdf, citation: -1

    Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang

  • A Data Source for Reasoning Embodied Agents, AAAI, 2023, arxiv, pdf, citation: 1

    Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam

  • Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping, CoRL, 2023, arxiv, pdf, citation: 3

    Adam Rashid, Satvik Sharma, Chung Min Kim, Justin Kerr, Lawrence Chen, Angjoo Kanazawa, Ken Goldberg

  • Thought Cloning: Learning to Think while Acting by Imitating Human Thinking, arXiv, 2306.00323, arxiv, pdf, citation: 4

    Shengran Hu, Jeff Clune

  • Physically Grounded Vision-Language Models for Robotic Manipulation, arXiv, 2309.02561, arxiv, pdf, citation: 1

    Jensen Gao, Bidipta Sarkar, Fei Xia, Ted Xiao, Jiajun Wu, Brian Ichter, Anirudha Majumdar, Dorsa Sadigh

  • BEVBert: Multimodal Map Pre-training for Language-guided Navigation, arXiv, 2212.04385, arxiv, pdf, citation: 11

    Dong An, Yuankai Qi, Yangguang Li, Yan Huang, Liang Wang, Tieniu Tan, Jing Shao · (vln-bevbert - marsaki)

  • Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation, arXiv, 2308.07931, arxiv, pdf, citation: 3

    William Shen, Ge Yang, Alan Yu, Jansen Wong, Leslie Pack Kaelbling, Phillip Isola · (qbitai) · (jiqizhixin)

  • Foundation Model based Open Vocabulary Task Planning and Executive System for General Purpose Service Robots, arXiv, 2308.03357, arxiv, pdf, citation: 1

    Yoshiki Obinata, Naoaki Kanazawa, Kento Kawaharazuka, Iori Yanokura, Soonhyo Kim, Kei Okada, Masayuki Inaba

  • Learning to Model the World with Language, arXiv, 2308.01399, arxiv, pdf, citation: 4

    Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan · (mp.weixin.qq)

  • Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI, arXiv, 2308.05221, arxiv, pdf, citation: 1

    Hangjie Shi, Leslie Ball, Govind Thattai, Desheng Zhang, Lucy Hu, Qiaozi Gao, Suhaila Shakiah, Xiaofeng Gao, Aishwarya Padmakumar, Bofei Yang

  • Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition, arXiv, 2307.14535, arxiv, pdf, citation: 9

    Huy Ha, Pete Florence, Shuran Song

  • RT-2: Vision-Language-Action Models

    · (qbitai) · (robotics-transformer2.github)

  • Towards A Unified Agent with Foundation Models, Workshop on Reincarnating Reinforcement Learning at ICLR 2023, arxiv, pdf, citation: 9

    Norman Di Palo, Arunkumar Byravan, Leonard Hasenclever, Markus Wulfmeier, Nicolas Heess, Martin Riedmiller

  • SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning, arXiv, 2307.06135, arxiv, pdf, citation: 13

    Krishan Rana, Jesse Haviland, Sourav Garg, Jad Abou-Chakra, Ian Reid, Niko Suenderhauf

  • VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models, arXiv, 2307.05973, arxiv, pdf, citation: 35

    Wenlong Huang, Chen Wang, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Li Fei-Fei · (voxposer.github) · (mp.weixin.qq)

  • Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation, arXiv, 2307.03659, arxiv, pdf, citation: 3

    Annie Xie, Lisa Lee, Ted Xiao, Chelsea Finn

  • Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners, arXiv, 2307.01928, arxiv, pdf, citation: 24

    Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley

  • Building Cooperative Embodied Agents Modularly with Large Language Models, arXiv, 2307.02485, arxiv, pdf, citation: 5

    Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan

  • ChatGPT for Robotics: Design Principles and Model Abilities, Microsoft Auton. Syst. Robot. Res., 2023, arxiv, pdf, citation: 111

    Sai Vemprala, Rogerio Bonatti, Arthur Bucker, Ashish Kapoor · (PromptCraft-Robotics - microsoft)

  • Statler: State-Maintaining Language Models for Embodied Reasoning, arXiv, 2306.17840, arxiv, pdf, citation: 5

    Takuma Yoneda, Jiading Fang, Peng Li, Huanyu Zhang, Tianchong Jiang, Shengjie Lin, Ben Picker, David Yunis, Hongyuan Mei, Matthew R. Walter

  • REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction, arXiv, 2306.15724, arxiv, pdf, citation: 13

    Zeyi Liu, Arpit Bahety, Shuran Song

  • ViNT: A Foundation Model for Visual Navigation, arXiv, 2306.14846, arxiv, pdf, citation: 10

    Dhruv Shah, Ajay Sridhar, Nitish Dashora, Kyle Stachowicz, Kevin Black, Noriaki Hirose, Sergey Levine · (visualnav-transformer.github)

  • HomeRobot: Open-Vocabulary Mobile Manipulation, arXiv, 2306.11565, arxiv, pdf, citation: 5

    Sriram Yenamandra, Arun Ramachandran, Karmesh Yadav, Austin Wang, Mukul Khanna, Theophile Gervet, Tsung-Yen Yang, Vidhi Jain, Alexander William Clegg, John Turner

  • Language to Rewards for Robotic Skill Synthesis, arXiv, 2306.08647, arxiv, pdf, citation: 27

    Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik

  • SayTap: Language to Quadrupedal Locomotion, arXiv, 2306.07580, arxiv, pdf, citation: 4

    Yujin Tang, Wenhao Yu, Jie Tan, Heiga Zen, Aleksandra Faust, Tatsuya Harada

  • ChessGPT: Bridging Policy Learning and Language Modeling, arXiv, 2306.09200, arxiv, pdf, citation: 2

    Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, Jun Wang

  • Embodied Executable Policy Learning with Language-based Scene Summarization, arXiv, 2306.05696, arxiv, pdf, citation: -1

    Jielin Qiu, Mengdi Xu, William Han, Seungwhan Moon, Ding Zhao

  • GPT Models Meet Robotic Applications: Co-Speech Gesturing Chat System, arXiv, 2306.01741, arxiv, pdf, citation: -1

    Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi

  • Reflexion: Language Agents with Verbal Reinforcement Learning, arXiv, 2303.11366, arxiv, pdf, citation: 110

    Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao

  • ReAct: Synergizing Reasoning and Acting in Language Models, arXiv, 2210.03629, arxiv, pdf, citation: 293

    Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao

Projects

  • Co-LLM-Agents - UMass-Foundation-Model

    Source code for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" · (qbitai)
