-
Generative Expressive Robot Behaviors using Large Language Models, arXiv, 2401.14673, arxiv, pdf, cication: -1
Karthik Mahadevan, Jonathan Chien, Noah Brown, Zhuo Xu, Carolina Parada, Fei Xia, Andy Zeng, Leila Takayama, Dorsa Sadigh · [generative-expressive-motion.github]
-
Adaptive Mobile Manipulation for Articulated Objects In the Open World, arXiv, 2401.14403, arxiv, pdf, cication: -1
Haoyu Xiong, Russell Mendonca, Kenneth Shaw, Deepak Pathak · [open-world-mobilemanip.github]
-
OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics, arXiv, 2401.12202, arxiv, pdf, cication: -1
Peiqi Liu, Yaswanth Orru, Chris Paxton, Nur Muhammad Mahi Shafiullah, Lerrel Pinto · [ok-robot.github]
-
AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents · [auto-rt.github]
-
Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis, arXiv, 2312.08782, arxiv, pdf, cication: -1
Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao
-
ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent, arXiv, 2312.10003, arxiv, pdf, cication: -1
Renat Aksitov, Sobhan Miryoosefi, Zonglin Li, Daliang Li, Sheila Babayan, Kavya Kopparapu, Zachary Fisher, Ruiqi Guo, Sushant Prakash, Pranesh Srinivasan
-
Vision-Language Models as a Source of Rewards, arXiv, 2312.09187, arxiv, pdf, cication: -1
Kate Baumli, Satinder Baveja, Feryal Behbahani, Harris Chan, Gheorghe Comanici, Sebastian Flennerhag, Maxime Gazeau, Kristian Holsheimer, Dan Horgan, Michael Laskin
-
Foundation Models in Robotics: Applications, Challenges, and the Future, arXiv, 2312.07843, arxiv, pdf, cication: -1
Roya Firoozi, Johnathan Tucker, Stephen Tian, Anirudha Majumdar, Jiankai Sun, Weiyu Liu, Yuke Zhu, Shuran Song, Ashish Kapoor, Karol Hausman · [Awesome-Robotics-Foundation-Models - robotics-survey]
-
SAGE: Bridging Semantic and Actionable Parts for GEneralizable Articulated-Object Manipulation under Language Instructions, arXiv, 2312.01307, arxiv, pdf, cication: -1
Haoran Geng, Songlin Wei, Congyue Deng, Bokui Shen, He Wang, Leonidas Guibas · [geometry.stanford] · [SAGE - geng-haoran]
-
Agent as Cerebrum, Controller as Cerebellum: Implementing an Embodied LMM-based Agent on Drones, arXiv, 2311.15033, arxiv, pdf, cication: -1
Haoran Zhao, Fengxing Pan, Huqiuyue Ping, Yaoming Zhou · [qbitai]
-
From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3", arXiv, 2312.06571, arxiv, pdf, cication: -1
Takahide Yoshida, Atsushi Masumori, Takashi Ikegami · [tnoinkwms.github]
-
Controllable Human-Object Interaction Synthesis, arXiv, 2312.03913, arxiv, pdf, cication: -1
Jiaman Li, Alexander Clegg, Roozbeh Mottaghi, Jiajun Wu, Xavier Puig, C. Karen Liu
-
Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia, arXiv, 2312.03664, arxiv, pdf, cication: -1
Alexander Sasha Vezhnevets, John P. Agapiou, Avia Aharon, Ron Ziv, Jayd Matyas, Edgar A. Duéñez-Guzmán, William A. Cunningham, Simon Osindero, Danny Karmon, Joel Z. Leibo · [concordia - google-deepmind]
-
Vision-Language Foundation Models as Effective Robot Imitators, arXiv, 2311.01378, arxiv, pdf, cication: -1
Xinghang Li, Minghuan Liu, Hanbo Zhang, Cunjun Yu, Jie Xu, Hongtao Wu, Chilam Cheang, Ya Jing, Weinan Zhang, Huaping Liu · [RoboFlamingo - RoboFlamingo] · [jiqizhixin]
-
GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration, arXiv, 2311.12015, arxiv, pdf, cication: -1
Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi · [microsoft.github] · [jiqizhixin]
-
Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections, arXiv, 2311.10678, arxiv, pdf, cication: -1
Lihan Zha, Yuchen Cui, Li-Heng Lin, Minae Kwon, Montserrat Gonzalez Arenas, Andy Zeng, Fei Xia, Dorsa Sadigh
-
GOAT: GO to Any Thing, arXiv, 2311.06430, arxiv, pdf, cication: -1
Matthew Chang, Theophile Gervet, Mukul Khanna, Sriram Yenamandra, Dhruv Shah, So Yeon Min, Kavit Shah, Chris Paxton, Saurabh Gupta, Dhruv Batra
-
LLaMA Rider: Spurring Large Language Models to Explore the Open World, arXiv, 2310.08922, arxiv, pdf, cication: -1
Yicheng Feng, Yuxuan Wang, Jiazheng Liu, Sipeng Zheng, Zongqing Lu · [jiqizhixin]
-
RoboVQA: Multimodal Long-Horizon Reasoning for Robotics, arXiv, 2311.00899, arxiv, pdf, cication: -1
Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil J Joshi
-
Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning, arXiv, 2310.20587, arxiv, pdf, cication: -1
Ruizhe Shi, Yuyao Liu, Yanjie Ze, Simon S. Du, Huazhe Xu
-
Large Language Models as Generalizable Policies for Embodied Tasks, arXiv, 2310.17722, arxiv, pdf, cication: -1
Andrew Szot, Max Schwarzer, Harsh Agrawal, Bogdan Mazoure, Walter Talbott, Katherine Metcalf, Natalie Mackraz, Devon Hjelm, Alexander Toshev
-
Creative Robot Tool Use with Large Language Models, arXiv, 2310.13065, arxiv, pdf, cication: -1
Mengdi Xu, Peide Huang, Wenhao Yu, Shiqi Liu, Xilun Zhang, Yaru Niu, Tingnan Zhang, Fei Xia, Jie Tan, Ding Zhao
-
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning, arXiv, 2310.12921, arxiv, pdf, cication: -1
Juan Rocamonde, Victoriano Montesinos, Elvis Nava, Ethan Perez, David Lindner
-
Eureka: Human-Level Reward Design via Coding Large Language Models, arXiv, 2310.12931, arxiv, pdf, cication: 1
Yecheng Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi Fan, Anima Anandkumar · [Eureka - eureka-research]
· [qbitai]
-
Interactive Task Planning with Language Models, arXiv, 2310.10645, arxiv, pdf, cication: -1
Boyi Li, Philipp Wu, Pieter Abbeel, Jitendra Malik
-
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control, arXiv, 2307.00117, arxiv, pdf, cication: 3
Vivek Myers, Andre He, Kuan Fang, Homer Walke, Philippe Hansen-Estruch, Ching-An Cheng, Mihai Jalobeanu, Andrey Kolobov, Anca Dragan, Sergey Levine · [bair.berkeley]
-
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency, arXiv, 2309.17382, arxiv, pdf, cication: -1
Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang · [RAFA_code - agentification]
-
Video Language Planning, arXiv, 2310.10625, arxiv, pdf, cication: -1
Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum
-
FireAct: Toward Language Agent Fine-tuning, arXiv, 2310.05915, arxiv, pdf, cication: -1
Baian Chen, Chang Shu, Ehsan Shareghi, Nigel Collier, Karthik Narasimhan, Shunyu Yao
-
LangNav: Language as a Perceptual Representation for Navigation, arXiv, 2310.07889, arxiv, pdf, cication: -1
Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim
-
Learning Interactive Real-World Simulators, arXiv, 2310.06114, arxiv, pdf, cication: -1
Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Dale Schuurmans, Pieter Abbeel · [universal-simulator.github]
-
GenSim: Generating Robotic Simulation Tasks via Large Language Models, arXiv, 2310.01361, arxiv, pdf, cication: -1
Lirui Wang, Yiyang Ling, Zhecheng Yuan, Mohit Shridhar, Chen Bao, Yuzhe Qin, Bailin Wang, Huazhe Xu, Xiaolong Wang
-
A Data Source for Reasoning Embodied Agents, AAAI, 2023, arxiv, pdf, cication: 1
Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam
-
Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping, CoRL, 2023, arxiv, pdf, cication: 3
Adam Rashid, Satvik Sharma, Chung Min Kim, Justin Kerr, Lawrence Chen, Angjoo Kanazawa, Ken Goldberg
-
Thought Cloning: Learning to Think while Acting by Imitating Human Thinking, arXiv, 2306.00323, arxiv, pdf, cication: 4
Shengran Hu, Jeff Clune
-
Physically Grounded Vision-Language Models for Robotic Manipulation, arXiv, 2309.02561, arxiv, pdf, cication: 1
Jensen Gao, Bidipta Sarkar, Fei Xia, Ted Xiao, Jiajun Wu, Brian Ichter, Anirudha Majumdar, Dorsa Sadigh
-
BEVBert: Multimodal Map Pre-training for Language-guided Navigation, arXiv, 2212.04385, arxiv, pdf, cication: 11
Dong An, Yuankai Qi, Yangguang Li, Yan Huang, Liang Wang, Tieniu Tan, Jing Shao · [vln-bevbert - marsaki]
-
Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation, arXiv, 2308.07931, arxiv, pdf, cication: 3
William Shen, Ge Yang, Alan Yu, Jansen Wong, Leslie Pack Kaelbling, Phillip Isola · [qbitai] · [jiqizhixin]
-
Foundation Model based Open Vocabulary Task Planning and Executive System for General Purpose Service Robots, arXiv, 2308.03357, arxiv, pdf, cication: 1
Yoshiki Obinata, Naoaki Kanazawa, Kento Kawaharazuka, Iori Yanokura, Soonhyo Kim, Kei Okada, Masayuki Inaba
-
Learning to Model the World with Language, arXiv, 2308.01399, arxiv, pdf, cication: 4
Jessy Lin, Yuqing Du, Olivia Watkins, Danijar Hafner, Pieter Abbeel, Dan Klein, Anca Dragan · [mp.weixin.qq]
-
Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI, arXiv, 2308.05221, arxiv, pdf, cication: 1
Hangjie Shi, Leslie Ball, Govind Thattai, Desheng Zhang, Lucy Hu, Qiaozi Gao, Suhaila Shakiah, Xiaofeng Gao, Aishwarya Padmakumar, Bofei Yang
-
Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition, arXiv, 2307.14535, arxiv, pdf, cication: 9
Huy Ha, Pete Florence, Shuran Song
-
RT-2: Vision-Language-Action Models
· [qbitai] · [robotics-transformer2.github]
-
Towards A Unified Agent with Foundation Models, workshop on reincarnating reinforcement learning at iclr 2023, 2023, arxiv, pdf, cication: 9
Norman Di Palo, Arunkumar Byravan, Leonard Hasenclever, Markus Wulfmeier, Nicolas Heess, Martin Riedmiller
-
SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Robot Task Planning, arXiv, 2307.06135, arxiv, pdf, cication: 13
Krishan Rana, Jesse Haviland, Sourav Garg, Jad Abou-Chakra, Ian Reid, Niko Suenderhauf
-
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models, arXiv, 2307.05973, arxiv, pdf, cication: 35
Wenlong Huang, Chen Wang, Ruohan Zhang, Yunzhu Li, Jiajun Wu, Li Fei-Fei · [voxposer.github] · [mp.weixin.qq]
-
Decomposing the Generalization Gap in Imitation Learning for Visual Robotic Manipulation, arXiv, 2307.03659, arxiv, pdf, cication: 3
Annie Xie, Lisa Lee, Ted Xiao, Chelsea Finn
-
Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners, arXiv, 2307.01928, arxiv, pdf, cication: 24
Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley
-
Building Cooperative Embodied Agents Modularly with Large Language Models, arXiv, 2307.02485, arxiv, pdf, cication: 5
Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan
-
ChatGPT for Robotics: Design Principles and Model Abilities, microsoft auton. syst. robot. res, 2023, arxiv, pdf, cication: 111
Sai Vemprala, Rogerio Bonatti, Arthur Bucker, Ashish Kapoor · [PromptCraft-Robotics - microsoft]
-
Statler: State-Maintaining Language Models for Embodied Reasoning, arXiv, 2306.17840, arxiv, pdf, cication: 5
Takuma Yoneda, Jiading Fang, Peng Li, Huanyu Zhang, Tianchong Jiang, Shengjie Lin, Ben Picker, David Yunis, Hongyuan Mei, Matthew R. Walter
-
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction, arXiv, 2306.15724, arxiv, pdf, cication: 13
Zeyi Liu, Arpit Bahety, Shuran Song
-
ViNT: A Foundation Model for Visual Navigation, arXiv, 2306.14846, arxiv, pdf, cication: 10
Dhruv Shah, Ajay Sridhar, Nitish Dashora, Kyle Stachowicz, Kevin Black, Noriaki Hirose, Sergey Levine · [visualnav-transformer.github]
-
HomeRobot: Open-Vocabulary Mobile Manipulation, arXiv, 2306.11565, arxiv, pdf, cication: 5
Sriram Yenamandra, Arun Ramachandran, Karmesh Yadav, Austin Wang, Mukul Khanna, Theophile Gervet, Tsung-Yen Yang, Vidhi Jain, Alexander William Clegg, John Turner
-
Language to Rewards for Robotic Skill Synthesis, arXiv, 2306.08647, arxiv, pdf, cication: 27
Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik
-
SayTap: Language to Quadrupedal Locomotion, arXiv, 2306.07580, arxiv, pdf, cication: 4
Yujin Tang, Wenhao Yu, Jie Tan, Heiga Zen, Aleksandra Faust, Tatsuya Harada
-
ChessGPT: Bridging Policy Learning and Language Modeling, arXiv, 2306.09200, arxiv, pdf, cication: 2
Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, Jun Wang
-
Embodied Executable Policy Learning with Language-based Scene Summarization, arXiv, 2306.05696, arxiv, pdf, cication: -1
Jielin Qiu, Mengdi Xu, William Han, Seungwhan Moon, Ding Zhao
-
GPT Models Meet Robotic Applications: Co-Speech Gesturing Chat System, arXiv, 2306.01741, arxiv, pdf, cication: -1
Naoki Wake, Atsushi Kanehira, Kazuhiro Sasabuchi, Jun Takamatsu, Katsushi Ikeuchi
-
Reflexion: Language Agents with Verbal Reinforcement Learning, arXiv, 2303.11366, arxiv, pdf, cication: 110
Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, Shunyu Yao
-
ReAct: Synergizing Reasoning and Acting in Language Models, arXiv, 2210.03629, arxiv, pdf, cication: 293
Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
-
Co-LLM-Agents - UMass-Foundation-Model
Source code for the paper "Building Cooperative Embodied Agents Modularly with Large Language Models" · [qbitai]
- Awesome-Robotics-Foundation-Models - robotics-survey