王浩岑 郑朝文 赵卓峰
School of Information, North China University of Technology; Key Laboratory of Large-Scale Stream Data, Beijing 100144
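Abstract: Large language models (LLMs), trained on large volumes of text, accumulate rich patterns of language use and world knowledge, and have become key tools for understanding and generating natural language. In robotics, how to turn this knowledge into task-planning capability is an active research topic. With suitable prompts, a pretrained language model can decompose an abstract task into actions that a robot can execute. LLMs can generate such action plans without any additional training, and when the prompt is sufficiently informative they can translate a high-level task into logically coherent action steps. This indicates the potential of LLMs for robot action planning. However, the executability of these plans in the real world remains a challenge, so converting language-model knowledge into robot actions requires further research. This paper proposes a method based on in-context learning that leverages the capabilities of large models to explore and improve on previous approaches, with the aim of increasing the accuracy and executability of robot task planning.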
Keywords: in-context learning; large models; task planning
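
The prompting idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The example tasks, the action list, and the helper names `build_prompt` and `ground_step` are all hypothetical, and the string-similarity matching is a simple stand-in for whatever grounding technique a real system would use. The sketch assembles a few-shot prompt from worked examples and then maps a free-form generated step onto the closest action the robot is assumed to support.

```python
from difflib import SequenceMatcher

# Hypothetical in-context examples: a high-level task paired with action steps.
EXAMPLES = [
    ("Make coffee", ["walk to kitchen", "grab mug", "switch on coffee maker"]),
    ("Watch TV", ["walk to living room", "switch on television", "sit on sofa"]),
]

# Hypothetical set of actions the robot is assumed to be able to execute.
ADMISSIBLE_ACTIONS = [
    "walk to kitchen", "walk to living room", "grab mug",
    "switch on coffee maker", "switch on television", "sit on sofa",
    "open fridge",
]


def build_prompt(task, examples=EXAMPLES):
    """Assemble a few-shot prompt: worked examples first, then the new task."""
    blocks = []
    for name, steps in examples:
        lines = [f"Task: {name}"]
        lines += [f"Step {i}: {s}" for i, s in enumerate(steps, start=1)]
        blocks.append("\n".join(lines))
    # The prompt ends with an open "Step 1:" for the model to continue.
    blocks.append(f"Task: {task}\nStep 1:")
    return "\n\n".join(blocks)


def ground_step(step, actions=ADMISSIBLE_ACTIONS):
    """Map a free-form step to the most similar admissible action.

    Character-level similarity is an assumption made for illustration only.
    """
    return max(
        actions,
        key=lambda a: SequenceMatcher(None, step.lower(), a).ratio(),
    )


prompt = build_prompt("Get a drink from the fridge")
grounded = ground_step("Open the fridge")
```

In this sketch the prompt would be sent to a pretrained LLM without any fine-tuning, and each generated step would be passed through `ground_step` so that only actions from the admissible set reach the robot.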