Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents (+Author)

#gpt3 #embodied #planning

In this video: Paper explanation, followed by first author interview with Wenlong Huang.

Large language models contain extraordinary amounts of world knowledge that can be queried in various ways. But their output format is largely uncontrollable. This paper investigates the VirtualHome environment, which expects a particular set of actions, objects, and verbs to be used. Turns out, with proper techniques and only using pre-trained models (no fine-tuning), one can translate unstructured language model outputs into the structured grammar of the environment. This is potentially very useful anywhere where the models’ world knowledge needs to be provided in a particular structured format.

OUTLINE:
0:00 - Intro & Overview
2:45 - The VirtualHome environment
6:25 - The problem of plan evaluation
8:40 - Contributions of this paper
16:40 - Start of interview
24:00 - How to use language models with environments?
34:00 - What does model size matter?
