IEEE/CVF Conference on Computer Vision and Pattern Recognition

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments



Abstract

A robot that can carry out a natural-language instruction has been a dream since before the Jetsons cartoon series imagined a life of leisure mediated by a fleet of attentive robot helpers. It is a dream that remains stubbornly distant. However, recent advances in vision and language methods have made incredible progress in closely related areas. This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering. Both tasks can be interpreted as visually grounded sequence-to-sequence translation problems, and many of the same methods are applicable. To enable and encourage the application of vision and language methods to the problem of interpreting visually-grounded navigation instructions, we present the Matterport3D Simulator - a large-scale reinforcement learning environment based on real imagery [11]. Using this simulator, which can in future support a range of embodied vision and language tasks, we provide the first benchmark dataset for visually-grounded natural language navigation in real buildings - the Room-to-Room (R2R) dataset.
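The "visually grounded sequence-to-sequence translation" framing above can be illustrated with a deliberately minimal sketch: an instruction (a token sequence) is decoded into a sequence of discrete navigation actions. Everything here is illustrative and not from the paper - the keyword table, function names, and action set are assumptions; a real R2R agent would instead use a learned encoder-decoder that also conditions each action on the current visual observation.

```python
# Toy sketch of instruction-following as sequence-to-sequence decoding.
# All names and the keyword->action table are hypothetical placeholders,
# not the method or API of the Matterport3D Simulator / R2R paper.

ACTIONS = ("forward", "left", "right", "stop")

KEYWORD_TO_ACTION = {
    "ahead": "forward",
    "straight": "forward",
    "left": "left",
    "right": "right",
    "stop": "stop",
}


def follow_instruction(instruction: str, max_steps: int = 10) -> list[str]:
    """Greedy 'decoder': emit one action per recognized instruction
    keyword, in order, stopping at 'stop' or after max_steps actions.
    A trained model would replace this lookup with attention over both
    the instruction encoding and visual features at each step."""
    actions = []
    for token in instruction.lower().split():
        action = KEYWORD_TO_ACTION.get(token)
        if action:
            actions.append(action)
        if action == "stop" or len(actions) >= max_steps:
            break
    if not actions or actions[-1] != "stop":
        actions.append("stop")  # episodes must terminate explicitly
    return actions


print(follow_instruction("go straight then turn left and stop"))
```

The point of the sketch is only the shape of the problem: variable-length language in, variable-length action sequence out, with an explicit stop action - the same interface a learned visually grounded agent exposes to the simulator.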
