Conference: Internet Imaging VII; Electronic Imaging Science and Technology

TRECVID: The Utility of a Content-based Video Retrieval Evaluation

Abstract

TRECVID, an annual retrieval evaluation benchmark organized by NIST, encourages research in information retrieval from digital video. TRECVID benchmarking covers both interactive and manual searching by end users, as well as the benchmarking of supporting technologies, including shot boundary detection, extraction of semantic features, and the automatic segmentation of TV news broadcasts. Evaluations done in the context of the TRECVID benchmarks show that, in general, speech transcripts and annotations provide the single most important clue for successful retrieval. However, automatically finding the individual images is still a tremendous and unsolved challenge. The evaluations repeatedly found that none of the multimedia analysis and retrieval techniques provides a significant benefit over retrieval using only textual information, such as automatic speech recognition transcripts or closed captions. In interactive systems, we do find significant differences among the top systems, indicating that interfaces can make a huge difference for effective video/image search. For interactive tasks, efficient interfaces require few key clicks but display large numbers of images for visual inspection by the user. Text search generally finds the right context region in the video, but selecting specific relevant images requires good interfaces for easily browsing the storyboard pictures. Overall, TRECVID has motivated the video retrieval community to be honest about what we don't know how to do well (sometimes through painful failures), and has focused us on the actual task of video retrieval, as opposed to flashy demos based on technological capabilities.
