IEEE MultiMedia

A Crossmodal Approach to Multimodal Fusion in Video Hyperlinking



Abstract

With the recent resurgence of neural networks and the proliferation of massive amounts of unlabeled multimodal data, recommendation systems and multimodal retrieval systems based on continuous representation spaces and deep learning methods are attracting great interest. Multimodal representations are typically obtained with autoencoders that reconstruct multimodal data. In this article, we describe an alternative method for performing high-level multimodal fusion that leverages crossmodal translation by means of symmetrical encoders cast into a bidirectional deep neural network (BiDNN). Using the lessons learned from multimodal retrieval, we present a BiDNN-based system that performs video hyperlinking and recommends interesting video segments to a viewer. Results established on TRECVID's 2016 video hyperlinking benchmarking initiative show that our method obtained the best score, thus defining the state of the art.
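The core idea in the abstract — crossmodal translation with symmetrical encoders whose central weights are tied, and fusion by concatenating the resulting hidden representations — can be illustrated with a toy, untrained numpy sketch. All dimensions, weight names, and the two-modality setup below are illustrative assumptions, not the authors' actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: two modalities and a shared hidden space.
D_TEXT, D_VISUAL, D_HIDDEN = 8, 6, 4

# Each modality gets its own input projection; the central projection is
# shared (tied) between the two translation directions, which is what
# makes the network bidirectional/symmetric in the BiDNN sense.
W_text_in = rng.normal(size=(D_TEXT, D_HIDDEN)) * 0.1
W_vis_in = rng.normal(size=(D_VISUAL, D_HIDDEN)) * 0.1
W_shared = rng.normal(size=(D_HIDDEN, D_HIDDEN)) * 0.1  # tied central layer

def embed_text(x_text):
    # text branch: input projection, then the shared central layer
    return np.tanh(np.tanh(x_text @ W_text_in) @ W_shared)

def embed_visual(x_vis):
    # visual branch reuses the transpose of the same central weights
    return np.tanh(np.tanh(x_vis @ W_vis_in) @ W_shared.T)

def fuse(x_text, x_vis):
    # multimodal embedding: concatenation of both central representations
    return np.concatenate([embed_text(x_text), embed_visual(x_vis)])

emb = fuse(rng.normal(size=D_TEXT), rng.normal(size=D_VISUAL))
print(emb.shape)  # (8,)
```

In the actual system the weights would be trained with a crossmodal reconstruction objective (translate each modality into the other through the shared layer); the sketch only shows the symmetric, weight-tied forward pass and the concatenation-based fusion.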

