IEEE Winter Conference on Applications of Computer Vision

Weakly Supervised Deep Reinforcement Learning for Video Summarization With Semantically Meaningful Reward



Abstract

Conventional unsupervised video summarization algorithms are usually developed in a frame-level clustering manner. For example, frame-level diversity and representativeness are two typical clustering criteria used for unsupervised reinforcement learning-based video summarization. Inspired by recent progress in video representation techniques, we further introduce the similarity of video representations to construct a semantically meaningful reward for this task. We consider that a good summarization should also be semantically identical to its original source, which means that the semantic similarity can be regarded as an additional criterion for summarization. By combining a novel video semantic reward with other unsupervised rewards for training, we can easily upgrade an unsupervised reinforcement learning-based video summarization method to its weakly supervised version. In practice, we first train a video classification sub-network (VCSN) to extract video semantic representations based on a category-labeled video dataset. Then we fix this VCSN and train a summary generation sub-network (SGSN) on unlabeled video data in a reinforcement learning manner. Experimental results demonstrate that our work significantly surpasses other unsupervised and even supervised methods. To the best of our knowledge, our method achieves state-of-the-art performance in terms of the correlation coefficients, Kendall's τ and Spearman's ρ.
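The abstract's central idea is to add a semantic-similarity term to the usual unsupervised rewards (diversity and representativeness) when training the summarizer with reinforcement learning. The sketch below is a minimal illustration of that reward combination, not the authors' implementation: the cosine-similarity form of the semantic reward, the specific diversity and representativeness formulas, the weight w_sem, and the assumption that the frozen VCSN maps a (T, D) frame-feature sequence to a single video-level embedding are all assumptions made for illustration.

```python
# Minimal sketch (not the authors' released code) of mixing a semantic reward
# with diversity and representativeness rewards for RL-based video
# summarization. `pick_mask` is assumed to be a boolean (T,) tensor marking
# the frames the SGSN selected for the summary.
import torch
import torch.nn.functional as F


def semantic_reward(vcsn, frame_feats, pick_mask):
    """Cosine similarity between the VCSN embeddings of the full video
    and of the selected summary frames (hypothetical formulation)."""
    full_emb = vcsn(frame_feats)             # video-level embedding of all frames
    summ_emb = vcsn(frame_feats[pick_mask])  # video-level embedding of the summary
    return F.cosine_similarity(full_emb, summ_emb, dim=-1).squeeze()


def diversity_reward(frame_feats, pick_mask):
    """One minus the mean pairwise cosine similarity among selected frames."""
    sel = frame_feats[pick_mask]
    n = sel.size(0)
    if n < 2:
        return frame_feats.new_tensor(0.0)
    sim = F.cosine_similarity(sel.unsqueeze(1), sel.unsqueeze(0), dim=-1)
    off_diag = sim.sum() - sim.diagonal().sum()
    return 1.0 - off_diag / (n * (n - 1))


def representativeness_reward(frame_feats, pick_mask):
    """exp(-mean distance) from each frame to its nearest selected frame."""
    sel = frame_feats[pick_mask]
    if sel.size(0) == 0:
        return frame_feats.new_tensor(0.0)
    dists = torch.cdist(frame_feats, sel)    # (T, K) pairwise distances
    return torch.exp(-dists.min(dim=1).values.mean())


def total_reward(vcsn, frame_feats, pick_mask, w_sem=1.0):
    """Sum of the unsupervised rewards and the weighted semantic reward,
    used as the return signal for training the SGSN."""
    return (diversity_reward(frame_feats, pick_mask)
            + representativeness_reward(frame_feats, pick_mask)
            + w_sem * semantic_reward(vcsn, frame_feats, pick_mask))
```

In the setup described by the abstract, the VCSN is pre-trained on a category-labeled dataset and then frozen, so only the SGSN's selection policy is updated from this combined reward; the policy-gradient details of that update are not specified in the abstract and are left out of the sketch.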
