Journal: Pattern Recognition Letters

Video text recognition using sequential Monte Carlo and error voting methods

Abstract

This paper addresses the segmentation and recognition of text embedded in video, working from the text image sequences extracted by a text detection module. To this end, we propose a probabilistic algorithm based on Bayesian adaptive thresholding and Monte Carlo sampling. The algorithm approximates the posterior distribution of the segmentation thresholds of text pixels in an image by a set of weighted samples. The sample set is initialized by applying a classical segmentation algorithm to the first video frame and is then refined by random sampling under a temporal Bayesian framework. One important contribution of the paper is to show that, with the proposed methodology, the likelihood of a segmentation parameter sample can be estimated directly from the text recognition result it induces, which is what matters for the task, rather than from a classification criterion or a visual quality criterion computed on the segmentation map. As a second contribution, we propose to align the text recognition results from high-confidence samples gathered over time and to compose a final result at the character level using an error voting technique, ROVER (Recognizer Output Voting Error Reduction). Experiments are conducted on a two-hour video database. Character recognition rates higher than 93% and word recognition rates higher than 90% are achieved, which are 4% and 3% higher, respectively, than those of state-of-the-art methods applied to the same database.
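
To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes a single scalar threshold per sample, Gaussian random-walk dynamics between frames, and a caller-supplied `likelihood(threshold, frame)` function standing in for a score derived from the recognition result. The names `smc_step` and `vote_characters` and the `noise` parameter are hypothetical. The voting function is a simplification: ROVER first aligns the candidate strings, whereas this sketch assumes they are already aligned and of equal length.

```python
import random
from collections import Counter


def smc_step(samples, weights, frame, likelihood, rng, noise=5.0):
    """One sequential Monte Carlo update of threshold samples for a new frame.

    samples    -- list of candidate segmentation thresholds (floats in [0, 255])
    weights    -- non-negative weights for the samples, not all zero
    likelihood -- hypothetical callable (threshold, frame) -> non-negative score,
                  e.g. a confidence derived from the recognition output
    """
    # 1. Resample: draw thresholds in proportion to their previous weights.
    survivors = rng.choices(samples, weights=weights, k=len(samples))
    # 2. Propagate: perturb each threshold (assumed Gaussian random walk),
    #    clamped to the valid grey-level range.
    proposed = [min(255.0, max(0.0, t + rng.gauss(0.0, noise))) for t in survivors]
    # 3. Reweight: score each propagated threshold on the new frame.
    raw = [likelihood(t, frame) for t in proposed]
    total = sum(raw)
    if total <= 0:
        # Degenerate case: fall back to uniform weights.
        return proposed, [1.0 / len(proposed)] * len(proposed)
    return proposed, [w / total for w in raw]


def vote_characters(strings):
    """Per-position majority vote over equal-length, already aligned strings."""
    return "".join(Counter(chars).most_common(1)[0][0] for chars in zip(*strings))
```

For example, `vote_characters(["HELLO", "HELL0", "HELLO"])` returns `"HELLO"`: the single misread character is outvoted by the two other candidates at that position. In a fuller pipeline, the strings passed to the vote would be the recognition outputs of the highest-weight samples collected across frames.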
