International Conference on Frontiers in Handwriting Recognition

Applications of Recurrent Neural Network Language Model in Offline Handwriting Recognition and Word Spotting

Abstract

The recurrent neural network language model (RNNLM) is a discriminative, non-Markovian model that can capture long-span word history in natural language. It has proven successful in automatic speech recognition and machine translation. In this work, we applied the RNNLM to the n-best rescoring stage of the state-of-the-art BBN Byblos OCR (optical character recognition) system for handwriting recognition. With RNNLM scores as additional features, our system achieved a significant improvement (p < 0.001), a 3.5% relative reduction in OCR word error rate, compared with a strong baseline that uses an n-gram language model for rescoring. We have also developed a novel method to integrate the OCR n-best RNNLM scores into the word posterior probabilities in OCR confusion networks, which resulted in consistent, observable improvements in word spotting for OCR'ed handwritten documents, as measured by both mean average precision (MAP) and detection-error tradeoff (DET) curves.
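
The n-best rescoring step described above can be illustrated with a minimal sketch. It assumes a log-linear combination of per-hypothesis feature scores, which is a common way to rescore n-best lists; the abstract does not state the exact combination used in the Byblos system. The feature names (`ocr`, `ngram`, `rnnlm`), the weights, the score values, and the `Hypothesis` and `rescore` names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One entry of an n-best list with log-domain feature scores."""
    text: str
    scores: dict  # feature name -> log score (hypothetical values below)


def rescore(nbest, weights):
    """Sort hypotheses by the weighted sum of their feature scores, best first.

    Only features named in `weights` contribute, so a baseline without
    the RNNLM feature is expressed by leaving `rnnlm` out of `weights`.
    """
    def total(hyp):
        return sum(w * hyp.scores[name] for name, w in weights.items())
    return sorted(nbest, key=total, reverse=True)


# Invented two-entry n-best list: the second hypothesis has a slightly
# better OCR score, but the RNNLM strongly prefers the first one.
nbest = [
    Hypothesis("the cat sat", {"ocr": -10.0, "ngram": -8.0, "rnnlm": -6.0}),
    Hypothesis("the cat sit", {"ocr": -9.5, "ngram": -8.2, "rnnlm": -9.0}),
]

baseline = {"ocr": 1.0, "ngram": 1.0}                   # n-gram rescoring only
with_rnnlm = {"ocr": 1.0, "ngram": 1.0, "rnnlm": 0.5}   # RNNLM as an extra feature

best_baseline = rescore(nbest, baseline)[0].text
best_rnnlm = rescore(nbest, with_rnnlm)[0].text
```

With these invented scores, the baseline ranks "the cat sit" first, while adding the RNNLM feature moves "the cat sat" to the top. In practice the feature weights would be tuned on held-out data rather than set by hand.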