Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

Keyword spotting for self-training of BLSTM NN based handwriting recognition systems

Abstract

The automatic transcription of unconstrained continuous handwritten text requires well-trained recognition systems. The semi-supervised paradigm uses not only labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or no cost, so it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial: the elements to be added to the training set must be selected with great care. In this paper, we propose to use a bidirectional long short-term memory neural network handwriting recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, for both modern and historical handwriting, and demonstrates the benefits of using keyword spotting over previously published self-training schemes.
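
The generic self-training loop described in the abstract (recognize unlabeled data, keep only high-confidence results, re-train) can be sketched as follows. This is a minimal illustration, not the authors' method: it implements neither the BLSTM recognizer nor keyword-spotting-based selection. The names `self_train` and `train_nearest_centroid`, the threshold value, and the one-dimensional toy features are all assumptions made for the example, with a nearest-centroid classifier standing in for the recognizer.

```python
from typing import Callable, List, Tuple

Labeled = Tuple[float, str]                      # (toy 1-D feature, label)
Predictor = Callable[[float], Tuple[str, float]]  # feature -> (label, confidence)


def train_nearest_centroid(data: List[Labeled]) -> Predictor:
    """Hypothetical stand-in recognizer: one centroid per label."""
    sums, counts = {}, {}
    for x, y in data:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: sums[y] / counts[y] for y in sums}

    def predict(x: float) -> Tuple[str, float]:
        ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
        best = ranked[0]
        if len(ranked) == 1:
            return best, 1.0
        d1 = abs(x - centroids[best])
        d2 = abs(x - centroids[ranked[1]])
        # Confidence is high when the runner-up centroid is much farther away.
        conf = 1.0 if d1 + d2 == 0 else d2 / (d1 + d2)
        return best, conf

    return predict


def self_train(
    labeled: List[Labeled],
    unlabeled: List[float],
    train: Callable[[List[Labeled]], Predictor],
    threshold: float = 0.8,   # assumed value, chosen only for this example
    max_rounds: int = 5,
) -> Tuple[Predictor, List[Labeled]]:
    """Repeatedly add confidently recognized samples and re-train."""
    training_set = list(labeled)
    pool = list(unlabeled)
    predict = train(training_set)
    for _ in range(max_rounds):
        accepted, remaining = [], []
        for x in pool:
            label, conf = predict(x)
            if conf >= threshold:
                accepted.append((x, label))   # pseudo-labeled sample
            else:
                remaining.append(x)           # stays unlabeled for now
        if not accepted:
            break                             # nothing new to learn from
        training_set.extend(accepted)
        pool = remaining
        predict = train(training_set)         # re-train on the enlarged set
    return predict, training_set


if __name__ == "__main__":
    model, grown = self_train(
        labeled=[(0.0, "a"), (10.0, "b")],
        unlabeled=[1.0, 9.0, 5.0],
        train=train_nearest_centroid,
    )
    print(len(grown), model(2.0)[0])
```

The selection step is the critical part: a wrongly accepted sample enters the training set with an incorrect label and is reinforced in later rounds, which is why the abstract stresses careful selection. In this sketch the ambiguous sample at 5.0 never passes the threshold and is left out.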
