Conference: International Congress on Image and Signal Processing, BioMedical Engineering and Informatics

Segmentation-Free Multi-Font Printed Manchu Word Recognition Using Deep Convolutional Features and Data Augmentation


Abstract

Segmentation-based Manchu recognition methods depend on precise character segmentation, which is difficult to achieve because of the complex spelling rules of the Manchu language and the existence of multiple Manchu fonts. To avoid the influence of incorrect segmentation, this work proposes segmentation-free recognition, which recognizes whole Manchu words instead of individual Manchu characters. In addition, an end-to-end 9-layer convolutional neural network is proposed to automatically extract deep hierarchical features from Manchu word images. The proposed recognition model is applied to Manchu words in 12 Manchu fonts to evaluate its multi-font recognition ability. Deep neural networks need massive amounts of training data, whereas Manchu is an endangered language with little available document data. To resolve this conflict, this work first builds a Manchu dataset prototype and a multi-font Manchu word test set, and then designs a data augmentation system to generate synthetic training data. The data augmentation system contains 7 generation methods, including character structure distortion and image quality transformation. Experiments demonstrate that the proposed convolutional neural network achieves a new state-of-the-art accuracy on multi-font printed Manchu words. Across the printed Manchu fonts, the highest recognition accuracy is 0.95, the lowest is 0.88, and the average is 0.91. Experiments also demonstrate that the proposed data augmentation system is an effective way to address the insufficient-data problem.
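
The abstract names two families of augmentation (character structure distortion and image quality transformation) without listing the 7 concrete methods. The following is a minimal sketch of one plausible operation from each family, not the authors' system: a horizontal shear stands in for structure distortion and additive Gaussian noise stands in for quality degradation. The function names, parameter ranges, and the assumption of grayscale word images on a white background are all illustrative choices that do not come from the paper.

```python
import numpy as np


def shear_horizontal(img: np.ndarray, factor: float) -> np.ndarray:
    """Shift each row sideways in proportion to its distance from the
    centre row. Illustrative stand-in for 'structure distortion'."""
    h, w = img.shape
    out = np.full_like(img, 255)  # assume a white background
    for y in range(h):
        shift = int(round(factor * (y - h / 2)))
        src = np.arange(w) - shift          # source column for each target column
        valid = (src >= 0) & (src < w)      # drop columns that fall outside the image
        out[y, valid] = img[y, src[valid]]
    return out


def add_gaussian_noise(img: np.ndarray, sigma: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Add zero-mean Gaussian noise. Illustrative stand-in for
    'image quality transformation'."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def augment(img: np.ndarray, n: int, seed: int = 0) -> list:
    """Generate n synthetic variants of one grayscale word image.
    The parameter ranges below are arbitrary, hypothetical values."""
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n):
        factor = rng.uniform(-0.2, 0.2)
        sigma = rng.uniform(0.0, 15.0)
        samples.append(add_gaussian_noise(shear_horizontal(img, factor), sigma, rng))
    return samples
```

Calling `augment(word_image, 100)` on a single rendered word would yield 100 perturbed copies that keep the same word label, which is the general way synthetic data can compensate for a small source corpus.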
