Journal: Pattern Recognition: The Journal of the Pattern Recognition Society
A database of unconstrained Vietnamese online handwriting and recognition experiments by recurrent neural networks

Abstract

We present our efforts to create a database of unconstrained Vietnamese online handwritten text sampled from pen-based devices. The database stores handwritten text for paragraphs, lines, words, and characters, with ground truth associated with every paragraph and line. We give a detailed statistical analysis of the handwritten text in this database and describe recognition experiments using several recent methods, including the Bidirectional Long Short-Term Memory (BLSTM) network. Overall, our database contains over 480,000 strokes from more than 380,000 characters, making it, at present, the largest database of Vietnamese online handwritten text. Although Vietnamese script is based on a fixed alphabet, recognizing Vietnamese online handwritten text is challenging because of its many diacritical marks, which usually result in delayed strokes during writing. We designed and implemented an online handwriting-collection tool to gather data, as well as a line-segmentation tool and a delayed-stroke-detection tool to analyze the collected handwritten text. We also conducted a statistical analysis based on the writer profiles. We applied a number of state-of-the-art recognition methods to unconstrained Vietnamese handwriting to evaluate their performance, including the BLSTM network, an efficient architecture derived from the Recurrent Neural Network (RNN) that is often applied to sequence labeling problems. The BLSTM network achieved 90% character recognition accuracy, despite many long sequences with several delayed strokes. Our database is openly accessible for research, to stimulate the development of handwriting recognition research. © 2018 Elsevier Ltd. All rights reserved.
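To make the notion of a delayed stroke concrete, here is a minimal Python sketch of one simple left-to-right heuristic. It is not the authors' delayed-stroke-detection tool, whose method the abstract does not describe. The function name `find_delayed_strokes`, the `tolerance` parameter, and the representation of a stroke as a list of (x, y) points are all assumptions made for illustration. A stroke is flagged when it lies entirely to the left of the rightmost point reached by earlier strokes, as happens when a writer returns to add a diacritical mark after finishing the word.

```python
from typing import List, Tuple

Point = Tuple[float, float]   # (x, y) pen coordinate; assumed representation
Stroke = List[Point]          # one pen-down to pen-up trace


def find_delayed_strokes(strokes: List[Stroke], tolerance: float = 0.0) -> List[int]:
    """Return indices of strokes flagged as delayed by a left-to-right rule.

    A stroke is flagged when its rightmost x-coordinate is more than
    `tolerance` to the left of the rightmost x-coordinate reached by any
    earlier stroke. This is an illustrative heuristic only.
    """
    delayed: List[int] = []
    frontier = float("-inf")  # rightmost x seen so far in writing order
    for index, stroke in enumerate(strokes):
        if not stroke:
            continue  # skip empty traces
        right_edge = max(x for x, _ in stroke)
        if right_edge < frontier - tolerance:
            delayed.append(index)
        frontier = max(frontier, right_edge)
    return delayed


# Two base strokes written left to right, then a mark added back over the first.
example_strokes: List[Stroke] = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(12.0, 0.0), (22.0, 0.0)],
    [(3.0, 8.0), (7.0, 9.0)],
]
print(find_delayed_strokes(example_strokes))
```

A rule like this ignores vertical position and overlapping strokes, so real handwriting would likely need a tuned `tolerance` or additional features.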