Journal: International Journal of Computer Trends and Technology

Analyzing Word Error Rate on Optical Character Recognition (OCR) for Myanmar Printed Document Image


Abstract

In Myanmar, printed documents are written in the Myanmar language, and it is often useful to convert such documents into editable text. This paper describes an effective recognition method for Myanmar printed document images and calculates its error rate. The Myanmar language contains many words, most of them visually similar, so the accuracy of an Optical Character Recognition (OCR) system for Myanmar may be low, especially for small fonts. To obtain a more accurate system, the input image is enhanced by removing noise and making some corrections for variants. Character images are isolated using connected component analysis, which is applied to the characters that projection alone segments wrongly. Recognition of the character images uses a neural network classifier, and the paper proposes using word error rate (WER) to obtain more detail about the actual translation errors in the generated output. WER is investigated for automatic error analysis using a dynamic programming algorithm such as Levenshtein distance over the segmentation. This gives a better overview of the nature of the translation errors. Finally, the proposed algorithms were tested on a variety of Myanmar printed documents, and the experimental results indicate that the methods can reduce the segmentation error rate as well as the translation error rate.
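As a rough illustration of the WER measure mentioned in the abstract, the sketch below computes a Levenshtein distance over word sequences with dynamic programming and divides it by the reference length. It is not the authors' implementation. The function names `edit_distance` and `word_error_rate` are hypothetical, and the sketch assumes both texts have already been segmented into words separated by whitespace, which Myanmar text does not provide by default.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, via dynamic programming."""
    # prev[j] is the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count.

    Assumption: words are separated by whitespace in both strings.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # Placeholder tokens stand in for segmented words: one substitution
    # (w2 -> wX) and one deletion (w4) against four reference words.
    print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))  # 0.5
```

Because insertions are counted against the reference length, WER can exceed 1.0 when the OCR output contains many spurious words.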
