Text recognition in bilingual machine printed image documents — Challenges and survey: A review on principal and crucial concerns of text extraction in bilingual printed images

Abstract

In the digital era, accurate text identification and recognition has become a key area of image document analysis and processing. Textual data is identified and extracted from images ranging from simple to complex, and from documents in monolingual, bilingual, trilingual, or multilingual scripts. This paper focuses on the challenges and complex issues of text recognition in bilingual machine-printed document images. It identifies the major factors that become bottlenecks to correct and accurate recognition. It then presents a hierarchical structure of three Classification Schemes (CS) A, B, and C for bilingual printed document images, where A, B, and C relate to content form, image mining, and language or script determination, respectively. Some weaknesses in how OCR works are also discussed. To analyze existing algorithms and methods, a survey is presented that focuses on their critical issues and proposed solutions, along with the constraints and errors found during text processing. The survey brings out the shortcomings and limitations of the different methods. The specifications and factors found in these techniques are presented as their characteristics and compared with one another to distinguish them. It is observed that most existing methods are based on classification schemes CS A-A1, C-C1, and C2, and are designed for script identification on 300 dpi grayscale images using an SVM classifier.

Bibliographic record

Source
International Conference on Intelligent Systems and Control | 2016 | pp. 1-8 | 8 pages total
Authors
Shalini Puri; Satya Prakash Singh
Original format: PDF
Keywords
Two dimensional displays; Manganese

Similar literature

1. Local features-based script recognition from printed bilingual document images [J]. S. Abirami, D. Manjula. International Journal of Computer Applications in Technology, 2010, No. 4.

2. Text string extraction from images of colour-printed documents [J]. Suen H.-M., Wang J.-F. IEE Proceedings. Part K, 1996, No. 4.

3. Machine printed text and handwriting identification in noisy document images [J]. Yefeng Zheng, Huiping Li, Doermann D. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, No. 3.

4. Text recognition in bilingual machine printed image documents — Challenges and survey: A review on principal and crucial concerns of text extraction in bilingual printed images [C]. Shalini Puri, Satya Prakash Singh. International Conference on Intelligent Systems and Control, 2016.

5. Extraction of Text Objects in Image and Video Documents [D]. Zhang, Jing. 2012.

6. Cursive-Text: A Comprehensive Dataset for End-to-End Urdu Text Recognition in Natural Scene Images [O]. Asghar Ali Chandio, Md. Asikuzzaman, Mark Pickering. 2020.

7. Extraction of Text Areas in Printed Document Images [O]. Jean Duong, Myriam Côté, Hubert Emptoz. 2001.

8. Machine Printed Text and Handwriting Identification in Noisy Document Images [R]. Zheng, Y., Li, H., Doermann, D. 2003.
