Word recognition in a segmentation-free approach to OCR

Abstract

Segmentation is a key step in current OCR systems. It has been estimated that half the errors in character recognition are due to segmentation. A novel approach that performs OCR without the segmentation step was developed. The approach starts by extracting significant geometric features from the input document image of the page. Each feature then votes for the character that could have generated that feature. Thus, even if some of the features are occluded or lost due to degradation, the remaining features can successfully identify the character. In extreme cases, the degradation may be severe enough to prevent recognition of some of the characters in a word. In such cases, a lexicon-based word recognition technique is used to resolve ambiguity. Inexact matching and probabilistic evaluation used in the technique make it possible to identify the correct word by detecting a partial set of characters. The authors first present an overview of their segmentation-free OCR system and then focus on the word recognition technique. Preliminary experimental results show that this is a very promising approach.
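The lexicon-based step can be illustrated with a small sketch. The abstract does not give the authors' matching rule or probability model, so everything below is an assumption made for illustration: the function names `score_word` and `recognize_word`, the per-position character probabilities, the `miss_prob` floor for undetected characters, and the simplification that the word length is known.

```python
from math import log

def score_word(candidates, word, miss_prob=0.05):
    """Log-score of `word` against per-position character evidence.

    candidates: list of dicts mapping character -> probability. An empty
    dict stands for a position where no character was recognised.
    miss_prob: hypothetical floor probability for a character that has
    no supporting evidence (an assumption, not taken from the paper).
    """
    if len(candidates) != len(word):
        return float("-inf")
    total = 0.0
    for dist, ch in zip(candidates, word):
        # A missing or unreadable character is penalised, not fatal,
        # which is what makes the match inexact.
        total += log(max(dist.get(ch, 0.0), miss_prob))
    return total

def recognize_word(candidates, lexicon, miss_prob=0.05):
    """Return the lexicon entry with the highest log-score."""
    return max(lexicon, key=lambda w: score_word(candidates, w, miss_prob))

# Second character is unreadable; third is ambiguous between "u" and "o".
candidates = [{"h": 0.9}, {}, {"u": 0.6, "o": 0.4}, {"s": 0.8}, {"e": 0.9}]
lexicon = ["house", "horse", "mouse", "hose"]
print(recognize_word(candidates, lexicon))  # prints "house"
```

Here the word is recovered even though one position produced no character at all, which mirrors the abstract's claim that a partial set of characters can be enough. A real segmentation-free system would not know character positions or word length in advance, so this sketch only shows the scoring idea.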
