CHARACTER SHAPE FEATURE DICTIONARY PRODUCING APPARATUS, IMAGE DOCUMENT PROCESSING APPARATUS COMPRISING THE SAME, CHARACTER SHAPE FEATURE DICTIONARY PRODUCING PROGRAM, RECORDING MEDIUM WITH CHARACTER SHAPE FEATURE DICTIONARY PRODUCING PROGRAM RECORDED THEREON, IMAGE DOCUMENT PROCESSING PROGRAM, AND RECORDING MEDIUM WITH IMAGE DOCUMENT PROCESSING PROGRAM RECORDED THEREON

Abstract

PROBLEM TO BE SOLVED: To provide a character shape feature dictionary producing apparatus in which search accuracy is enhanced by improving the character feature extraction method, and to provide an image document processing apparatus comprising the same.

SOLUTION: An image of a character string composed of M characters is cut out of an image document and divided into individual character images, and the image feature of each character image is extracted. Based on that feature, N (N > 1, integer) character images are selected as candidate characters, in descending order of matching degree, from a character shape feature dictionary that stores the image features of character images character by character. A first M×N index matrix corresponding to the number of cut-out character strings is then produced. A second index matrix is produced by applying lexical analysis with a predetermined language model to the candidate character string composed of the candidate characters that form the first line of the first index matrix, so that the candidate string is adjusted to an understandable character string. This second index matrix is used for searching.

COPYRIGHT: (C)2009, JPO&INPIT
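
The two-stage construction described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the feature vectors are toy 2-D tuples, matching degree is approximated by squared Euclidean distance, the "predetermined language model" is reduced to a plain word list, and the function names `build_index_matrix` and `adjust_with_lexicon` are hypothetical. Each matrix row holds the N candidates for one of the M character positions, and the string read from the top candidates stands in for what the abstract calls the first line of the first index matrix.

```python
from itertools import product


def build_index_matrix(char_features, dictionary, n):
    """Return an M x N matrix: row i lists the n dictionary characters
    whose stored feature is closest to character image i, best first.
    Squared Euclidean distance is an assumed stand-in for matching degree."""
    matrix = []
    for feat in char_features:
        ranked = sorted(
            dictionary,
            key=lambda ch: sum((a - b) ** 2 for a, b in zip(feat, dictionary[ch])),
        )
        matrix.append(ranked[:n])
    return matrix


def adjust_with_lexicon(matrix, lexicon):
    """Return a second matrix in which the top candidates spell a lexicon
    word, if some combination of candidates does. A word list is an assumed,
    very simplified substitute for lexical analysis with a language model.
    The search is exhaustive (N**M combinations), so it only suits toy sizes."""
    best = None
    for combo in product(*[range(len(row)) for row in matrix]):
        word = "".join(row[j] for row, j in zip(matrix, combo))
        if word in lexicon:
            cost = sum(combo)  # prefer combinations that keep high-ranked candidates
            if best is None or cost < best[0]:
                best = (cost, combo)
    if best is None:
        return [list(row) for row in matrix]  # nothing matched: leave ranking as is
    # Move the chosen candidate to the front of each row, keep the rest in order.
    return [[row[j]] + row[:j] + row[j + 1:] for row, j in zip(matrix, best[1])]


# Toy dictionary of character -> feature vector (illustrative values only).
toy_dictionary = {
    "c": (0.0, 0.0), "e": (0.1, 0.0),
    "a": (1.0, 1.0), "o": (1.1, 1.0),
    "t": (3.0, 3.0), "f": (3.1, 3.0),
}
# Features "extracted" from three character images (M = 3).
toy_features = [(0.06, 0.0), (1.06, 1.0), (3.0, 3.0)]

first = build_index_matrix(toy_features, toy_dictionary, n=2)
second = adjust_with_lexicon(first, lexicon={"cat", "oat"})
print("".join(row[0] for row in first))   # eot
print("".join(row[0] for row in second))  # cat
```

In this toy run the raw top candidates spell "eot", and the lexicon step promotes lower-ranked candidates so that the string used for searching becomes "cat". A real system would replace the exhaustive search and the word list with a proper language model.
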

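Bibliographic data

  • Publication number: JP2009026289A

    Patent type

  • Publication date: 2009-02-05

    Original format: PDF

  • Applicant/patentee: SHARP CORP

    Application number: JP20070246159

  • Filing date: 2007-09-21

  • Classification: G06K9/62; G06F17/30

  • Country: JP

  • Database entry time: 2022-08-21 19:40:06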
