System and methods for Arabic text recognition based on effective Arabic text feature extraction

Abstract

A method for automatically recognizing Arabic text includes building an Arabic corpus comprising Arabic text files written in different writing styles and ground truths corresponding to each of the Arabic text files, storing writing-style indices in association with the Arabic text files, digitizing an Arabic word to form an array of pixels, dividing the Arabic word into line images, forming a text feature vector from the line images, training a Hidden Markov Model using the Arabic text files and ground truths in the Arabic corpus in accordance with the writing-style indices, and feeding the text feature vector into the trained Hidden Markov Model to recognize the Arabic word.
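
The recognition steps named in the abstract (pixel array, line images, feature vector, Hidden Markov Model) can be illustrated with a minimal sketch. It is not the patented method. Everything specific in it is an assumption made for illustration: each "line image" is taken to be a one-pixel-wide vertical strip, the two features per strip (ink count and vertical transitions) are hypothetical, the HMM parameters in `MODELS` are hand-set placeholders rather than values trained on a corpus, and writing-style indices are not modeled.

```python
def line_images(pixels, width=1):
    """Split a binary pixel array (list of rows) into vertical strips."""
    n_cols = len(pixels[0])
    return [[row[c:c + width] for row in pixels]
            for c in range(0, n_cols, width)]


def strip_features(strip):
    """Return (ink-pixel count, 0/1 transitions down the strip's first column)."""
    ink = sum(sum(row) for row in strip)
    column = [row[0] for row in strip]
    transitions = sum(1 for a, b in zip(column, column[1:]) if a != b)
    return ink, transitions


def feature_vector(pixels, width=1):
    """Concatenate per-strip features, left to right, into one sequence."""
    return [strip_features(s) for s in line_images(pixels, width)]


def to_symbols(features):
    """Quantize each feature tuple to a discrete symbol: 0 = blank, 1 = ink."""
    return [0 if ink == 0 else 1 for ink, _ in features]


def forward_likelihood(obs, model):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    start, trans, emit = model["start"], model["trans"], model["emit"]
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)


def recognize(pixels, models):
    """Pick the model name whose HMM assigns the observations the highest likelihood."""
    obs = to_symbols(feature_vector(pixels))
    return max(models, key=lambda name: forward_likelihood(obs, models[name]))


# Toy 3x4 binary image: a short stroke with blank columns on both sides.
PIXELS = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]

# Hypothetical, hand-set models; a real system would estimate these from the corpus.
MODELS = {
    "stroke": {"start": [1.0, 0.0],
               "trans": [[0.5, 0.5], [0.5, 0.5]],
               "emit": [[0.9, 0.1], [0.1, 0.9]]},
    "blank": {"start": [1.0],
              "trans": [[1.0]],
              "emit": [[0.9, 0.1]]},
}

best = recognize(PIXELS, MODELS)
```

Scoring each candidate model with the forward algorithm and keeping the most likely one is only one common way to use HMMs for recognition; the abstract does not say how the trained model is decoded.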
