
System and methods for Arabic text recognition and Arabic corpus building



Abstract

A method for automatically recognizing Arabic text includes building an Arabic dataset comprising Arabic text files written in different writing styles and actual meanings of the Arabic text corresponding to each of the Arabic text files (100), storing writing-style indices in association with the Arabic text files (105), digitizing a line of Arabic characters to form an array of pixels (321-323) (130), dividing the line of the Arabic characters into line images (311-313) (120), forming a text feature vector from the line images (311-313) (140), training a Hidden Markov Model using the Arabic text files and ground truths in the Arabic dataset in accordance with the writing-style indices (160), and feeding the text feature vector into a Hidden Markov Model to recognize the line of Arabic characters (170).
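
The recognition steps in the abstract (pixel array, line images, feature vector, Hidden Markov Model) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the fixed-width vertical strips, the single ink-density feature per strip, the three-level quantization, the two toy states `stroke` and `gap`, and all hand-set probabilities. A real system would learn the model parameters from the dataset and would most likely use character-level models; training is not shown.

```python
from typing import Dict, List, Sequence

Image = List[List[int]]  # rows of binary pixels, 1 = ink, 0 = background


def split_into_strips(line: Image, width: int) -> List[Image]:
    """Cut a line image into vertical strips, ordered right to left."""
    n_cols = len(line[0])
    strips = []
    # Arabic is written right to left, so start at the right edge.
    for right in range(n_cols, 0, -width):
        left = max(0, right - width)
        strips.append([row[left:right] for row in line])
    return strips


def ink_density(strip: Image) -> float:
    """Fraction of the strip's pixels that are ink."""
    total = sum(len(row) for row in strip)
    return sum(sum(row) for row in strip) / total


def quantize(density: float, levels: int = 3) -> int:
    """Map a density in [0, 1] to a discrete symbol in 0..levels-1."""
    return min(int(density * levels), levels - 1)


def viterbi(
    obs: Sequence[int],
    states: Sequence[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, List[float]],
) -> List[str]:
    """Return the most probable state sequence for a discrete HMM."""
    best = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back: List[Dict[str, str]] = [{}]
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            prev = max(states, key=lambda p: best[-1][p] * trans_p[p][s])
            scores[s] = best[-1][prev] * trans_p[prev][s] * emit_p[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy 2x6 line image; the ink sits at the right edge.
line = [
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
# Feature vector: one quantized ink-density symbol per strip.
symbols = [quantize(ink_density(s)) for s in split_into_strips(line, 2)]

# Hypothetical two-state model with hand-set (untrained) parameters.
states = ["stroke", "gap"]
start_p = {"stroke": 0.5, "gap": 0.5}
trans_p = {
    "stroke": {"stroke": 0.7, "gap": 0.3},
    "gap": {"stroke": 0.3, "gap": 0.7},
}
emit_p = {"stroke": [0.1, 0.4, 0.5], "gap": [0.8, 0.15, 0.05]}

path = viterbi(symbols, states, start_p, trans_p, emit_p)
print(symbols, path)
```

The strips are read from the right edge because Arabic is written right to left, so the first symbol in the feature vector corresponds to the start of the text. Plain probabilities are multiplied here for readability; for lines of realistic length, log-probabilities would be needed to avoid numerical underflow.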


Bibliographic data

Publication number: EP2804131A3

Patent type:
Publication date: 2016-09-07

Original format: PDF
Applicant/patentee: KING ABDULAZIZ CITY FOR SCIENCE AND TECHNOLOGY

Application number: EP20130184319
Inventors: KHORSHEED MOHAMMAD S.; AL-OMARI HUSSEIN K.; OSFOOR MAJED IBRAHIM BIN; ALOBAID ABDULAZIZ OBAID; ALFALEH HUSSAM ABDULRAHMAN; ASFOUR ARWA IBRAHEM BIN

Filing date: 2013-09-13
Classification: G06K9/68; G06K9/62
Country: EP
Database entry time: 2022-08-21 14:51:53
