A novel framework for automatic sorting of postal documents with multi-script address blocks

Basu S.; Das N.; Sarkar R.; Kundu M.; Nasipuri M.; Kumar Basu D.

Abstract

Recognition of numeric postal codes in a multi-script environment is a classical problem in any postal automation system. In such postal documents, determining the script of the handwritten postal code is crucial for subsequently invoking the digit recognizer for the respective script. The current framework attempts to infer the script of the numeric postal code without any bias from the script of the textual part of the address block, since the two may differ in a multi-script environment. The scope of the current work is to recognize postal codes written in any of four popular scripts, viz., Latin, Devanagari, Bangla and Urdu. For this purpose, we first implement a Hough transform based technique to localize the postal-code blocks in structured postal documents with a defined address block region. Isolated handwritten digit patterns are then extracted from the localized postal-code region. In the next stage of the framework, similarly shaped digit patterns of the four scripts are grouped into 25 clusters. A script-independent unified pattern classifier is then designed to classify the digit patterns of the numeric postal code into these 25 clusters. Based on these classification decisions, a rule-based script inference engine infers the script of the numeric postal code. One of the four script-specific classifiers is subsequently invoked to recognize the digit patterns of the corresponding script. A novel quad-tree based image partitioning technique is also developed in this work for effective feature extraction from the digit patterns. The support vector machine (SVM) based 25-class unified pattern classifier achieves an average recognition accuracy of 92.03% over ten-fold cross-validation. With randomly selected six-digit numeric strings of the four different scripts, an average script inference accuracy of 96.72% is achieved. The average ten-fold cross-validation recognition accuracies of the individual SVM classifiers for Latin, Devanagari, Bangla and Urdu numerals are 95.55%, 95.63%, 97.15% and 96.20%, respectively.
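
The script inference stage described in the abstract can be illustrated with a small sketch. The abstract does not give the actual 25-cluster grouping or the rules of the inference engine, so everything below is an assumption: the `CLUSTER_SCRIPTS` table is a hypothetical toy mapping, `infer_script` is an illustrative name, and plain majority voting stands in for the paper's rule base. The idea shown is only that each digit's cluster label narrows the set of candidate scripts, and the labels of all digits in the code are combined to pick one script.

```python
from collections import Counter

# Hypothetical cluster table (NOT the paper's grouping): each cluster id maps
# to the scripts whose digit shapes fall into that cluster. Clusters shared by
# several scripts model similarly shaped digits.
CLUSTER_SCRIPTS = {
    0: {"Latin", "Devanagari", "Bangla"},  # a shape shared by three scripts
    1: {"Latin"},
    2: {"Devanagari"},
    3: {"Bangla"},
    4: {"Urdu"},
    5: {"Devanagari", "Bangla"},
}


def infer_script(cluster_ids):
    """Return the script supported by the most digits, or None on a tie.

    Majority voting is an assumed stand-in for the rule-based engine.
    """
    votes = Counter()
    for cid in cluster_ids:
        # Every script that contains this cluster's shape receives one vote.
        for script in CLUSTER_SCRIPTS[cid]:
            votes[script] += 1
    if not votes:
        return None
    best = max(votes.values())
    winners = [script for script, count in votes.items() if count == best]
    # Refuse to guess when several scripts are equally supported.
    return winners[0] if len(winners) == 1 else None


# Cluster labels predicted for the six digits of one postal code.
print(infer_script([0, 3, 5, 3, 0, 5]))
```

In a full pipeline, the returned script name would select which of the four script-specific digit classifiers to invoke; a `None` result would need a fallback, which the abstract does not describe.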


Bibliographic details

Source
Pattern Recognition: The Journal of the Pattern Recognition Society | 2010, Issue 10 | 15 pages
Authors
Basu S.; Das N.; Sarkar R.; Kundu M.; Nasipuri M.; Kumar Basu D.
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Automatic mail sorting; Handwritten numeral recognition; Multi-script postal address block; Quad-tree based feature extraction; Script identification for numerals; Support vector machine


Similar literature

1. A novel framework for automatic sorting of postal documents with multi-script address blocks [J]. Basu S., Das N., Sarkar R. Pattern Recognition: The Journal of the Pattern Recognition Society. 2010, Issue 10
2. Automatic Letter Sorting for Indian Postal Address Recognition System Based on Pin Codes [J]. Velu C.M., Periyasamy Vivekanandan. Computer Sciences and Telecommunications. 2010, Issue 2
3. Automatic letter sorting for Indian Postal Address Recognition System based on PIN codes [J]. C. M. Velu. Journal of Internet and Information Systems. 2010, Issue 1
4. Recognition of Numeric Postal Codes from Multi-script Postal Address Blocks [C]. Subhadip Basu, Nibaran Das, Ram Sarkar. Pattern recognition and machine intelligence. 2009
5. A multimodal fusion approach for automatic postal address recognition system using Optical Character Recognition (OCR) and Automatic Speech Recognition (ASR) techniques [D]. Singh, Amriteshwar. 2011
6. Patient reporting of complications after surgery: what impact does documenting postoperative problems from the perspective of the patient using telephone interview and postal questionnaires have on the identification of complications after surgery? [O]. John Woodfield, Priya Deo, Ann Davidson. 2019
7. Automatic Chinese Postal Address Block Location Using Component-wise Proximity Descriptors and Random Forests [O]. Dong, X, Dong, J, Zhou, Huiyu. 2017