Journal: Journal of Applied Mathematics

A Novel Two-Stage Spectrum-Based Approach for Dimensionality Reduction: A Case Study on the Recognition of Handwritten Numerals


Abstract

Dimensionality reduction (feature selection) is an important step in pattern recognition systems. Although several conventional approaches exist, such as Principal Component Analysis, Random Projection, and Linear Discriminant Analysis, selecting optimal, effective, and robust features is usually a difficult task. This paper proposes a new two-stage approach for dimensionality reduction. The method is based on one-dimensional and two-dimensional spectrum diagrams of the standard deviation and minimum-to-maximum distributions of the initial feature vector elements. The proposed algorithm is validated in an OCR application using two large standard benchmark handwritten OCR datasets, MNIST and Hoda. Initially, a 133-element feature vector was assembled from the features most commonly used in the literature. The size of the initial feature vector was then reduced to 59.40% (79 elements) for the MNIST dataset and to 43.61% (58 elements) for the Hoda dataset. At the same time, the accuracy of the OCR systems improved by 2.95% for the MNIST dataset and by 4.71% for the Hoda dataset. The results show an improvement in the precision of the system compared with the rival approaches, Principal Component Analysis and Random Projection. The proposed technique can also be useful for generating decision rules in pattern recognition systems that use rule-based classifiers.
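
The abstract names per-feature standard deviation and minimum-to-maximum distributions as the raw material for the spectrum diagrams, but it does not give the selection rule itself. The sketch below is therefore only an illustration of how those statistics could be computed per class with NumPy, and of the arithmetic behind the reported reduction percentages. The function names `per_class_feature_stats` and `retained_percent` are hypothetical, and the data is synthetic rather than MNIST or Hoda.

```python
import numpy as np


def per_class_feature_stats(X, y):
    """Return per-class std, min, and max of every feature.

    X: (n_samples, n_features) array; y: (n_samples,) integer class labels.
    Each returned statistic has shape (n_classes, n_features).
    """
    classes = np.unique(y)
    std = np.stack([X[y == c].std(axis=0) for c in classes])
    lo = np.stack([X[y == c].min(axis=0) for c in classes])
    hi = np.stack([X[y == c].max(axis=0) for c in classes])
    return classes, std, lo, hi


def retained_percent(kept, total):
    """Size of the reduced feature vector as a percentage of the original."""
    return round(100.0 * kept / total, 2)


# Synthetic stand-in data (an assumption, NOT MNIST or Hoda):
# 200 samples, 133 features, 10 classes with 20 samples each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 133))
y = np.arange(200) % 10

classes, std, lo, hi = per_class_feature_stats(X, y)
print(std.shape, lo.shape, hi.shape)

# Reduction figures quoted in the abstract: 79 and 58 of 133 elements.
print(retained_percent(79, 133), retained_percent(58, 133))
```

Plotting each row of `std`, or the band between `lo` and `hi`, against the feature index would give one plausible reading of a one-dimensional "spectrum"; how the paper turns such diagrams into a two-stage selection is not stated in the abstract and is not reproduced here.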
