To address the noise introduced when the synthetic minority over-sampling technique (SMOTE) synthesizes new minority-class samples, this paper proposes SMOTE-SDAE, an imbalanced-data classification algorithm based on an improved stacked denoising autoencoder neural network. The algorithm first uses SMOTE to synthesize new minority-class samples and balance the original data set. Because the synthesis process introduces noise, it then applies the layer-by-layer unsupervised denoising learning and the supervised fine-tuning of the denoising autoencoder network to denoise and classify the oversampled data set. Experimental results on UCI imbalanced data sets indicate that, compared with the traditional SVM algorithm, SMOTE-SDAE significantly improves classification accuracy for the minority class.
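The oversampling step summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `smote` and `corrupt`, the parameters `k`, `rate`, and `seed`, and the choice of masking noise are assumptions made for illustration. The sketch shows only the basic SMOTE interpolation idea and the kind of input corruption a denoising autoencoder is usually trained on. The stacked network, its layer-wise pretraining, and the supervised fine-tuning are omitted.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Basic SMOTE idea: create n_new synthetic points, each interpolated
    between a minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

def corrupt(x, rate, rng):
    """Masking noise (an assumed choice): zero each feature with probability
    `rate`. A denoising autoencoder is trained to reconstruct the clean x
    from this corrupted copy."""
    return [0.0 if rng.random() < rate else v for v in x]

# Hypothetical toy data: four minority samples on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority, n_new=6, k=2)
noisy = [corrupt(s, rate=0.3, rng=random.Random(1)) for s in new_samples]
```

Because each synthetic point lies on a segment between two real minority samples, it can fall in a region where the classes overlap, which is the kind of noise the abstract attributes to SMOTE.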