
Improved Multi-class AdaBoost Algorithm Based on Stagewise Additive Modeling Using a Multi-class Exponential Loss Function

Abstract

Stagewise Additive Modeling using a Multi-class Exponential loss function (SAMME) is a multi-class AdaBoost algorithm. To further improve the performance of SAMME, the influence of using the weighted error rate and the pseudo-loss on the algorithm was studied, and on this basis a dynamically weighted AdaBoost algorithm, SAMME with Resampling and Dynamic weighting (SAMME.RD), was proposed based on each base classifier's classification of a sample's effective neighborhood. First, it is determined whether to use the weighted probability and the pseudo-loss. Then, the effective neighborhood of the test sample within the training set is found. Finally, the weighting coefficient of each base classifier is determined from its classification results on that effective neighborhood. Experiments on UCI datasets show that computing the base-classifier weighting coefficient from the true error rate works better; that selecting base classifiers by the true probability performs better when a dataset has few classes and a balanced distribution; and that selecting base classifiers by the weighted probability performs better when a dataset has many classes and an imbalanced distribution. The proposed SAMME.RD algorithm effectively improves the classification accuracy of multi-class AdaBoost.
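The two ingredients the abstract describes can be sketched in Python: the standard SAMME base-classifier weight, and a per-sample variant that re-weights a base classifier by its error on the k nearest training samples (the "effective neighborhood"). This is an illustrative sketch only, assuming a Euclidean-distance neighborhood and a clipping constant; the function names and interfaces are not the authors' code.

```python
import numpy as np

def samme_alpha(err, n_classes):
    """SAMME weight for a base classifier with weighted error rate `err`:
    alpha = log((1 - err) / err) + log(K - 1), where K is the class count."""
    err = np.clip(err, 1e-10, 1.0 - 1e-10)  # avoid division by zero / log(0)
    return np.log((1.0 - err) / err) + np.log(n_classes - 1)

def dynamic_alpha(x_test, X_train, y_train, train_preds, k, n_classes):
    """Sketch of the dynamic-weighting idea: score a base classifier by its
    error on the k training samples nearest to `x_test` (its "effective
    neighborhood"), rather than by its global training error."""
    dist = np.linalg.norm(X_train - x_test, axis=1)   # Euclidean distances
    neigh = np.argsort(dist)[:k]                      # indices of k nearest
    err = np.mean(train_preds[neigh] != y_train[neigh])
    return samme_alpha(err, n_classes)
```

For example, a base classifier with weighted error 0.25 on a 3-class problem receives weight log(3) + log(2) = log(6); a classifier that is perfect on a test sample's neighborhood receives a large per-sample weight regardless of its global error.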

