
Bagging

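Between 2003 and 2023, 227 documents related to Bagging were published, concentrated mainly in automation technology, computer technology, economic planning and management, radio electronics, and telecommunications technology. They comprise 136 journal articles, 2 conference papers, and 89 patent documents. The journal articles appeared in 105 journals, including Science Technology and Engineering (科学技术与工程), Computer Engineering (计算机工程), and Computer Engineering & Science (计算机工程与科学). The conference papers come from 2 conferences: the 25th National Database Conference of China (NDBC2008) and the 15th National Conference on Computer-Aided Design and Computer Graphics. The Bagging literature was contributed by 678 authors, including 玛丽亚·卡特里纳·图尔科, 杨新武, and 王宇飞.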

Bagging: publication counts

  • Journal articles: 136 (59.91%)
  • Conference papers: 2 (0.88%)
  • Patent documents: 89 (39.21%)

Total: 227 documents

Bagging: publication trend chart

Bagging: researchers

  • 玛丽亚·卡特里纳·图尔科
  • 杨新武
  • 王宇飞
  • 翟飞
  • 亚历山德拉·罗萨蒂
  • 刘汉明
  • 刘赵发
  • 周钢
  • 毛力
  • 温琴佐·德劳伦兹

    • Yang Yang; Pengfei Zheng; Fanru Zeng; Peng Xin; Guoxi He; Kexi Liao
    • Abstract: Accurate prediction of the internal corrosion rates of oil and gas pipelines could be an effective way to prevent pipeline leaks. In this study, a framework for predicting corrosion rates from a small sample of laboratory metal corrosion data was developed, offering a new perspective on pipeline corrosion prediction when real samples are insufficient. The approach uses the bagging algorithm to construct a strong learner by integrating several KNN learners. A total of 99 data points were collected and split into training and test sets at a 9:1 ratio. The training set was used to select the best hyperparameters by 10-fold cross-validation and grid search, and the test set was used to assess the performance of the model. The results showed that the Mean Absolute Error (MAE) of this framework is 28.06% of that of the traditional model and that the framework outperforms other ensemble methods. Therefore, the proposed framework is suitable for metal corrosion prediction under small-sample conditions.
    • Quang-Hieu Tran; Hoang Nguyen; Xuan-Nam Bui
    • Abstract: This study predicted blast-induced ground vibration (PPV) in open-pit mines using bagging and sibling techniques in combination with machine learning algorithms. Four machine learning algorithms, namely support vector regression (SVR), extra trees (ExTree), K-nearest neighbors (KNN), and decision tree regression (DTR), were used as base models for the combination and for initial PPV prediction. The bagging regressor (BA) was then applied to these base models to reduce variance, limit overfitting, and produce more robust predictive models, abbreviated as BA-ExTree, BA-KNN, BA-SVR, and BA-DTR. The ExTree model has not previously been considered for predicting blast-induced ground vibration, and bagging ExTree is an innovation that also aims to improve the accuracy of the ExTree model itself. In addition, two empirical models (USBM and Ambraseys) were compared with the bagging models to give a comprehensive assessment. To this end, 300 blasting events with different parameters were collected at the Sin Quyen copper mine (Vietnam), and the resulting PPV values were measured. These were compiled into the dataset used to develop the PPV predictive models. The results revealed that the bagging models performed better than the empirical models, except for the BA-DTR model. Of these, BA-ExTree is the best model, with the highest accuracy (88.8%), whereas the empirical models only reached accuracies of 73.6% to 76%. Details of the comparisons and assessments are also presented in the study.
    • 郑金萍; 刘赵发; 胡珍珍; 李泽南; 黎姿; 刘汉明; 汪廷华; 胡声洲
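    • Abstract: Random forest is an ensemble of decision trees built with the Bagging combination method and is widely used in data classification and prediction. Bagging is a representative combination method in machine learning, but it still has shortcomings for practical big-data mining. mBagging is an improvement on the Bagging combination method, with higher statistical power, a lower false-positive rate, and faster computation. Experiments on simulated genome-wide SNP datasets show that the mBagging-based random forest runs markedly faster than the traditional random forest and improves the accuracy of identifying risk SNPs without degrading the out-of-bag (OOB) error rate.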
    • 司晶硕
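    • Abstract: In traditional claim amount prediction, the generalized linear model (GLM) is a commonly used method. In recent years, machine learning algorithms have also achieved good results in this field, providing new options for claim amount prediction. In the era of big data, how to predict more accurately is a pressing problem. To address it, this paper uses a two-layer Stacking model, two other ensemble learning algorithms, and a generalized linear model to predict cumulative claim amounts. Comparing the root mean square error and the squared absolute error of each algorithm shows that all of the ensemble algorithms, including Stacking, are more accurate than the traditional generalized linear model. Finally, the paper uses cumulative claim amounts to establish transition rules for a bonus-malus system; combining these rules with ensemble learning allows new insurance products to be developed on a sounder basis.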
    • 赵存秀
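    • Abstract: This paper uses two ensemble combination strategies, voting and stacking, which combine existing models (a linear classifier, a support vector machine, and a classification and regression tree) to handle imbalanced data and predict whether a breast tumor is benign or malignant. The experiments show that both the voting and the stacking ensemble methods of random forest identify benign and malignant tumors with high accuracy, reaching 97.41% and 97.33%. The kappa values indicate strong agreement in this study.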
    • Lejun Gong; Shehai Zhou; Jingmei Chen; Yongmin Li; Li Zhang; Zhihong Gao
    • Abstract: Long non-coding RNAs (lncRNAs) play an important role in many life activities, such as epigenetic material regulation, cell cycle regulation, dosage compensation, and cell differentiation regulation, and are associated with many human diseases. Traditional biological experimental methods have many limitations in identifying and annotating lncRNAs. With the development of high-throughput sequencing technology, identifying lncRNAs from massive RNA sequence data using machine learning is of great practical significance. Based on the Bagging method and the Decision Tree algorithm in ensemble learning, this paper proposes a method of lncRNA gene sequence identification called BDLR. Its identification results are compared with those of several models, including Bayes, Support Vector Machine, Logistic Regression, Decision Tree, and Random Forest. The experimental results show that BDLR achieves an accuracy of 86.61% on the human test set and 90.34% on the mouse test set, higher than the other methods. The proposed method also offers a reference for researchers identifying lncRNAs with ensemble learning.
    • Moheb R. Girgis; Rofida M. Gamal; Enas Elgeldawi
    • Abstract: Protein structure prediction is one of the most essential objectives in theoretical chemistry and bioinformatics, as it is of vital importance in medicine, biotechnology, and other fields. Protein secondary structure prediction (PSSP) plays a significant role in predicting protein tertiary structure, as it bridges the gap between protein primary sequences and tertiary structure prediction. Protein secondary structures are classified into two categories: the 3-state category and the 8-state category. Predicting the 3 states and the 8 states of secondary structure from protein sequences are called the Q3 and Q8 prediction problems, respectively. The 8 classes reveal more precise structural information for a variety of applications than the 3 classes; however, Q8 prediction has been found to be very challenging, which is why all previous work in PSSP has focused on Q3 prediction. In this paper, we develop an ensemble Machine Learning (ML) approach for Q8 PSSP to explore the performance of ensemble learning algorithms compared with that of individual ML algorithms. The ensemble members are well-known classifiers, namely SVM (Support Vector Machines), KNN (K-Nearest Neighbor), DT (Decision Tree), RF (Random Forest), and NB (Naïve Bayes), with two feature extraction techniques, namely LDA (Linear Discriminant Analysis) and PCA (Principal Component Analysis). Experiments were conducted to evaluate the performance of single models and ensemble models, with PCA and LDA, in Q8 PSSP. The novelty of this paper lies in the introduction of ensemble learning to the Q8 PSSP problem. The experimental results confirmed that ensemble ML models are more accurate than individual ML models. They also indicated that features extracted by LDA are more effective than those extracted by PCA.
    • 杜嘉伟; 余粟
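    • Abstract: This paper proposes RFPG (Random Frequency Pattern Growth), an anomaly detection algorithm based on an improved fuzzy FP-Growth. The algorithm builds a two-layer FP-Tree. The first layer follows the bagging idea: random sampling generates sets from which long frequent itemsets are obtained. The second layer takes the long frequent itemsets as input to obtain a set of strong pattern association rules, and anomaly detection and classification are then performed through similarity computation. Experimental results show that the proposed algorithm achieves good overall anomaly detection efficiency and quality.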
    • Naglaa F. El Abady; Mohamed Taha; Hala H. Zayed
    • Abstract: Because of the widespread availability of low-cost printers and scanners, document forgery has become extremely common. Watermarks or signatures are used to protect important papers such as certificates, passports, and identification cards. Identifying the origins of printed documents is helpful for criminal investigations and for authenticating digital versions of a document. Source printer identification (SPI) has become increasingly popular for identifying fraud in printed documents. This paper proposes an algorithm for identifying the source printer and categorizing the questioned document into one of the printer classes. A dataset of 1200 papers from 20 distinct printers (13 laser and 7 inkjet) yielded significant identification results. The algorithm is based on global features such as the Histogram of Oriented Gradient (HOG) and local features such as Local Binary Pattern (LBP) descriptors. For classification, Decision Trees (DT), k-Nearest Neighbors (k-NN), Random Forests, bootstrap aggregating (bagging), adaptive boosting (boosting), Support Vector Machine (SVM), and mixtures of these classifiers were employed. The proposed algorithm can accurately classify the questioned documents into their appropriate printer classes. The adaptive boosting classifier attained 96% accuracy. Compared with four recently published algorithms that used the same dataset, the proposed algorithm gives better classification accuracy.
    • Nguyen Thanh Hoan; Nguyen Van Dung; Ho Le Thu; Hoa Thuy Quynh; Nadhir Al-Ansari; Tran Van Phong; Phan Trong Trinh; Dam Duc Nguyen; Hiep Van Le; Hanh Bich Thi Nguyen; Mahdis Amiri; Indra Prakash; Binh Thai Pham
    • Abstract: Water level predictions for rivers, lakes, and deltas play an important role in flood management. Every year the Mekong River delta of Vietnam experiences flooding due to heavy monsoon rains and high tides. Land subsidence may also aggravate flooding problems in this area. Accurate predictions of water levels in this region are therefore very important for forewarning the public and the authorities so that timely and adequate remedial measures can be taken to prevent loss of life and property. Many methods are available to predict water levels from historical data, but Machine Learning (ML) methods are now considered the best tool for accurate prediction. In this study, we used surface water level data from 18 water level measurement stations in the Mekong River delta from 2000 to 2018 to build novel time-series Bagging-based hybrid ML models, namely Bagging (RF), Bagging (SOM), and Bagging (M5P), to predict historical water levels in the study area. The performance of the Bagging-based hybrid models was compared with Reduced Error Pruning Trees (REPT), a benchmark ML model. The 19 years of data were divided in a 70:30 ratio for modeling: data from 1/2000 to 5/2013 (about 70% of the total) were used for training, and data from 5/2013 to 12/2018 (about 30% of the total) were used for testing (validating) the models. Model performance was evaluated using standard statistical measures: Coefficient of Determination (R²), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). The results show that all the developed models perform well (R² > 0.9) in predicting water levels in the study area. However, the Bagging-based hybrid models are slightly better than the benchmark REPT model. Thus, these Bagging-based hybrid time-series models can be used for predicting water levels in the Mekong delta.
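
The evaluation workflow described in the last abstract (bootstrap-aggregated base learners, a chronological 70:30 split, and R², RMSE, and MAE as metrics) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' method: the series is synthetic, the two lag features are an arbitrary choice, and the base learner is a simple KNN regressor rather than the RF, SOM, or M5P models named in the abstract. All function names (`knn_predict`, `bagging_fit`, `bagging_predict`, `regression_metrics`) are hypothetical and use only the Python standard library.

```python
import math
import random


def knn_predict(train, x, k=3):
    """Average the targets of the k training samples closest to x."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def bagging_fit(train, n_estimators=25, seed=0):
    """Draw one bootstrap resample (same size, with replacement) per base learner."""
    rng = random.Random(seed)
    return [[rng.choice(train) for _ in train] for _ in range(n_estimators)]


def bagging_predict(bags, x, k=3):
    """Average the KNN predictions made from each bootstrap resample."""
    return sum(knn_predict(bag, x, k) for bag in bags) / len(bags)


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for paired true and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": 1 - ss_res / ss_tot, "rmse": math.sqrt(ss_res / n), "mae": mae}


# Synthetic monthly "water level" series (an assumption, not real station data).
noise = random.Random(42)
series = [2.0 + math.sin(2 * math.pi * t / 12) + noise.gauss(0, 0.1)
          for t in range(230)]

# Each sample pairs the two previous values (lag features) with the next value.
samples = [((series[t - 2], series[t - 1]), series[t])
           for t in range(2, len(series))]

# Chronological 70:30 split: no shuffling, so the test period follows training.
cut = int(0.7 * len(samples))
train, test = samples[:cut], samples[cut:]

bags = bagging_fit(train)
preds = [bagging_predict(bags, x) for x, _ in test]
metrics = regression_metrics([y for _, y in test], preds)
print(metrics)
```

Because KNN stores its training data rather than fitting parameters, "fitting" the ensemble here amounts to keeping one bootstrap resample per base learner; a model with a real training step would be trained once per resample instead. Keeping the split chronological matters for time series, since a shuffled split would let the model see values from the test period during training.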