Simultaneous Variable and Feature Group Selection in Heterogeneous Learning: Optimization and Applications.

Abstract

Advances in data collection technologies have made it cost-effective to obtain heterogeneous data from multiple data sources. Very often the data are high-dimensional, and feature selection is preferred in order to reduce noise, save computational cost, and learn interpretable models. Because heterogeneous data are multi-modal by nature, it is of interest to design efficient machine learning models that perform variable selection and feature group (data source) selection simultaneously (a.k.a. bi-level selection). In this thesis, I carry out research along this direction with a particular focus on designing efficient optimization algorithms. I start with a unified bi-level learning model that contains several existing feature selection models as special cases. The proposed model is then extended to handle block-wise missing data, one of the major challenges in the diagnosis of Alzheimer's Disease (AD). Moreover, I propose a novel interpretable sparse group feature selection model that greatly facilitates parameter tuning and model selection. Last but not least, I show that by solving the sparse group hard thresholding problem directly, the sparse group feature selection model can be further improved in terms of both algorithmic complexity and efficiency. Promising results are demonstrated in an extensive evaluation on multiple real-world data sets.
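
To make the idea of bi-level selection concrete, the following is a minimal sketch that is not taken from the thesis. It assumes the widely used sparse group lasso penalty, lam1 * ||x||_1 + lam2 * sum_g ||x_g||_2, with non-overlapping groups, and computes its proximal operator: an elementwise soft-thresholding step zeroes individual variables, and a group-wise shrinkage step zeroes entire feature groups. The function name `sparse_group_prox`, the parameters `lam1` and `lam2`, and the toy data are illustrative assumptions; the models and algorithms proposed in the thesis may differ.

```python
import numpy as np

def sparse_group_prox(v, groups, lam1, lam2):
    """Proximal operator of lam1*||x||_1 + lam2*sum_g ||x_g||_2.

    Assumes `groups` is a list of non-overlapping index lists.
    Illustrative sketch only, not the thesis's algorithm.
    """
    v = np.asarray(v, dtype=float)
    # Variable-level selection: elementwise soft-thresholding.
    u = np.sign(v) * np.maximum(np.abs(v) - lam1, 0.0)
    x = np.zeros_like(u)
    for idx in groups:
        idx = np.asarray(idx)
        norm = np.linalg.norm(u[idx])
        # Group-level selection: a group survives only if its norm
        # exceeds lam2; otherwise the whole group stays at zero.
        if norm > lam2:
            x[idx] = (1.0 - lam2 / norm) * u[idx]
    return x

# Toy example with two groups (two hypothetical data sources).
v = np.array([3.0, 0.2, -4.0, 0.3, -0.1, 0.4])
groups = [[0, 1, 2], [3, 4, 5]]
x = sparse_group_prox(v, groups, lam1=0.5, lam2=1.0)
print(x)
```

In this toy input the second group is removed entirely, while in the first group only the small middle coefficient is zeroed, which is the simultaneous group-level and variable-level sparsity the abstract refers to.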

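Bibliographic record

  • Author

    Xiang, Shuo.;

  • Author affiliation

    Arizona State University.;

  • Degree-granting institution: Arizona State University.;
  • Subject: Computer Science.
  • Degree: Ph.D.
  • Year: 2014
  • Pagination: 120 p.
  • Total pages: 120
  • Original format: PDF
  • Language of text: eng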