Conference on Empirical Methods in Natural Language Processing

Orthogonal Matching Pursuit for Text Classification

Abstract

In text classification, overfitting arises from the high dimensionality of the data, making regularization essential. Although classic regularizers provide sparsity, they fail to return highly accurate models. In contrast, state-of-the-art group-lasso regularizers provide better results at the expense of sparsity. In this paper, we apply a greedy variable selection algorithm, called Orthogonal Matching Pursuit (OMP), to the text classification task. We also extend standard Group OMP (GOMP) by introducing overlapping GOMP to handle overlapping groups of features. Empirical analysis verifies that both OMP and overlapping GOMP constitute powerful regularizers, able to produce effective and very sparse models. Code and data are available online.
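
To illustrate the greedy selection idea behind OMP, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: it assumes a squared-error loss on a dense feature matrix, and the function name `omp`, the feature budget `k`, and the tolerance `tol` are hypothetical choices made for this example. At each step it picks the feature most correlated with the current residual, then refits least squares on all features selected so far.

```python
import numpy as np

def omp(X, y, k, tol=1e-10):
    """Greedy OMP sketch (assumed squared loss): select at most k columns of X."""
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    active = np.zeros(n_features, dtype=bool)  # marks features already chosen
    selected = []                              # chosen feature indices, in order
    coef = np.zeros(n_features)
    residual = y.copy()
    for _ in range(min(k, n_features)):
        # Stop early once the selected features already explain y.
        if np.linalg.norm(residual) <= tol:
            break
        # Score each feature by its absolute correlation with the residual.
        corr = np.abs(X.T @ residual)
        corr[active] = -np.inf                 # never pick a feature twice
        j = int(np.argmax(corr))
        active[j] = True
        selected.append(j)
        # Refit least squares on all selected features (the "orthogonal" step).
        w, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        coef[:] = 0.0
        coef[selected] = w
        residual = y - X @ coef
    return coef, selected
```

The budget `k` directly bounds the number of nonzero weights, which is how a greedy selector of this kind yields very sparse models. A group variant would score and add whole groups of columns per step instead of single features.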
