Conference on Neural Information Processing Systems

Breaking the Glass Ceiling for Embedding-Based Classifiers for Large Output Spaces


Abstract

In extreme classification settings, embedding-based neural network models are currently not competitive with sparse linear and tree-based methods in terms of accuracy. Most prior works attribute this poor performance to the low-dimensional bottleneck in embedding-based methods. In this paper, we demonstrate that theoretically there is no limitation to using low-dimensional embedding-based methods, and provide experimental evidence that overfitting is the root cause of the poor performance of embedding-based methods. These findings motivate us to investigate novel data augmentation and regularization techniques to mitigate overfitting. To this end, we propose GLaS, a new regularizer for embedding-based neural network approaches. It is a natural generalization from the graph Laplacian and spread-out regularizers, and empirically it addresses the drawback of each regularizer alone when applied to the extreme classification setup. With the proposed techniques, we attain or improve upon the state-of-the-art on most widely tested public extreme classification datasets with hundreds of thousands of labels.
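
The abstract does not give the formula for GLaS, so the following is only a minimal sketch of how a regularizer could combine the two ideas it names. It assumes (this is not stated in the text above) that the penalty pushes the inner product of two label embeddings toward their frequency-normalized co-occurrence. Labels that often appear together are pulled close, in the spirit of a graph Laplacian term, and labels that never co-occur are pushed toward orthogonality, in the spirit of a spread-out term. The names `gram`, `cooccurrence`, and `glas_like_penalty` and the toy data are hypothetical.

```python
# Illustrative sketch only: the penalty's form is an assumption, not taken from the abstract.

def gram(V):
    """Inner products between all pairs of label embeddings (rows of V)."""
    return [[sum(a * b for a, b in zip(vi, vj)) for vj in V] for vi in V]


def cooccurrence(Y):
    """A[i][j] = number of training examples tagged with both label i and label j."""
    n_labels = len(Y[0])
    return [[sum(row[i] * row[j] for row in Y) for j in range(n_labels)]
            for i in range(n_labels)]


def glas_like_penalty(V, Y):
    """Mean squared gap between embedding similarity and normalized co-occurrence.

    Assumes every label occurs at least once in Y, so A[i][i] > 0.
    """
    A = cooccurrence(Y)
    G = gram(V)
    n = len(V)
    total = 0.0
    for i in range(n):
        for j in range(n):
            # Symmetric normalization by label frequency.
            target = 0.5 * (A[i][j] / A[j][j] + A[i][j] / A[i][i])
            total += (G[i][j] - target) ** 2
    return total / (n * n)


# Toy data: labels 0 and 1 always co-occur; label 2 appears alone.
Y = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]

# Embeddings that mirror the co-occurrence structure.
V_aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# Embeddings that place label 0 next to label 2 instead of label 1.
V_misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]

penalty_aligned = glas_like_penalty(V_aligned, Y)
penalty_misaligned = glas_like_penalty(V_misaligned, Y)
```

On this toy input the aligned embeddings incur a lower penalty than the misaligned ones. In training, a term of this kind would be added to the classification loss with a weighting coefficient. Plain lists are used to keep the sketch dependency-free; a practical version would use a tensor library and a sampled batch of labels rather than all pairs.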
