Asian Conference on Computer Vision (ACCV)

Semi-Supervised Learning on a Budget: Scaling Up to Large Datasets



Abstract

Internet data sources provide us with large image datasets that are mostly without any explicit labeling. This setting is ideal for semi-supervised learning, which seeks to exploit labeled data as well as a large pool of unlabeled data points to improve learning and classification. While considerable progress has been made on the theory and algorithms, there has been limited success in translating this progress to the large-scale datasets that inspired these methods in the first place. We investigate the computational complexity of popular graph-based semi-supervised learning algorithms, together with several possible speed-ups. Our findings lead to a new algorithm that scales to datasets up to 40 times larger than previous approaches could handle, while even increasing classification performance. Our method is based on the key insight that, by employing a density-based measure, unlabeled data points can be selected in a manner similar to an active learning scheme. This leads to a compact graph, resulting in performance improvements of up to 11.6% at reduced computational cost.
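The abstract's core idea (select a compact subset of unlabeled points via a density score, then run graph-based label propagation on the resulting smaller graph) can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the k-nearest-neighbor density score, the Gaussian-kernel graph, and all function names and parameters here are assumptions for demonstration.

```python
import numpy as np

def density_select(X_unlabeled, k=5, budget=50):
    """Score each unlabeled point by the mean distance to its k nearest
    neighbours (smaller = denser region) and keep the `budget` densest.
    This stands in for the paper's density-based selection criterion."""
    d = np.linalg.norm(X_unlabeled[:, None] - X_unlabeled[None, :], axis=2)
    knn = np.sort(d, axis=1)[:, 1:k + 1]   # column 0 is the self-distance 0
    density_score = knn.mean(axis=1)
    return np.argsort(density_score)[:budget]

def label_propagation(X, y_onehot, n_labeled, sigma=1.0, iters=200):
    """Plain Gaussian-kernel label propagation on the (reduced) point set.
    The first n_labeled rows of y_onehot are clamped; the rest start at
    zero and receive label mass through the row-normalized affinity graph."""
    d2 = ((X[:, None] - X[None, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0)
    P = W / W.sum(axis=1, keepdims=True)    # row-stochastic transition matrix
    F = np.zeros_like(y_onehot, dtype=float)
    F[:n_labeled] = y_onehot[:n_labeled]
    for _ in range(iters):
        F = P @ F
        F[:n_labeled] = y_onehot[:n_labeled]  # re-clamp the labeled points
    return F.argmax(axis=1)
```

Because propagation costs grow with the graph size (dense affinity matrices are quadratic in the number of nodes), selecting only dense, representative unlabeled points before building the graph is what allows scaling to much larger pools, which is the trade-off the abstract describes.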
