Asian Conference on Computer Vision (ACCV)

Semi-Supervised Learning on a Budget: Scaling Up to Large Datasets



Abstract

Internet data sources provide us with large image datasets that are mostly without any explicit labeling. This setting is ideal for semi-supervised learning, which seeks to exploit labeled data as well as a large pool of unlabeled data points to improve learning and classification. While considerable progress has been made on the theory and algorithms, there has been limited success in translating this progress to the large-scale datasets that inspired these methods in the first place. We investigate the computational complexity of popular graph-based semi-supervised learning algorithms, together with several possible speed-ups. Our findings lead to a new algorithm that scales to datasets up to 40 times larger than previous approaches could handle, while even increasing classification performance. Our method is based on the key insight that, by employing a density-based measure, unlabeled data points can be selected in a manner similar to an active learning scheme. This leads to a compact graph, resulting in performance improvements of up to 11.6% at reduced computational cost.
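The abstract's core idea (select a compact subset of unlabeled points via a density score, then run graph-based label propagation on the resulting smaller graph) can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the k-nearest-neighbor density score, the Gaussian-kernel graph, and all function names and parameters here are assumptions for demonstration.

```python
import numpy as np

def density_select(X_unlabeled, k=5, budget=50):
    """Score each unlabeled point by the mean distance to its k nearest
    neighbours (smaller = denser region) and keep the `budget` densest.
    This stands in for the paper's density-based selection criterion."""
    d = np.linalg.norm(X_unlabeled[:, None] - X_unlabeled[None, :], axis=2)
    knn = np.sort(d, axis=1)[:, 1:k + 1]   # column 0 is the self-distance 0
    density_score = knn.mean(axis=1)
    return np.argsort(density_score)[:budget]

def label_propagation(X, y_onehot, n_labeled, sigma=1.0, iters=200):
    """Plain Gaussian-kernel label propagation on the (reduced) point set.
    The first n_labeled rows of y_onehot are clamped; the rest start at
    zero and receive label mass through the row-normalized affinity graph."""
    d2 = ((X[:, None] - X[None, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0)
    P = W / W.sum(axis=1, keepdims=True)    # row-stochastic transition matrix
    F = np.zeros_like(y_onehot, dtype=float)
    F[:n_labeled] = y_onehot[:n_labeled]
    for _ in range(iters):
        F = P @ F
        F[:n_labeled] = y_onehot[:n_labeled]  # re-clamp the labeled points
    return F.argmax(axis=1)
```

Because propagation costs grow with the graph size (dense affinity matrices are quadratic in the number of nodes), selecting only dense, representative unlabeled points before building the graph is what allows scaling to much larger pools, which is the trade-off the abstract describes.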
