Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision

Abstract

State-of-the-art computer vision techniques based on supervised machine learning are in general data hungry. Curating their data poses the challenges of expensive human labeling, inadequate computing resources and longer experiment turnaround times. Training data subset selection and active learning techniques have been proposed as possible solutions to these challenges. A special class of subset selection functions naturally models notions of diversity, coverage and representation and can be used to eliminate redundancy, thus lending itself well to training data subset selection. These functions can also help improve the efficiency of active learning, further reducing human labeling effort, by selecting a subset of the examples obtained using conventional uncertainty-sampling-based techniques. In this work, we empirically demonstrate the effectiveness of two diversity models, namely the Facility-Location and Dispersion models, for training-data subset selection and reducing labeling effort. We demonstrate this for a variety of computer vision tasks including Gender Recognition, Face Recognition, Scene Recognition, Object Detection and Object Recognition. Our results show that diversity-based subset selection done in the right way can increase accuracy by up to 5-10% over existing baselines, particularly in settings in which less training data is available. This allows the training of complex machine learning models like Convolutional Neural Networks with much less training data and lower labeling costs while incurring minimal performance loss.
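
The abstract names the Facility-Location and Dispersion models but does not give their formulas. As a rough illustration only, the sketch below assumes the commonly used definitions: facility location scores a subset S by the sum, over all points, of each point's highest similarity to any member of S, and dispersion scores S by the smallest pairwise distance inside it. The function names, the Euclidean distance, the 1 / (1 + distance) similarity and the greedy selection loops are all assumptions made for this example and are not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def facility_location_greedy(points, k):
    """Greedily pick k indices to maximize sum_i max_{j in S} sim(i, j).

    Similarity is assumed to be 1 / (1 + distance); any non-negative
    similarity could be substituted.
    """
    n = len(points)
    sim = [[1.0 / (1.0 + euclidean(p, q)) for q in points] for p in points]
    best = [0.0] * n  # best similarity of each point to the selected set so far
    selected = []
    for _ in range(min(k, n)):
        # Marginal gain of adding candidate j to the current selection.
        gains = {
            j: sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            for j in range(n)
            if j not in selected
        }
        j_star = max(gains, key=gains.get)
        selected.append(j_star)
        best = [max(best[i], sim[i][j_star]) for i in range(n)]
    return selected


def dispersion_greedy(points, k, start=0):
    """Farthest-point-first heuristic for the min-pairwise-distance objective.

    Each step adds the point whose distance to the nearest already
    selected point is largest. The starting index is an arbitrary choice.
    """
    n = len(points)
    selected = [start]
    min_dist = [euclidean(p, points[start]) for p in points]
    while len(selected) < min(k, n):
        candidates = [j for j in range(n) if j not in selected]
        j_star = max(candidates, key=lambda j: min_dist[j])
        selected.append(j_star)
        min_dist = [
            min(min_dist[i], euclidean(points[i], points[j_star]))
            for i in range(n)
        ]
    return selected


# Toy data: two tight clusters of three points each (indices 0-2 and 3-5).
points = [(0, 0), (0.1, 0), (0, 0.1), (10, 10), (10.1, 10), (10, 10.1)]
fl_subset = facility_location_greedy(points, 2)
disp_subset = dispersion_greedy(points, 2)
```

On this toy data, both heuristics pick one point from each cluster rather than two near-duplicates from the same cluster, which is the redundancy-removal behavior the abstract describes. In an active learning setting, the same selection could be applied to the candidates returned by uncertainty sampling instead of the full pool.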

Bibliographic details

Source: IEEE Winter Conference on Applications of Computer Vision, 2019, pp. 1289-1299 (11 pages)
Authors: Vishal Kaushal; Rishabh Iyer; Suraj Kothawade; Rohan Mahadev; Khoshrav Doctor; Ganesh Ramakrishnan
Original format: PDF
Keywords: Task analysis; Training; Computational modeling; Computer vision; Training data; Labeling; Data models

Similar literature

1. Data Driven Feature Selection for Machine Learning Algorithms in Computer Vision [J]. Zhang Fan, Li Wei, Zhang Yifan. IEEE Internet of Things Journal, 2018, No. 6.
2. Feature Selection with Annealing for Computer Vision and Big Data Learning [J]. Adrian Barbu, Yiyuan She, Liangjing Ding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, No. 2.
3. Query-learning-based iterative feature-subset selection for learning from high-dimensional data sets [J]. Hiroshi Mamitsuka. Knowledge and Information Systems, 2006, No. 1.
4. Learning From Less Data: A Unified Data Subset Selection and Active Learning Framework for Computer Vision [C]. Vishal Kaushal, Rishabh Iyer, Suraj Kothawade. IEEE Winter Conference on Applications of Computer Vision, 2019.
5. Balance Optimization Subset Selection: A framework for causal inference with observational data [D]. Sauppe, Jason James. 2015.
6. OmiEmbed: A Unified Multi-Task Deep Learning Framework for Multi-Omics Data [O]. Xiaoyu Zhang, Yuting Xing, Kai Sun. 2021.
7. Feature Selection with Annealing for Computer Vision and Big Data Learning [O]. Barbu, Adrian; She, Yiyuan; Ding, Liangjing. 2016.
