Active learning with confidence-based answers for crowdsourcing labeling tasks

Song Jinhua; Wang Hao; Gao Yang; An Bo


Abstract

Collecting labels for data is important for many practical applications (e.g., data mining). However, this process can be expensive and time-consuming since it needs extensive efforts of domain experts. To decrease the cost, many recent works combine crowdsourcing, which outsources labeling tasks (usually in the form of questions) to a large group of non-expert workers, and active learning, which actively selects the best instances to be labeled, to acquire labeled datasets. However, for difficult tasks where workers are uncertain about their answers, asking for discrete labels might lead to poor performance due to the low-quality labels. In this paper, we design questions to get continuous worker responses which are more informative and contain workers' labels as well as their confidence. As crowd workers may make mistakes, multiple workers are hired to answer each question. Then, we propose a new aggregation method to integrate the responses. By considering workers' confidence information, the accuracy of integrated labels is improved. Furthermore, based on the new answers, we propose a novel active learning framework to iteratively select instances for "labeling". We define a score function for instance selection by combining the uncertainty derived from the classifier model and the uncertainty derived from the answer sets. The uncertainty derived from uncertain answers is more effective than that derived from labels. We also propose batch methods which select multiple instances at a time to further improve the efficiency of our approach. Experimental studies on both simulated and real data show that our methods are effective in increasing the labeling accuracy and achieve significantly better performance than existing methods.
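The abstract does not give the aggregation rule or the score function, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes a binary task in which each worker response is a number `r` in [0, 1] (above 0.5 leans positive, and the distance from 0.5 is read as confidence). The names `aggregate`, `binary_entropy`, `score`, `select_batch` and the mixing weight `alpha` are all hypothetical.

```python
import math


def aggregate(responses):
    """Confidence-weighted vote over continuous responses in [0, 1] (assumed encoding)."""
    weights = [abs(2.0 * r - 1.0) for r in responses]  # 0 = unsure, 1 = certain
    total = sum(weights)
    if total == 0.0:
        return 0.5  # no responses, or every worker was maximally unsure
    positive = sum(w for w, r in zip(weights, responses) if r > 0.5)
    return positive / total  # weighted share of positive votes


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 at p = 0 or p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def score(model_prob, responses, alpha=0.5):
    """Mix classifier uncertainty with answer-set uncertainty (alpha is an assumption)."""
    return alpha * binary_entropy(model_prob) + (1.0 - alpha) * binary_entropy(aggregate(responses))


def select_batch(candidates, k):
    """Return the k instance ids with the highest score.

    candidates maps an instance id to (model_prob, responses).
    """
    ranked = sorted(candidates, key=lambda i: score(*candidates[i]), reverse=True)
    return ranked[:k]
```

In this sketch an instance is prioritized for further labeling when either the classifier or the workers are unsure about it; a plain top-k cut stands in for the batch methods, whose details the abstract does not describe.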


Bibliographic details

Source
Knowledge-Based Systems | 2018, Issue 1 | pp. 244-258 | 15 pages
Authors
Song Jinhua; Wang Hao; Gao Yang; An Bo
Original format: PDF
Text language: English

Similar literature


1. Active cross-query learning: A reliable labeling mechanism via crowdsourcing for smart surveillance [J]. Computer Communications. 2020, Issue Feba
2. Agreeing to disagree: active learning with noisy labels without crowdsourcing [J]. Bouguelia Mohamed-Rafik, Nowaczyk Slawomir, Santosh K. C. International Journal of Machine Learning and Cybernetics. 2018, Issue 8
3. Active Learning and Crowdsourcing: A Survey of Optimization Methods for Data Labeling [J]. Gilyazev R. A., Turdakov D. Yu. Programming and Computer Software. 2018, Issue 6
4. Crowdsourcing Tasks in Open Query Answering [C]. Elena Simperl, Barry Norton, Denny Vrandecic. AAAI Symposium. 2012
5. Efficient Learning-Based Recommendation Algorithms for Top-N Tasks and Top-N Workers in Large-Scale Crowdsourcing Systems [D]. Safran, Mejdl. 2018
6. Confidence-based integrated reweighting model of task-difficulty explains location-based specificity in perceptual learning [O]. Bharath Chandra Talluri, Shao-Chin Hung, Aaron R. Seitz
7. Agreeing to disagree: active learning with noisy labels without crowdsourcing [O]. Bouguelia, Mohamed-Rafik, Nowaczyk, Sławomir, Santosh, K. C. 2017
