Conference on Empirical Methods in Natural Language Processing

SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup

Abstract

Active learning is an important technique for low-resource sequence labeling tasks. However, current active sequence labeling methods use the queried samples alone in each iteration, which is an inefficient way of leveraging human annotations. We propose a simple but effective data augmentation method to improve label efficiency of active sequence labeling. Our method, SeqMix, simply augments the queried samples by generating extra labeled sequences in each iteration. The key difficulty is to generate plausible sequences along with token-level labels. In SeqMix, we address this challenge by performing mixup for both sequences and token-level labels of the queried samples. Furthermore, we design a discriminator during sequence mixup, which judges whether the generated sequences are plausible or not. Our experiments on Named Entity Recognition and Event Detection tasks show that SeqMix can improve the standard active sequence labeling method by 2.27%-3.75% in terms of F1 scores.
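The abstract describes mixing both the sequences and their token-level labels, then filtering the results with a discriminator. The sketch below illustrates one way this could look, and it is not taken from the paper's code. It assumes the generic mixup form (a convex combination with weight `lam`), two queried sequences of equal length, a toy embedding table, and a nearest-embedding lookup to turn each mixed vector back into a token. The names `sequence_mixup`, `nearest_token`, and `is_plausible` are hypothetical, and the discriminator is reduced to a threshold on a user-supplied score function.

```python
import random


def mix_vectors(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b of two equal-length vectors."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def nearest_token(vec, embedding_table):
    """Return the token whose embedding is closest (squared distance) to vec.

    Assumption: mapping a mixed embedding back to a discrete token this way
    is an illustrative choice, not something the abstract specifies.
    """
    return min(
        embedding_table,
        key=lambda tok: sum((v - e) ** 2 for v, e in zip(vec, embedding_table[tok])),
    )


def sequence_mixup(tokens_a, labels_a, tokens_b, labels_b, embedding_table, lam):
    """Mix two equal-length labeled sequences position by position.

    labels_a and labels_b hold one label vector (one-hot or soft) per token.
    Returns (mixed_tokens, mixed_soft_labels).
    """
    if not (len(tokens_a) == len(labels_a) == len(tokens_b) == len(labels_b)):
        raise ValueError("token and label lists must all have the same length")
    mixed_tokens, mixed_labels = [], []
    for tok_a, lab_a, tok_b, lab_b in zip(tokens_a, labels_a, tokens_b, labels_b):
        mixed_emb = mix_vectors(embedding_table[tok_a], embedding_table[tok_b], lam)
        mixed_tokens.append(nearest_token(mixed_emb, embedding_table))
        mixed_labels.append(mix_vectors(lab_a, lab_b, lam))
    return mixed_tokens, mixed_labels


def is_plausible(tokens, score_fn, threshold):
    """Hypothetical discriminator: keep a sequence if its score is low enough.

    score_fn stands in for any plausibility score where lower is better.
    """
    return score_fn(tokens) <= threshold


if __name__ == "__main__":
    # Toy 2-d embeddings and labels over the classes [O, LOC]; all values are made up.
    table = {
        "visited": [0.0, 1.0],
        "saw": [0.1, 0.9],
        "Paris": [1.0, 0.0],
        "London": [0.9, 0.1],
    }
    lam = random.betavariate(8.0, 8.0)  # mixing weight; the Beta parameters are an assumption
    tokens, labels = sequence_mixup(
        ["visited", "Paris"], [[1.0, 0.0], [0.0, 1.0]],
        ["saw", "London"], [[1.0, 0.0], [0.0, 1.0]],
        table, lam,
    )
    # Placeholder score function; a real discriminator would score fluency.
    if is_plausible(tokens, score_fn=lambda toks: 0.0, threshold=1.0):
        print(tokens, labels)
```

In an active learning loop, sequences that pass the plausibility check would be added to the labeled pool alongside the queried samples for that iteration. The mixed labels are soft, so the labeling model would need a loss that accepts label distributions rather than hard classes.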
