Journal of Computer Science & Technology
Scaling Conditional Random Fields by One-Against-the-Other Decomposition

Abstract

As a powerful sequence labeling model, conditional random fields (CRFs) have been applied successfully to many natural language processing (NLP) tasks. However, the high computational complexity of CRF training permits only a very small tag (or label) set, because training becomes intractable as the tag set grows. This paper proposes an improved decomposed training and joint decoding algorithm for CRF learning. Instead of training a single CRF model over all tags, it trains a binary sub-CRF independently for each tag. An optimal tag sequence is then produced by a joint decoding algorithm based on the probabilistic outputs of all the sub-CRFs involved. To test its effectiveness, we apply this approach to Chinese word segmentation (CWS) cast as a sequence labeling problem. Our evaluation shows that it can reduce the computational cost of this task by 40-50% without any significant performance loss on various large-scale data sets.
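The joint decoding step described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each binary sub-CRF emits, for every position, the probability that its tag fires, and runs a Viterbi pass over the per-tag log-probabilities with a transition mask restricting the path to legal BMES tag bigrams (the standard tag set for CWS). The function name `joint_decode` and the zero/minus-infinity transition scores are illustrative assumptions.

```python
import numpy as np

TAGS = ["B", "M", "E", "S"]  # Begin / Middle / End of word, Single-char word

# Legal BMES tag bigrams: a word opened by B must continue (M) or close (E);
# a completed word (E or S) must be followed by a new word (B or S).
LEGAL = {
    ("B", "M"), ("B", "E"), ("M", "M"), ("M", "E"),
    ("E", "B"), ("E", "S"), ("S", "B"), ("S", "S"),
}

def joint_decode(prob):
    """Viterbi over per-tag sub-CRF outputs.

    prob[t, k] = probability from sub-CRF k that tag TAGS[k] applies at
    position t.  Returns the highest-scoring legal tag sequence.
    """
    T, K = prob.shape
    logp = np.log(np.clip(prob, 1e-12, None))

    # Transition mask: 0 for a legal bigram, -inf for an illegal one.
    trans = np.full((K, K), -np.inf)
    for i, a in enumerate(TAGS):
        for j, b in enumerate(TAGS):
            if (a, b) in LEGAL:
                trans[i, j] = 0.0

    score = logp[0].copy()               # best score ending in each tag
    back = np.zeros((T, K), dtype=int)   # backpointers
    for t in range(1, T):
        cand = score[:, None] + trans    # cand[prev, cur]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logp[t]

    # Backtrack from the best final tag.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [TAGS[k] for k in reversed(path)]
```

For example, for a three-character input where the sub-CRFs strongly prefer B, E, S at positions 0, 1, 2 respectively, the decoder returns `["B", "E", "S"]`, i.e., a two-character word followed by a single-character word. A full joint decoder would also calibrate the independently trained sub-CRF probabilities before combining them; this sketch takes them at face value.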
