Voting for POS Tagging of Latin Texts: Using the Flair of FLAIR to Better Ensemble Classifiers by Example of Latin

Abstract

Despite the great importance of the Latin language in the past, relatively few resources are available today for developing modern NLP tools for it. To address this, the EvaLatin Shared Task for Lemmatization and Part-of-Speech (POS) tagging was organized at the LT4HALA workshop. In our work, we dealt with the second EvaLatin task, that is, POS tagging. Since most of the available Latin word embeddings were trained on either too little or inaccurate data, we first trained several embeddings on better data. Based on these embeddings, we trained several state-of-the-art taggers and used their outputs as input for an ensemble classifier called LSTMVoter. We achieved the best results for both the cross-genre and the cross-time task (90.64% and 87.00%) without using additional annotated data (closed modality). Since then, we have further improved the system and achieved even better results (96.91% on classical, 90.87% on cross-genre and 87.35% on cross-time).
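The ensemble step can be illustrated with a minimal sketch. Note that this is not LSTMVoter itself, which is a learned classifier; the sketch swaps in plain plurality voting as the simplest possible way of combining several taggers' outputs. The function name `majority_vote`, the tie-breaking rule, and the toy tag sequences are all assumptions made for illustration and do not come from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token tags from several taggers by plurality vote.

    predictions[i][j] is the tag that tagger i assigned to token j.
    Ties are broken in favour of the earliest tagger in the list
    (an arbitrary choice made for this sketch).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all taggers must label the same number of tokens")

    voted = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags)
        best = max(counts.values())
        # Walk the tags in tagger order so ties go to the earliest tagger.
        voted.append(next(t for t in token_tags if counts[t] == best))
    return voted


# Toy input: three hypothetical taggers label the four tokens of
# "Gallia est omnis divisa". The tags are illustrative, not gold labels.
tagger_outputs = [
    ["PROPN", "AUX", "DET", "VERB"],
    ["PROPN", "VERB", "DET", "VERB"],
    ["NOUN", "AUX", "ADJ", "VERB"],
]
combined = majority_vote(tagger_outputs)
print(combined)
```

A learned voter such as the one named in the abstract would replace the counting step with a trained model that takes the base taggers' predictions as features, which lets it learn which tagger to trust in which context rather than weighting all of them equally.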