Conference on Empirical Methods in Natural Language Processing

Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages

Abstract

We present SuperPivot, an analysis method for low-resource languages that occur in a superparallel corpus, i.e., a corpus that contains an order of magnitude more languages than the parallel corpora currently in use. We show that SuperPivot performs well for the crosslingual analysis of the linguistic phenomenon of tense. We produce analysis results for more than 1000 languages, conducting, to the best of our knowledge, the largest crosslingual computational study performed to date. We extend existing methodology for leveraging parallel corpora for typological analysis by overcoming a limiting assumption of earlier work: we only require that a linguistic feature be overtly marked in a few of the thousands of languages, as opposed to requiring that it be marked in all languages under investigation.
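
The abstract does not spell out the procedure, so the following is only a toy illustration of the general idea of pivoting through a language that marks a feature overtly, not the authors' implementation. It assumes a tiny verse-aligned corpus, a hypothetical pivot-language past marker `PST`, made-up target-language tokens, and the Dice coefficient as an association score (the scoring choice is an assumption, not taken from the paper).

```python
from collections import Counter

def rank_target_tokens(pivot_verses, target_verses, pivot_marker):
    """Rank target-language tokens by how well they co-occur with the
    verses in which the pivot language overtly shows `pivot_marker`.

    Both corpora map a shared verse id to a list of tokens.
    The score is the Dice coefficient between two verse sets
    (an illustrative choice, not the paper's statistic).
    """
    # Verses where the pivot language overtly marks the feature,
    # restricted to verses that also exist in the target language.
    marked = {
        vid for vid, toks in pivot_verses.items()
        if pivot_marker in toks and vid in target_verses
    }

    total = Counter()      # verses containing each target token
    in_marked = Counter()  # ...of which the pivot marks the feature
    for vid, toks in target_verses.items():
        for tok in set(toks):
            total[tok] += 1
            if vid in marked:
                in_marked[tok] += 1

    scores = {
        tok: 2 * in_marked[tok] / (n + len(marked))
        for tok, n in total.items()
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy data: the pivot language marks past tense with "PST".
pivot = {
    1: ["he", "PST", "go"],
    2: ["he", "go"],
    3: ["she", "PST", "eat"],
    4: ["she", "eat"],
}
# Hypothetical target language whose past marker is unknown in advance.
target = {
    1: ["li", "ta", "iri"],
    2: ["li", "iri"],
    3: ["shi", "ta", "mangi"],
    4: ["shi", "mangi"],
}

ranked = rank_target_tokens(pivot, target, "PST")
print(ranked[0])  # best candidate for the target-language past marker
```

In this toy corpus the token `ta` appears in exactly the verses the pivot marks as past, so it ranks first. Only the pivot language needs an overt marker here; the target language's marker is discovered through the alignment.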
