Genres in the Prague Discourse Treebank

Abstract

We present a project to classify the documents of the Prague Discourse Treebank (Czech journalistic texts) by genre. Our main aim is to make it possible to observe how text coherence is realized in different types of language data (types in the genre sense) and, in the future, to explore ways of using genres as a feature for multi-sentence-level language technologies. In the paper, we first describe the motivation for and the concept of the genre annotation, and briefly introduce the Prague Discourse Treebank. We then elaborate on the process of manual genre annotation in the treebank, from the annotators' manual work through post-annotation checks to the measurement of inter-annotator agreement. The annotated genres are subsequently analyzed together with the discourse relations already annotated in the treebank: we present the distribution of the annotated genres and the results of a study of how the distributions of discourse relations differ across the individual genres.