UD-Japanese BCCWJ: Universal Dependencies Annotation for the Balanced Corpus of Contemporary Written Japanese


Abstract

In this paper, we describe UD Japanese-BCCWJ, a corpus created by converting the Balanced Corpus of Contemporary Written Japanese (BCCWJ), a Japanese language corpus, to the Universal Dependencies (UD) annotation schema. The BCCWJ already provides dependency information at the level of the bunsetsu (a Japanese syntactic unit comparable to the phrase). We developed a program to convert the BCCWJ to UD based on this dependency structure, and the corpus is the result of fully automatic conversion using that program. UD Japanese-BCCWJ is the largest UD Japanese corpus and the second-largest of all UD corpora, comprising 1,980 documents, 57,109 sentences, and 1,273k words across six distinct domains.
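To make the idea of converting bunsetsu-level dependencies into word-level dependencies concrete, here is a minimal sketch in Python. It is not the authors' conversion program: the `Bunsetsu` class, the `bunsetsu_to_word_heads` function, the explicitly supplied `head_word` index, and the toy sentence are all assumptions made for illustration. The sketch attaches every non-head word to the head word of its own bunsetsu, and attaches each bunsetsu's head word to the head word of the governing bunsetsu.

```python
from dataclasses import dataclass


@dataclass
class Bunsetsu:
    # Hypothetical container; the real BCCWJ annotation format differs.
    words: list[str]   # surface forms in sentence order
    head_word: int     # index within `words` of the content head (assumed given)
    governor: int      # index of the governing bunsetsu, -1 for the root


def bunsetsu_to_word_heads(bunsetsus):
    """Return (id, form, head_id) triples with 1-based ids; head 0 marks the root."""
    # First pass: 1-based id of the first word in each bunsetsu.
    offsets = []
    next_id = 1
    for b in bunsetsus:
        offsets.append(next_id)
        next_id += len(b.words)
    # Sentence-level id of each bunsetsu's head word.
    head_ids = [off + b.head_word for off, b in zip(offsets, bunsetsus)]

    rows = []
    for i, b in enumerate(bunsetsus):
        for j, form in enumerate(b.words):
            word_id = offsets[i] + j
            if j != b.head_word:
                head = head_ids[i]            # function word -> own bunsetsu head
            elif b.governor == -1:
                head = 0                      # head of the root bunsetsu
            else:
                head = head_ids[b.governor]   # head -> head of governing bunsetsu
            rows.append((word_id, form, head))
    return rows


# Toy sentence (an assumption, not taken from the corpus): 太郎が 本を 読んだ
sentence = [
    Bunsetsu(["太郎", "が"], head_word=0, governor=2),
    Bunsetsu(["本", "を"], head_word=0, governor=2),
    Bunsetsu(["読ん", "だ"], head_word=0, governor=-1),
]

for row in bunsetsu_to_word_heads(sentence):
    print(*row, sep="\t")
```

A real converter would also have to choose each head word automatically and assign UD part-of-speech tags and relation labels, none of which this sketch attempts.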

Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing | 2018 | xiv, 201 p. | 9 pages
Authors: Mai Omura; Masayuki Asahara
Original format: PDF
Chinese Library Classification: Programming, software engineering

Similar documents

1. Balanced corpus of contemporary written Japanese [J]. Kikuo Maekawa, Makoto Yamazaki, Toshinobu Ogiso. Language Resources and Evaluation, 2014, no. 2.
2. Factor Analysis of Utterances in Japanese Fiction-Writing Based on BCCWJ Speaker Information Corpus [J]. Hajime Murai. Advances in Human-Computer Interaction, 2018, no. 7.
3. UD-Japanese BCCWJ: Universal Dependencies Annotation for the Balanced Corpus of Contemporary Written Japanese [C]. Mai Omura, Masayuki Asahara. Second Workshop on Universal Dependencies, 2018.
4. UD-Japanese BCCWJ: Universal Dependencies Annotation for the Balanced Corpus of Contemporary Written Japanese [O]. Mai Omura, Masayuki Asahara. 2018.
