Extrinsic Corpus Evaluation with a Collocation Dictionary Task

Abstract

The NLP researcher or application-builder often wonders "what corpus should I use, or should I build one of my own? If I build one of my own, how will I know if I have done a good job?" Currently there is very little help available for them. They are in need of a framework for evaluating corpora. We develop such a framework, in relation to corpora which aim for good coverage of 'general language'. The task we set is automatic creation of a publication-quality collocations dictionary. For a sample of 100 headwords of Czech and 100 of English, we identify a gold standard dataset of (ideally) all the collocations that should appear for these headwords in such a dictionary. The datasets are being made available alongside this paper. We then use them to determine precision and recall for a range of corpora, with a range of parameters.
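
The evaluation step described in the abstract, scoring a corpus by comparing the collocations extracted from it against the gold standard, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The data layout (a mapping from headword to a set of collocates), the function name `precision_recall`, the choice of micro-averaging, and the toy data are all assumptions made for the example.

```python
from typing import Dict, Set, Tuple

# Assumed layout: headword -> set of collocates (hypothetical, not the paper's format).
Collocations = Dict[str, Set[str]]


def precision_recall(candidates: Collocations, gold: Collocations) -> Tuple[float, float]:
    """Micro-averaged precision and recall over the headwords in the gold standard."""
    true_pos = extracted = expected = 0
    for headword, gold_set in gold.items():
        cand_set = candidates.get(headword, set())
        true_pos += len(cand_set & gold_set)  # candidates that are in the gold standard
        extracted += len(cand_set)            # everything the corpus-based system proposed
        expected += len(gold_set)             # everything the dictionary should contain
    precision = true_pos / extracted if extracted else 0.0
    recall = true_pos / expected if expected else 0.0
    return precision, recall


# Toy data invented for illustration only; not taken from the paper's datasets.
gold = {"tea": {"strong", "green", "cup"}, "rain": {"heavy", "pour"}}
candidates = {"tea": {"strong", "cup", "table"}, "rain": {"heavy"}}

precision, recall = precision_recall(candidates, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same function over candidate lists drawn from different corpora, or from one corpus with different extraction parameters, would give one precision/recall pair per configuration, which is the kind of comparison the abstract describes.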


Bibliographic details

Source
9th International conference on language resources and evaluation | 2014 | 8 pages
Authors
Adam Kilgarriff; Milos Jakubicek; Vojtech Kovar; Pavel Rychly; Vit Baisa; Lucia Kocincova
Original format
PDF
Chinese Library Classification
Programming, software engineering
Keywords
corpus; evaluation; collocation

