International Conference on Computational Linguistics

A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English


Abstract

Transformer-based language models achieve high performance on various tasks, but we still lack an understanding of the kind of linguistic knowledge they learn and rely on. We evaluate three models (BERT, RoBERTa, and ALBERT), testing their grammatical and semantic knowledge through sentence-level probing, diagnostic cases, and masked prediction tasks. We focus on relative clauses (in American English) as a complex phenomenon whose resolution requires contextual information and antecedent identification. Based on a naturalistic dataset, probing shows that all three models indeed capture linguistic knowledge about grammaticality, achieving high performance. Evaluation on diagnostic cases and masked prediction tasks that consider fine-grained linguistic knowledge, however, reveals pronounced model-specific weaknesses, especially in semantic knowledge, which strongly impact the models' performance. Our results highlight the importance of (a) model comparison in evaluation tasks and (b) grounding claims about model performance and the linguistic knowledge models capture in more than purely probing-based evaluations.
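
To make the sentence-level probing setup concrete, a minimal sketch follows: a linear classifier trained on frozen BERT sentence embeddings to predict grammaticality. The stimuli, labels, pooling choice ([CLS] vector), and probe architecture here are illustrative assumptions, not the paper's actual dataset or setup.

    import torch
    from sklearn.linear_model import LogisticRegression
    from transformers import AutoModel, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    bert = AutoModel.from_pretrained("bert-base-uncased")
    bert.eval()

    # Placeholder relative-clause stimuli (the paper uses a naturalistic
    # dataset not reproduced here); 1 = grammatical, 0 = ungrammatical.
    sentences = [
        "The book that the author wrote won a prize.",
        "The book that the author wrote it won a prize.",   # resumptive pronoun
        "The student who passed the exam was happy.",
        "The student who she passed the exam was happy.",
    ]
    labels = [1, 0, 1, 0]

    with torch.no_grad():
        enc = tok(sentences, padding=True, return_tensors="pt")
        # [CLS] vector of the final layer as a fixed sentence representation
        feats = bert(**enc).last_hidden_state[:, 0, :].numpy()

    probe = LogisticRegression(max_iter=1000).fit(feats, labels)
    print(probe.score(feats, labels))  # training accuracy of the probe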
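The masked prediction evaluation can likewise be sketched with a fill-mask query: here, whether BERT prefers the animate relativizer "who" after an animate antecedent. The sentence is an invented example, not one of the paper's diagnostic cases.

    from transformers import pipeline

    fill = pipeline("fill-mask", model="bert-base-uncased")
    preds = fill("The scientist [MASK] discovered the cure was honored.", top_k=5)
    for p in preds:
        print(f"{p['token_str']:>8}  {p['score']:.4f}")

A model with fine-grained knowledge of relative clauses should rank "who" above "which" here; comparing such rankings across BERT, RoBERTa, and ALBERT is what surfaces the model-specific weaknesses the abstract describes.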
