An Annotated Corpus of Emerging Anglicisms in Spanish Newspaper Headlines

Abstract

The extraction of anglicisms (lexical borrowings from English) is relevant both for lexicographic purposes and for NLP downstream tasks. We introduce a corpus of European Spanish newspaper headlines annotated with anglicisms and a baseline model for anglicism extraction. In this paper we present: (1) a corpus of 21,570 newspaper headlines written in European Spanish annotated with emergent anglicisms and (2) a conditional random field baseline model with handcrafted features for anglicism extraction. We present the newspaper headlines corpus, describe the annotation tagset and guidelines and introduce a CRF model that can serve as a baseline for the task of detecting anglicisms. The presented work is a first step towards the creation of an anglicism extractor for Spanish newswire.
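
To make the phrase "handcrafted features" concrete, the sketch below shows the kind of per-token feature dictionary that is commonly fed to a CRF sequence tagger. It is an illustration only: the feature names, the spelling heuristics, the function names `token_features` and `headline_features`, and the example headline are all assumptions made here, not the feature set or data described in the paper.

```python
from typing import Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build an illustrative feature dict for tokens[i] (hypothetical feature set)."""
    word = tokens[i]
    lower = word.lower()
    feats: Dict[str, object] = {
        "bias": 1.0,
        "lower": lower,
        "suffix3": lower[-3:],
        "is_title": word.istitle(),
        "is_upper": word.isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        # Assumed heuristics: spellings that are uncommon in native Spanish words.
        "has_k_or_w": any(ch in "kw" for ch in lower),
        "ends_in_ing": lower.endswith("ing"),
    }
    # Context features: neighbouring tokens, or sentence-boundary flags.
    if i > 0:
        feats["prev_lower"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True
    if i < len(tokens) - 1:
        feats["next_lower"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True
    return feats


def headline_features(tokens: List[str]) -> List[Dict[str, object]]:
    """One feature dict per token; a CRF would pair these with per-token labels."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Invented example headline, not taken from the corpus.
headline = "El streaming gana terreno en España".split()
features = headline_features(headline)
```

A CRF toolkit would consume one such list of dictionaries per headline together with a parallel list of token labels. The labelling scheme is left out here because the record above does not specify the tagset.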

Bibliographic details

Source
Workshop on Computational Approaches to Code Switching | 2020 | pp. 1-8 | 8 pages
Author
Elena Alvarez-Mellado
Original format: PDF
Keywords
borrowing extraction; anglicism; newspaper corpus

Similar literature

1. Analysing headlines as a way of downsizing news corpora: Evidence from an Arabic-English comparable corpus of newspaper articles [J]. Haider Ahmad S., Hussein Riyad F. Literary & Linguistic Computing. 2020, no. 4
2. Spanish newspaper makes headlines with ISO 9001 [J]. Jesus F. Frogo. Quality Control and Applied Statistics. 2009, no. 2
3. SFU Reviewsp-NEG: a Spanish corpus annotated with negation for sentiment analysis. A typology of negation patterns [J]. Maria Jimenez-Zafra Salud, Taule Mariona, Teresa Martin-Valdivia M., Language Resources and Evaluation. 2018, no. 2
4. TweetNorm_es Corpus: an Annotated Corpus for Spanish Microtext Normalization [C]. Inaki Alegria, Nora Aranberri, Pere R. Comas, 9th International Conference on Language Resources and Evaluation. 2014
5. A Corpus-based Study of the Gender Assignment of Nominal Anglicisms in Brazilian Portuguese [D]. Skahill, Taryn Marie. 2020
6. An Outbreak of Fearsome Photos and Headlines: Ebola and Local Newspapers in West Africa [O]. Eric S. Halsey. 2016
7. Anglicisms in CREA: a Quantitative Analysis in Spanish Newspapers [O]. Núñez Nogueroles Eugenia Esperanza. 2016
