ACTSEA: Annotated Corpus for Tamil Sinhala Emotion Analysis

Abstract

The purpose of text emotion analysis is to detect and classify the emotions expressed in text. In recent years, text emotion analysis studies for English have increased because data are abundant. With the growth of social media, large amounts of data are now available for regional languages such as Tamil and Sinhala as well. However, these languages lack the annotated corpora needed for many NLP tasks, including emotion analysis. In this paper, we present our scalable semi-automatic approach to creating an annotated corpus, named ACTSEA, for Tamil and Sinhala to support emotion analysis. We also present our analysis of a sample of the produced data, along with useful findings, for the benefit of the low-resource NLP community. For ACTSEA, data were gathered from the Twitter platform and annotated manually after cleaning. We collected 600,280 (Tamil) and 318,308 (Sinhala) tweets in total, which makes our corpus the largest data collection currently available for these languages.

Bibliographic details

Source
International Moratuwa Engineering Research Conference | 2019 | 1 v. | 5 pages
Authors
Rajenthiran Jenarthanan; Yasas Senarath; Uthayasanker Thayasivam
Original format
PDF
Chinese Library Classification
Power transmission and distribution engineering, power grids and power systems
Keywords
emotion recognition; natural language processing; social networking (online); text analysis

Similar literature

1. Kāvi: An Annotated Corpus of Punjabi Poetry with Emotion Detection Based on ‘Navrasa’ [J]. Jatinderkumar R. Saini, Jasleen Kaur. Procedia Computer Science. 2020, no. 5
2. Construction and Evaluation of Tamil Speech Emotion Corpus [J]. Vasuki P., Sambavi B., Joe Vijesh. National Academy Science Letters. 2020, no. 6
3. Annotated corpus creation for sentiment analysis in code-mixed Hindi-English (Hinglish) social network data [J]. Neha Garg, Kamlesh Sharma. Indian Journal of Science and Technology. 2020, no. 40
4. ACTSEA: Annotated Corpus for Tamil Sinhala Emotion Analysis [C]. Rajenthiran Jenarthanan, Yasas Senarath, Uthayasanker Thayasivam. International Moratuwa Engineering Research Conference. 2019
5. Annotating a corpus of biomedical research texts: Two models of rhetorical analysis [D]. White, Barbara Ellen. 2010
6. Sri Lanka Eye Foundation: Booklets in English Sinhala Tamil [O]. 2003
7. Annotated Corpus of Mesopotamian-Iraqi Dialect for Sentiment Analysis in Social Media [O]. Al-Khafaji Ali J Askar, Nilam Nur. 2021
