Evaluation of Deep Learning Models for Hostility Detection in Hindi Text

Abstract

Social media platforms are a convenient medium for expressing personal thoughts and sharing useful information. They are fast, concise, and able to reach millions. They are an effective place to archive thoughts, share artistic content, receive feedback, promote products, etc. Despite these numerous advantages, the platforms have given a boost to hostile posts. Hate speech and derogatory remarks are posted for personal satisfaction or political gain. Hostile posts can have a bullying effect, rendering the entire platform experience hostile. Detecting hostile posts is therefore important for maintaining social media hygiene. The problem is more pronounced for low-resource languages like Hindi. In this work, we present approaches for hostile text detection in the Hindi language. The proposed approaches are evaluated on the Constraint@AAAI 2021 Hindi hostility detection dataset. The dataset consists of hostile and non-hostile texts collected from social media platforms. The hostile posts are further segregated into the overlapping classes fake, offensive, hate, and defamation. We evaluate a host of deep learning approaches based on CNN, LSTM, and BERT for this multi-label classification problem. The pre-trained Hindi fastText word embeddings from IndicNLP and Facebook are used in conjunction with the CNN and LSTM models. Two pre-trained multilingual transformer language models, mBERT and IndicBERT, are used. We show that the BERT-based models perform best. Moreover, the CNN and LSTM models also perform competitively with the BERT-based models.
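Because the four hostile classes overlap, a single post can carry several labels at once. The sketch below illustrates one common way to represent that: a multi-hot target vector, decoded with an independent sigmoid per class. The abstract does not say how the authors encoded labels or thresholded outputs, so the label order, the `non-hostile` tag name, the 0.5 threshold, and the helper names `encode` and `decode` are all assumptions made for illustration.

```python
import math

# Class names come from the abstract; their order here is an arbitrary assumption.
FINE_LABELS = ["fake", "hate", "offensive", "defamation"]


def encode(tags):
    """Turn a set of tags into a multi-hot vector; a non-hostile post is all zeros."""
    unknown = set(tags) - set(FINE_LABELS) - {"non-hostile"}
    if unknown:
        raise ValueError(f"unknown tags: {sorted(unknown)}")
    return [1 if name in tags else 0 for name in FINE_LABELS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode(logits, threshold=0.5):
    """Score each class independently so that several labels can fire together.

    The threshold is a hypothetical default, not a value from the paper.
    """
    picked = [name for name, z in zip(FINE_LABELS, logits) if sigmoid(z) >= threshold]
    return picked or ["non-hostile"]


if __name__ == "__main__":
    print(encode({"fake", "hate"}))
    print(decode([2.0, -1.0, 0.3, -3.0]))
```

A softmax over the four classes would force exactly one label per post, which is why a per-class sigmoid is the usual choice when classes overlap.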

Bibliographic details

Source
International Conference for Convergence in Technology | 2021 | pp. 1-5 | 5 pages
Authors
Ramchandra Joshi; Rushabh Karnavat; Kaustubh Jirapure; Raviraj Joshi
Original format: PDF
Keywords
Deep learning; Social networking (online); Bit error rate; Natural languages; Bidirectional control; Rendering (computer graphics); Task analysis

Similar literature

1. Error tolerant global search incorporated with deep learning algorithm to automatic Hindi text summarisation [J]. J. Anitha, P.V.G.D. Prasad Reddy, M.S. Prasad Babu. International Journal of Business Intelligence and Data Mining. 2019, No. 3
2. Automated Anomaly Detection and Localization in Sewer Inspection Videos Using Proportional Data Modeling and Deep Learning-Based Text Recognition [J]. Moradi Saeed, Zayed Tarek, Nasiri Fuzhan. Journal of Infrastructure Systems. 2020, No. 3
3. Deep Learning-Based Document Modeling for Personality Detection from Text [J]. Navonil Majumder, Soujanya Poria, Alexander Gelbukh. IEEE Intelligent Systems. 2017, No. 2
4. Opinionated Text Classification For Hindi Tweets Using Deep Learning [C]. Sindhu C, Shilpi Adak, Soumya Celina Tigga. International Conference on Computing Methodologies and Communication. 2021
5. Evaluation of Synthetic Training Data and Training-Data-Augmentation Techniques for Object Detection in Ground-Penetrating Radar Data using Deep-Learning Models [D]. Ruggiero, Jean. 2021
6. Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification [O]. Michel Oleynik, Amila Kugic, Zdenko Kasáč. 2019
7. Hostility Detection in Hindi Leveraging Pre-trained Language Models [O]. Ojasv Kamal, Adarsh Kumar, Tejas Vaidhya. 2021
