IEEE International Conference on Semantic Computing

A Comparative Study of Deep Neural Network Models on Multi-Label Text Classification in Finance


Abstract

Multi-Label Text Classification (MLTC) is a well-known NLP task in which texts are assigned to multiple categories indicating their most relevant domains. However, models trained on texts written by web users must cope with the redundancy and ambiguity of linguistic information. In this work, we present a comparative study of different neural network models for a multi-label text categorisation task in the finance domain. Our main contribution is a new annotated dataset containing ∼26k user posts labelled with finance categories. To build this dataset, we defined 10 domain-specific categories that cover financial texts. To serve as a baseline, we present a comparative study analysing both the performance and the training time of different learning models for multi-label text categorisation on the new dataset. The results show that transformer-based language models outperformed RNN-based neural networks in all scenarios in terms of precision. However, transformers took much longer than RNN models to train per epoch.
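To make the transformer-based setting described in the abstract concrete, the sketch below shows a minimal multi-label classification setup. It is not the authors' implementation: the checkpoint name ("bert-base-uncased"), the example post, and the 0.5 decision threshold are assumptions; only the number of categories (10) comes from the abstract. It relies on the Hugging Face transformers library, where problem_type="multi_label_classification" configures the classification head for independent per-label decisions.

```python
# Minimal sketch of multi-label text classification with a transformer encoder.
# Assumptions (not from the paper): bert-base-uncased checkpoint, 0.5 threshold,
# and the example text; the 10 finance categories come from the abstract.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_LABELS = 10  # the paper defines 10 finance-domain categories

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # BCE-with-logits loss during training
)

texts = ["Central bank hints at rate cuts while bank stocks rally"]  # hypothetical user post
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits       # shape: (batch, NUM_LABELS)

probs = torch.sigmoid(logits)             # independent probability per category
predicted = (probs > 0.5).int()           # a post may receive several labels at once
```

An RNN baseline of the kind compared in the paper would replace the encoder with, e.g., a bidirectional LSTM over word embeddings, but would keep the same sigmoid-per-label output layer.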
