Convolutional-neural-network-based Multilabel Text Classification for Automatic Discrimination of Legal Documents

Abstract

Law courts spend too much time reading documents and judging the type of legal cases. This problem becomes more serious as a crime can be classified into several categories at the same time. Thus, legal documents need a multilabel classification. We propose a multilabel text classification model based on multilabel text convolutional neural network (MLTCNN). We scan legal documents and convert them to text data using optical character recognition (OCR) with a charge-coupled device (CCD) sensor. Then, we use Jieba, a word segmentation tool of Chinese letters, and TensorFlow VocabularyProcessor to generate vocabularies. Then, the case description after segmenting each word is mapped into a word index in the vocabularies. We use a word index vector as an input to the MLTCNN. Lastly, we adopt multiple sigmoid functions for multiple binary classifications. The result shows our method to be efficient in finding errors and deviations for similar cases among district courts. This study provides a new method to improve the legal service and to enable fairer law enforcement.
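
The preprocessing and output steps described in the abstract can be illustrated with a minimal sketch. It assumes the text has already been segmented into tokens (for example by Jieba) and uses hand-written tokens, hypothetical label names, and made-up logits in place of real MLTCNN outputs. The helper names (`build_vocab`, `to_index_vector`, `predict_labels`) and the reserved padding and unknown-word indices are illustrative assumptions; they do not reproduce TensorFlow's VocabularyProcessor or the authors' code.

```python
import math

PAD, UNK = 0, 1  # assumed reserved indices; the abstract does not specify them


def build_vocab(segmented_docs):
    """Give each distinct token an integer index in order of first appearance."""
    vocab = {}
    for tokens in segmented_docs:
        for tok in tokens:
            if tok not in vocab:
                vocab[tok] = len(vocab) + 2  # indices 0 and 1 are reserved
    return vocab


def to_index_vector(tokens, vocab, max_len):
    """Map tokens to indices, then truncate or right-pad to max_len."""
    ids = [vocab.get(tok, UNK) for tok in tokens][:max_len]
    return ids + [PAD] * (max_len - len(ids))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, label_names, threshold=0.5):
    """Treat every label as an independent binary decision (one sigmoid each)."""
    return [name for name, z in zip(label_names, logits) if sigmoid(z) >= threshold]


# Hand-written tokens standing in for segmenter output (assumption).
docs = [["被告人", "盗窃", "财物"], ["被告人", "故意", "伤害", "他人"]]
vocab = build_vocab(docs)
vector = to_index_vector(docs[1], vocab, max_len=6)

# Made-up logits for three hypothetical labels; a real model would produce these.
labels = predict_labels([2.0, -1.5, 0.3], ["theft", "fraud", "assault"])
print(vector, labels)
```

Because each label gets its own sigmoid rather than sharing one softmax, a single case description can receive several labels at once, which is what the multilabel setting requires.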

Bibliographic details

  • Source
    Sensors and Materials, 2020, No. 8, pp. 2659-2672 (14 pages)
  • Author affiliations

    School of Informatics Xiamen University Fujian 361005 China;

    School of Informatics Xiamen University Fujian 361005 China;

    School of Informatics Xiamen University Fujian 361005 China;

    School of Informatics Xiamen University Fujian 361005 China;

    College of Mathematics and Information Engineering Longyan University Fujian 364012 China;

  • Indexed in: Science Citation Index (SCI), USA
  • Original format: PDF
  • Language of text: English
  • Keywords

    multilabel learning; text classification; word embedding;
