Conference: 2nd Workshop on Sentiment Analysis where AI meets Psychology

Classification of Interviews - A Case Study on Cancer Patients

Abstract

With the rapid expansion of Web 2.0, documents of many kinds abound online, so methods that annotate and organize documents in meaningful ways are needed to expedite search. Document classification has already been studied extensively. This paper instead introduces the classification of interviews with cancer patients into several cancer types, based on features collected from the corpus. We developed a corpus of 727 interviews collected from a web archive of medical articles. TF-IDF features over unigrams, bigrams, trigrams, and emotion words, together with SentiWordNet and cosine similarity features, were used to train and test the classification systems. Three classifiers (k-NN, Decision Tree, and Naive Bayes) were employed to assign the documents to different classes of cancer. In the experiments, the maximum accuracy was 99.31% on a test set of 73 documents.
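
As a rough illustration of the kind of pipeline the abstract describes, the sketch below computes TF-IDF weights over word unigrams, bigrams, and trigrams and classifies a new text with a cosine-similarity k-NN vote. It is not the authors' implementation: the function names, the smoothed IDF formula, the whitespace tokenizer, and the toy sentences are all assumptions made for illustration, and the emotion-word and SentiWordNet features as well as the Decision Tree and Naive Bayes classifiers are left out.

```python
import math
from collections import Counter


def ngrams(text, n_max=3):
    """Return all word n-grams of length 1..n_max from lowercased text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def fit_idf(docs):
    """Smoothed IDF per n-gram (assumed formula: ln((1 + N) / (1 + df)) + 1)."""
    df = Counter(term for doc in docs for term in set(ngrams(doc)))
    n_docs = len(docs)
    return {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}


def tfidf(doc, idf):
    """Sparse TF-IDF vector as a dict; n-grams unseen during fitting are dropped."""
    counts = Counter(t for t in ngrams(doc) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_predict(query_vec, train_vecs, train_labels, k=3):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(query_vec, pair[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy sentences standing in for interviews (not from the paper's corpus).
train_docs = [
    "the mammogram found a lump in my breast",
    "after the mammogram I had breast surgery",
    "a chest scan showed a shadow on my lung",
    "I was coughing and the lung scan was abnormal",
]
labels = ["breast", "breast", "lung", "lung"]

idf = fit_idf(train_docs)
train_vecs = [tfidf(doc, idf) for doc in train_docs]

query = "the doctor ordered a mammogram of my breast"
print(knn_predict(tfidf(query, idf), train_vecs, labels, k=3))
```

Fitting the IDF table on the training texts only, and dropping unseen n-grams at prediction time, keeps test documents from influencing the feature weights; whether the paper follows the same protocol is not stated in the abstract.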