Aspect and Entity Extraction from Opinion Documents.

Abstract

Opinion mining has been an active research area in Web mining and Natural Language Processing (NLP) in recent years. In this thesis, we present a comprehensive study of aspect and entity extraction from opinion documents for opinion mining. We first introduce the aspect-based opinion mining model. We then propose a new method for aspect extraction and ranking based on language patterns and dependency grammar. The method also ranks the extracted aspects by their importance, i.e., their relevance and frequency. In addition, we find that some domains contain two special kinds of product aspects: noun aspects that imply an opinion, and resource terms. We propose novel extraction algorithms to identify both kinds in opinion documents.

The entity extraction task is similar to the classic named entity recognition (NER) problem, with one major difference. In a typical opinion mining application, users often want to find opinions on a set of competing entities, e.g., competing or related products. This implies that the discovered entities must be of the same type or class, which makes the task essentially a set expansion problem. To address it, we present two set expansion algorithms for entity extraction in opinion documents: one based on the positive and unlabeled (PU) learning model, and the other based on Bayesian Sets.

We also discuss extracting topic documents from a collection. An opinion mining system first crawls and indexes opinion documents, which are then used later for different specific tasks. Typically, the documents are not well categorized, because one does not know in advance what the future tasks will be. Keyword search is normally used to find relevant opinion documents for analysis, but documents retrieved in this way can suffer from both low recall and low precision. An alternative is to train a document classifier, but the training procedure is time-consuming and labor-intensive. We propose an unsupervised technique, based on a new PU learning algorithm, to solve this problem.
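To make the set expansion idea concrete, the following is a minimal sketch of Bayesian Sets scoring in its standard form for binary features (Beta-Bernoulli model, as introduced by Ghahramani and Heller). The abstract does not describe how the thesis adapts Bayesian Sets, so this is not the author's algorithm; the function name `bayesian_sets_scores`, the prior strength `kappa`, and the toy context features are all assumptions made for illustration.

```python
import math

def bayesian_sets_scores(features, seeds, kappa=2.0):
    """Score every candidate against a seed set (standard Bayesian Sets).

    features: dict mapping candidate name -> list of 0/1 feature values
    seeds:    list of candidate names known to belong to the target class
    kappa:    assumed prior strength; priors are scaled from feature means
    """
    names = list(features)
    dim = len(features[names[0]])
    n_all, n_seed = len(names), len(seeds)
    eps = 1e-6  # keeps the priors strictly positive for constant features
    scores = {name: 0.0 for name in names}
    for j in range(dim):
        mean_j = sum(features[n][j] for n in names) / n_all
        mean_j = min(max(mean_j, eps), 1.0 - eps)
        alpha, beta = kappa * mean_j, kappa * (1.0 - mean_j)
        seed_count = sum(features[s][j] for s in seeds)
        alpha_post = alpha + seed_count
        beta_post = beta + n_seed - seed_count
        # Per-feature constant and weight of the log score
        c_j = (math.log(alpha + beta) - math.log(alpha + beta + n_seed)
               + math.log(beta_post) - math.log(beta))
        q_j = (math.log(alpha_post) - math.log(alpha)
               - math.log(beta_post) + math.log(beta))
        for n in names:
            scores[n] += c_j + q_j * features[n][j]
    return scores

# Hypothetical binary context features:
# [follows "bought a", precedes "phone", follows "ate at"]
toy_features = {
    "iPhone": [1, 1, 0],
    "Galaxy": [1, 1, 0],
    "Pixel":  [1, 0, 0],
    "Subway": [0, 0, 1],
    "Dennys": [0, 0, 1],
}
toy_scores = bayesian_sets_scores(toy_features, seeds=["iPhone", "Galaxy"])
ranked = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy data, candidates that share contexts with the seed products score higher than the restaurant names, which is the behavior a set expansion method needs in order to return entities of the same type as the seeds.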

Bibliographic record

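Author: Zhang, Lei
Author affiliation: University of Illinois at Chicago
Degree-granting institution: University of Illinois at Chicago
Subject: Computer Science
Degree: Ph.D.
Year: 2012
Pages: 121 p.
Total pages: 121
Original format: PDF
Language: English (eng)
Chinese Library Classification: Remote sensing technology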

