Inferring answer quality, answerer expertise, and ranking in question answer social networks.

Abstract

Search has become ubiquitous mainly because it is simple to use: it makes information gathering relatively easy, with no learning curve. Question answering services and communities (termed CQA services or Q/A networks; e.g., Yahoo! Answers, Stack Overflow) have emerged in the last decade as yet another way to search. Here the intent is to obtain high-quality answers (from users with different levels of expertise) to a newly posed question, or to retrieve answers from an archived Q/A repository. To use these services (and archives) effectively as an alternative to search, we need a framework, including techniques and algorithms, for identifying the quality of answers as well as the expertise of the users answering questions. Determining answer quality is critical for archived data sets, for assessing their value as stored repositories for answering questions. Determining the expertise of users is extremely important (and more challenging) for routing queries in real time, which matters to both paid and free Q/A services. This problem entails an understanding of the characteristics of interactions in this domain as well as the structure of the graphs derived from these interactions. These graphs (termed Ask-Answer graphs in this thesis) differ subtly from web reference graphs, paper citation graphs, and others. Hence it is imperative to design effective and efficient ranking approaches for these Q/A network data sets to help users retrieve meaningful information.

The objective of this dissertation is to push the state of the art in the analysis of Q/A social network data sets in terms of theory, semantics, techniques/algorithms, and experimental analysis of real-world social interactions. We leverage "participant characteristics" because the social community is dynamic: participants change over time and answer questions at will. Participant behavior appears to be important for inferring some of the characteristics of their interaction.

First, our research has determined that temporal features make a significant difference in predicting the quality of answers, because the answerer's (or participant's) current behavior plays an important role in identifying the quality of an answer. We present learning-to-rank approaches for predicting answer quality and establish their superiority over the classification approaches currently in use. Second, we discuss the difference between ask-answer graphs and web reference graphs and propose the ExpertRank framework, along with several approaches that use domain information to predict the expertise level of users by considering both answer quality and graph structure. Third, current approaches infer expertise using traditional link-based methods such as PageRank or HITS. However, these approaches only identify global experts, termed generalists, in CQA services. A generalist may not be the best person to answer an arbitrary question. If a question contains several important concepts, it is more meaningful for a person who is an expert in those concepts to answer it. This thesis proposes techniques to identify experts at the concept level as a basic building block. This is critical because the derived concept rank can be used as a basis for inferring expertise at different levels. For example, a question can be viewed as a collection of a few important concepts, and we use the ConceptRank framework to identify specialists for answering that question. This can be generalized using a concept taxonomy for classifying topics, areas, and other larger concepts using the primary concept of coverage.

Ranking is central to the problems addressed in this thesis. Hence, we analyze the motivation behind traditional link-based approaches, such as HITS. We argue that these link-based approaches correspond to statistical information representing the opinion of web writers about these web resources. In contrast, we address the ranking problem in web and social networks by using the ILP (in-link probability) and OLP (out-link probability) of a graph to help understand the HITS approach in contexts other than web graphs. We have further established that these two probabilities correspond to the hub and authority vectors of the HITS approach. We use standard Non-negative Matrix Factorization (NMF) to calculate these two probabilities for each node. Our experimental results and theoretical analysis validate the relationship between the ILOD approach and the HITS algorithm.
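
The link between HITS and a non-negative factorization of the graph can be illustrated with a small sketch. It is not the thesis's ILP/OLP or ILOD formulation, which the abstract does not spell out. It only shows a well-known special case: for a rank-1 factorization under Frobenius loss, the multiplicative NMF updates follow the same directions as the HITS power iteration, so the two normalized factor vectors match the hub and authority vectors. The toy graph, the edge direction (asker to answerer), and the function names `hits` and `rank1_nmf` are assumptions made for illustration.

```python
import numpy as np

def hits(adj, iters=100):
    """Plain HITS power iteration; returns sum-normalized (hub, authority)."""
    n = adj.shape[0]
    hub = np.ones(n)
    auth = np.ones(n)
    for _ in range(iters):
        auth = adj.T @ hub   # authority: total hub score of nodes linking in
        hub = adj @ auth     # hub: total authority score of nodes linked to
        auth /= auth.sum()
        hub /= hub.sum()
    return hub, auth

def rank1_nmf(adj, iters=100, eps=1e-12):
    """Rank-1 NMF, adj ~ outer(w, h), via multiplicative updates (Frobenius loss)."""
    n = adj.shape[0]
    w = np.ones(n)
    h = np.ones(n)
    for _ in range(iters):
        h *= (adj.T @ w) / (h * (w @ w) + eps)
        w *= (adj @ h) / (w * (h @ h) + eps)
    # Normalize so each factor can be read as a distribution over nodes.
    return w / w.sum(), h / h.sum()

# Hypothetical ask-answer graph with 5 users.
# Assumption: adj[i, j] = 1 means user i asked a question that user j answered.
adj = np.zeros((5, 5))
for asker, answerer in [(0, 3), (0, 4), (1, 3), (1, 4), (2, 3), (4, 3)]:
    adj[asker, answerer] = 1.0

hub, auth = hits(adj)
w, h = rank1_nmf(adj)

print("HITS authority:", np.round(auth, 3))
print("NMF factor h:  ", np.round(h, 3))
print("top answerer by authority:", int(np.argmax(auth)))
```

In this toy graph user 3 answers the most questions and receives the highest authority score. Higher-rank factorizations, other loss functions, or a different edge direction would not reduce to HITS this directly.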

Bibliographic record

  • Author: Cai, Yuanzhe
  • Author affiliation: The University of Texas at Arlington
  • Degree-granting institution: The University of Texas at Arlington
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 187 p.
  • Total pages: 187
  • Original format: PDF
  • Language: English
