Multi-document keyphrase extraction using partial mutual information
Abstract
A keyphrase extraction system and method are provided. The system and method can be employed to create an automatic summary of a subset of one or more documents. The system automatically extracts a list of keywords, can operate on multiple documents, and works across many different domains. The system is unsupervised and requires no prior learning.

A term identifier identifies candidate terms (e.g., words and/or phrases) in the document subset, which are used to form a document-term matrix. A probability computation component calculates three probability values: (1) the joint probability of a term and a document, (2) the marginal probability of the term, and (3) the marginal probability of the document. Based on these probability values, a partial mutual information metric can be calculated for each candidate term. Based on that metric, one or more of the terms can be identified as summary keyphrases.
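The pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the sketch assumes that the partial mutual information of a term w is its own slice of the mutual information between terms and documents, that is, the sum over documents d of p(w, d) * log(p(w, d) / (p(w) * p(d))). The function names `partial_mutual_information` and `top_keyphrases`, the count-based probability estimates, and the tie-breaking rule are all hypothetical choices made for illustration, not details taken from the patent.

```python
import math
from collections import Counter


def partial_mutual_information(docs):
    """Score each candidate term in `docs` (a list of term lists).

    Assumed metric (not stated in the abstract):
        score(w) = sum_d p(w, d) * log(p(w, d) / (p(w) * p(d)))
    """
    # Sparse document-term matrix: one Counter of term counts per document.
    counts = [Counter(doc) for doc in docs]
    total = sum(sum(c.values()) for c in counts)
    if total == 0:
        return {}

    # Marginal probability of each document: its share of all term occurrences.
    p_doc = [sum(c.values()) / total for c in counts]

    # Corpus-wide term counts, used for the marginal probability of each term.
    term_totals = Counter()
    for c in counts:
        term_totals.update(c)

    scores = {}
    for term, n_term in term_totals.items():
        p_term = n_term / total
        score = 0.0
        for c, p_d in zip(counts, p_doc):
            n = c.get(term, 0)
            if n == 0:
                continue  # a zero joint probability contributes nothing
            p_joint = n / total
            score += p_joint * math.log(p_joint / (p_term * p_d))
        scores[term] = score
    return scores


def top_keyphrases(docs, k=3):
    """Return the k highest-scoring terms, breaking ties alphabetically."""
    scores = partial_mutual_information(docs)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Toy input: candidate terms are assumed to be already identified.
    docs = [["a", "b", "b"], ["a", "c", "c"]]
    print(partial_mutual_information(docs))
    print(top_keyphrases(docs, k=2))
```

Under this assumed metric, a term spread evenly across documents (such as `a` in the toy input) scores zero, while a term concentrated in one document scores higher. Whether the patented system ranks terms in exactly this way is not stated in the abstract.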