Systems and Methods for Determining Lexical Associations Among Words in a Corpus

Abstract

Systems and methods are provided for identifying one or more target words of a corpus that have a lexical relationship to a plurality of provided cue words. The cue words and statistical lexical information derived from a corpus of documents are analyzed to determine candidate words that have a lexical association with the cue words. The statistical information includes numerical values indicative of probabilities of word pairs appearing together as adjacent words in a well-formed text or appearing together within a paragraph of a well-formed text. For each candidate word, a statistical association score between the candidate word and each of the cue words is determined. An aggregate score for each of the candidate words is determined based on the statistical association scores. One or more of the candidate words are selected to be the one or more target words based on the aggregate scores.
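The scoring pipeline described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented method: the abstract does not name a specific association measure or aggregation rule, so pointwise mutual information (PMI) over paragraph-level co-occurrence and a plain sum are assumptions made here. The function names (`build_counts`, `pmi`, `rank_targets`) and the toy corpus are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_counts(paragraphs):
    """Count, for each word and each unordered word pair, how many paragraphs contain it."""
    word_counts = Counter()
    pair_counts = Counter()
    for paragraph in paragraphs:
        words = sorted(set(paragraph.lower().split()))
        word_counts.update(words)
        # Pairs are stored as alphabetically ordered tuples.
        pair_counts.update(combinations(words, 2))
    return word_counts, pair_counts, len(paragraphs)


def pmi(a, b, word_counts, pair_counts, total):
    """PMI of two words from paragraph counts.

    Assumption: pairs never seen together score 0.0 rather than negative infinity.
    """
    joint = pair_counts[tuple(sorted((a, b)))]
    if joint == 0:
        return 0.0
    return math.log2(joint * total / (word_counts[a] * word_counts[b]))


def rank_targets(cues, paragraphs, top_n=3):
    """Score every non-cue word against each cue word and return the top candidates."""
    word_counts, pair_counts, total = build_counts(paragraphs)
    candidates = set(word_counts) - set(cues)
    scored = []
    for candidate in candidates:
        # One association score per cue word, then an aggregate (assumed: sum).
        scores = [pmi(candidate, cue, word_counts, pair_counts, total) for cue in cues]
        scored.append((sum(scores), candidate))
    # Highest aggregate first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(word, score) for score, word in scored[:top_n]]


# Toy corpus: each string stands in for one paragraph of well-formed text.
paragraphs = [
    "the cat chased the mouse",
    "the cat drank milk",
    "the dog chased the cat",
    "the cow gives milk",
    "the dog barked",
]
cues = ["cat", "milk"]
targets = rank_targets(cues, paragraphs, top_n=3)
print(targets)
```

In this toy setup a word that co-occurs with both cue words outranks words tied to only one of them, which is the effect the aggregate score is meant to capture. A real system would presumably draw its counts from a large corpus and could also use adjacent-word (bigram) statistics, as the abstract mentions.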

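Bibliographic details

  • Publication number: US2015347385A1
  • Publication date: 2015-12-03
  • Original format: PDF
  • Applicant/assignee: EDUCATIONAL TESTING SERVICE
  • Application number: US201514726928
  • Inventors: MICHAEL FLOR; BEATA BEIGMAN KLEBANOV
  • Filing date: 2015-06-01
  • Classification: G06F17/27
  • Country: US
  • Database entry time: 2022-08-21 14:33:38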
