AUTOMATED NONPARAMETRIC CONTENT ANALYSIS FOR INFORMATION MANAGEMENT AND RETRIEVAL

Abstract

Embodiments of the invention combine a feature-extraction approach and/or a matching approach with a nonparametric approach to estimate, with high accuracy, the proportion of documents in each of multiple labeled categories. The feature-extraction approach automatically generates continuously valued text features optimized for estimating the category proportions. The matching approach constructs, from an observed set, a matched set that closely resembles an unobserved data set, thereby increasing the degree to which the distributions of the observed and unobserved sets resemble each other.
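
The abstract does not spell out the estimator, so the following is only a minimal sketch of one common way to estimate category proportions directly, without classifying individual documents. It assumes that the mean feature vector of the unlabeled set is a mixture of the per-category mean feature vectors of the labeled set, weighted by the unknown proportions, and solves for those weights by least squares. The function name `estimate_proportions`, its parameters, and the soft sum-to-one constraint are illustrative assumptions, not the patented method; the feature-extraction and matching steps are not implemented here.

```python
import numpy as np


def estimate_proportions(X_labeled, y_labeled, X_unlabeled, n_classes, weight=1e3):
    """Estimate category proportions in the unlabeled set (illustrative sketch).

    Assumes: mean(X_unlabeled) ~= sum_c pi_c * mean(X_labeled[y_labeled == c]).
    """
    # Column c holds the mean feature vector of labeled documents in category c.
    class_means = np.column_stack(
        [X_labeled[y_labeled == c].mean(axis=0) for c in range(n_classes)]
    )
    target_mean = X_unlabeled.mean(axis=0)

    # Append a heavily weighted row of ones so the solution sums to roughly one.
    A = np.vstack([class_means, weight * np.ones((1, n_classes))])
    b = np.append(target_mean, weight)

    pi, *_ = np.linalg.lstsq(A, b, rcond=None)

    # Crude projection onto the simplex: clip negatives, then renormalize.
    pi = np.clip(pi, 0.0, None)
    return pi / pi.sum()


# Toy data (made up for illustration): three categories, one indicator feature each.
X_labeled = np.array([[1.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
y_labeled = np.array([0, 0, 1, 2])
X_unlabeled = np.array([[1.0, 0.0, 0.0]] * 5
                       + [[0.0, 1.0, 0.0]] * 3
                       + [[0.0, 0.0, 1.0]] * 2)

props = estimate_proportions(X_labeled, y_labeled, X_unlabeled, n_classes=3)
print(props.round(3))
```

In this toy case the features separate the categories perfectly, so the estimate should recover the 5:3:2 mix of the unlabeled documents. With real text features the mixture assumption holds only approximately, which is presumably where optimized features and a matched labeled set would help.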
