AUTOMATED CONTENT TAGGING WITH LATENT DIRICHLET ALLOCATION OF CONTEXTUAL WORD EMBEDDINGS

Abstract

Dynamic content tags are generated as content is received by a dynamic content tagging system. A natural language processor (NLP) tokenizes the content and extracts contextual N-grams based on local or global context for the tokens in each document in the content. The contextual N-grams are used as input to a generative model that computes a weighted vector of likelihood values that each contextual N-gram corresponds to one of a set of unlabeled topics. A tag is generated for each unlabeled topic comprising the contextual N-gram having a highest likelihood to correspond to that unlabeled topic. Topic-based deep learning models having tag predictions below a threshold confidence level are retrained using the generated tags, and the retrained topic-based deep learning models dynamically tag the content.
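
The tag-generation half of the abstract can be illustrated with a small sketch. It is not the patented implementation: it assumes plain word bigrams as a stand-in for "contextual N-grams", uses a textbook collapsed Gibbs sampler for latent Dirichlet allocation as the generative model, and omits the retraining of topic-based deep learning models entirely. The function names, hyperparameters (`alpha`, `beta`, `iters`), and sample documents are all hypothetical.

```python
import random
import re
from collections import defaultdict


def extract_ngrams(text, n=2):
    """Lowercase, tokenize on letters/digits, and return word n-grams.

    Assumption: adjacent-word n-grams approximate "local context".
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA, treating each n-gram as a 'word'.

    Returns {ngram: [P(ngram | topic 0), ..., P(ngram | topic K-1)]}.
    """
    rng = random.Random(seed)
    vocab = sorted({g for doc in docs for g in doc})
    vocab_size = len(vocab)
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    doc_topic = [[0] * num_topics for _ in docs]
    assign = []

    # Random initial topic assignment for every n-gram occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for g in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            topic_word[k][g] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, g in enumerate(doc):
                # Remove the current assignment, then resample it.
                k = assign[d][i]
                topic_word[k][g] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][g] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                topic_word[k][g] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1

    return {
        g: [
            (topic_word[t][g] + beta) / (topic_total[t] + vocab_size * beta)
            for t in range(num_topics)
        ]
        for g in vocab
    }


def tags_per_topic(ngram_weights, num_topics):
    """For each unlabeled topic, pick the n-gram with the highest likelihood."""
    return [
        max(ngram_weights, key=lambda g: ngram_weights[g][t])
        for t in range(num_topics)
    ]


# Hypothetical sample content, for illustration only.
NUM_TOPICS = 2
raw_docs = [
    "firewall rules block malicious traffic on the network",
    "the firewall rules filter network traffic by port",
    "deep learning models predict tags for new content",
    "retrain deep learning models when tag confidence is low",
]
docs = [extract_ngrams(text) for text in raw_docs]
weights = lda_gibbs(docs, NUM_TOPICS)
tags = tags_per_topic(weights, NUM_TOPICS)
print(tags)
```

Each value in `weights` is the per-topic likelihood vector for one n-gram, and `tags` holds one n-gram per unlabeled topic. On a corpus this small the chosen tags depend on the seed, so they should be read as a demonstration of the data flow rather than as meaningful topics.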
