BERT TEXTRANK BASED CORE SENTENCE EXTRACTION METHOD AND DEVICE USING BERT SENTENCE EMBEDDING VECTOR

Abstract

The present invention relates to a method and device for extracting core sentences with TextRank using BERT sentence embedding vectors. According to an embodiment of the present invention, the method is a computer-implemented method, executed on a computing device, for extracting a core sentence, comprising: a first step of dividing the natural language data from which a core sentence is to be extracted into sentence units; a second step of adding a special classification token (CLS) before each of the divided sentences; a third step of converting each sentence to which the special classification token (CLS) has been added into a sentence vector using a sentence vector transformation model; a fourth step of constructing a similarity matrix by calculating the similarity between sentences based on the sentence vectors; a fifth step of calculating the importance of each sentence by applying TextRank to the similarity matrix; and a sixth step of extracting a core sentence according to the calculated importance.
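
The six steps in the abstract can be sketched as a small pipeline. The code below is an illustrative sketch only, not the patented implementation. The abstract does not specify the sentence splitter, the similarity measure, or the TextRank parameters, so the regex splitter, cosine similarity, the damping factor of 0.85, and all function names here are assumptions. In particular, `toy_embed` is a bag-of-words stand-in for the BERT sentence vector transformation model, used so that the example runs with the Python standard library alone; a real system would replace it with a BERT encoder and a cosine function over dense vectors.

```python
import math
import re
from collections import Counter

CLS = "[CLS]"  # special classification token, as written by BERT tokenizers


def split_sentences(text):
    """Step 1: divide the text into sentence units (naive regex; an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def add_cls(sentences):
    """Step 2: add the classification token before each sentence."""
    return [f"{CLS} {s}" for s in sentences]


def toy_embed(sentence):
    """Step 3 (STAND-IN, not BERT): sparse bag-of-words vector as a Counter."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    return Counter(t for t in tokens if t != "cls")  # the toy model ignores [CLS]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dict-likes."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(vectors):
    """Step 4: pairwise sentence similarities, with a zero diagonal."""
    n = len(vectors)
    return [[0.0 if i == j else cosine(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]


def textrank(sim, damping=0.85, iterations=100, tol=1e-8):
    """Step 5: PageRank-style power iteration on the weighted sentence graph."""
    n = len(sim)
    if n == 0:
        return []
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to edge weight.
            rank = sum(sim[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if out_weight[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


def extract_core_sentences(text, top_k=1, embed=toy_embed):
    """Step 6: return the top_k highest-scoring sentences in document order."""
    sentences = split_sentences(text)
    vectors = [embed(s) for s in add_cls(sentences)]
    scores = textrank(similarity_matrix(vectors))
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(best)]
```

For example, `extract_core_sentences("Cats chase mice. Cats chase birds. Dogs sleep.", top_k=2)` selects the two mutually similar sentences, because the third shares no words with the others and so receives only the baseline score.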
