BERT TEXTRANK BASED CORE SENTENCE EXTRACTION METHOD AND DEVICE USING BERT SENTENCE EMBEDDING VECTOR

Abstract

The present invention relates to a method and apparatus for extracting core sentences with TextRank using BERT sentence embedding vectors. According to an embodiment, the method is a computer-implemented method executed on a computing device and comprises: a first step of dividing the natural language data from which core sentences are to be extracted into sentences; a second step of adding a special classification token (CLS) before each of the divided sentences; a third step of converting each sentence, with the CLS token added, into a sentence vector using a sentence vector transformation model; a fourth step of constructing a similarity matrix by calculating the similarity between sentences from their sentence vectors; a fifth step of calculating the importance of each sentence by applying TextRank to the similarity matrix; and a sixth step of extracting core sentences according to the calculated importance.
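
The six steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the abstract does not name the similarity measure, the TextRank parameters, or the sentence splitter, so cosine similarity, a damping factor of 0.85, and a punctuation-based split are all assumptions. The `embed` argument is a hypothetical callable standing in for the sentence vector transformation model (for example, a BERT encoder), and every function name here is invented for illustration.

```python
import math
import re
from typing import Callable, List, Sequence

Vector = Sequence[float]


def split_sentences(text: str) -> List[str]:
    """Step 1: naive split after '.', '!' or '?' (an assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(vectors: Sequence[Vector]) -> List[List[float]]:
    """Step 4: pairwise similarities, zero diagonal, negatives clipped to 0
    (clipping is an assumption so that edge weights stay non-negative)."""
    n = len(vectors)
    return [
        [0.0 if i == j else max(0.0, cosine(vectors[i], vectors[j]))
         for j in range(n)]
        for i in range(n)
    ]


def textrank(sim: List[List[float]], damping: float = 0.85,
             iters: int = 100, tol: float = 1e-8) -> List[float]:
    """Step 5: weighted PageRank by power iteration over the matrix."""
    n = len(sim)
    if n == 0:
        return []
    out_weight = [sum(row) for row in sim]  # total edge weight leaving j
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            rank = sum(sim[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if out_weight[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


def extract_core_sentences(text: str, embed: Callable[[str], Vector],
                           top_k: int = 1) -> List[str]:
    """Steps 1 to 6; `embed` is a hypothetical sentence-to-vector model."""
    sentences = split_sentences(text)
    # Step 2: prepend the classification token as literal text. A real BERT
    # tokenizer typically inserts [CLS] itself, so this is only a stand-in.
    # Step 3: convert each marked sentence into a sentence vector.
    vectors = [embed("[CLS] " + s) for s in sentences]
    scores = textrank(similarity_matrix(vectors))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: scores[i], reverse=True)
    # Step 6: return the top_k sentences in their original order.
    return [sentences[i] for i in sorted(ranked[:top_k])]
```

To try it, pass any function that maps a string to a fixed-length list of floats as `embed`. With three sentences whose vectors are `[1, 0]`, `[1, 1]` and `[0, 1]`, the middle sentence is similar to both neighbours and therefore receives the highest score.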

Bibliographic data

Publication number: KR20210151281A

Publication date: 2021-12-14

Original format: PDF
Applicant/patentee: 동국대학교 산학협력단; 주식회사 인사이저

Application number: KR20200067679
Inventors: 손영두; 양승호; 신석원

Filing date: 2020-06-04
Classification: G06F40/279; G06F16/33; G06F40/205; G06F40/258
Country: KR
Date added to database: 2022-08-24 22:48:40
