CLASSIFICATION MODEL-BASED FIELD EXTRACTION METHOD AND APPARATUS, ELECTRONIC DEVICE, AND MEDIUM

Abstract

A classification model-based field extraction method and apparatus, an electronic device, and a medium. The method comprises: when a public field extraction request is received, extracting a plurality of texts and preprocessing the texts; integrating the preprocessed texts to obtain a text fragment; sequentially selecting target values from a configuration interval, and extracting phrases from the text fragment by using the target values as extraction lengths (S12); calculating the degree of solidification of the phrases, and determining the phrases whose degree of solidification is greater than a first threshold as first phrases (S13); calculating the frequency of the first phrases in the text fragment, and determining each first phrase whose frequency is greater than a second threshold as a second phrase (S14); obtaining the context information of the second phrase in the text fragment (S15); inputting the second phrase, the frequency of the second phrase, and the context information into a pre-trained classification model to obtain an output result corresponding to the second phrase (S16); and when the output result is a public field, analyzing the second phrase to obtain an analysis result. The method also involves blockchain technology: the analysis result is stored in a blockchain.
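
Steps S12 to S15 can be illustrated with a minimal Python sketch. The abstract does not define the degree of solidification, so the sketch assumes a common cohesion measure: the smallest ratio, over all two-way splits of a phrase, of the phrase's probability to the product of its parts' probabilities. The function names, thresholds, and context window are hypothetical, and the classification model of S16 is left out because the abstract does not specify it.

```python
from collections import Counter


def count_substrings(text, max_len):
    """Count every substring of length 1..max_len in the text fragment."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def solidification(phrase, counts, total):
    """Assumed definition: min over splits of P(phrase) / (P(left) * P(right)).

    Probabilities are approximated as count / total. Requires len(phrase) >= 2.
    """
    p = counts[phrase] / total
    return min(
        p / ((counts[phrase[:k]] / total) * (counts[phrase[k:]] / total))
        for k in range(1, len(phrase))
    )


def contexts(text, phrase, window):
    """Collect (left, right) context strings around each occurrence (S15)."""
    found = []
    start = text.find(phrase)
    while start != -1:
        end = start + len(phrase)
        found.append((text[max(0, start - window):start], text[end:end + window]))
        start = text.find(phrase, start + 1)
    return found


def candidate_phrases(text, interval=(2, 4), solid_threshold=2.0,
                      freq_threshold=2, window=2):
    """Return (phrase, frequency, contexts) triples that pass both thresholds."""
    low, high = interval          # hypothetical configuration interval, low >= 2
    counts = count_substrings(text, high)
    total = len(text)
    results = []
    for n in range(low, high + 1):                      # S12: target values in order
        phrases = {text[i:i + n] for i in range(len(text) - n + 1)}
        for phrase in phrases:
            if solidification(phrase, counts, total) <= solid_threshold:
                continue                                # S13: first threshold
            freq = counts[phrase]
            if freq > freq_threshold:                   # S14: second threshold
                results.append((phrase, freq, contexts(text, phrase, window)))
    return results


if __name__ == "__main__":
    for phrase, freq, ctx in candidate_phrases("abcabdabeab", interval=(2, 3), window=1):
        print(phrase, freq, ctx)
```

Each returned triple corresponds to the input that S16 passes to the classification model: the second phrase, its frequency, and its context information. How those inputs are encoded as features would depend on the model, which the abstract leaves open.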
