SENTENCE VECTOR CALCULATION METHOD BASED ON CHI-SQUARE TEST, AND TEXT CLASSIFICATION METHOD AND SYSTEM
Abstract
Disclosed are a sentence vector calculation method based on a chi-square test, and a text classification method and system. The method involves: performing word segmentation on the current text and removing stop words to obtain a word segmentation result; calculating a word vector for each word in the word segmentation result; calculating a chi-square value between each word vector and a preset category, and classifying the corresponding words as feature words or non-feature words according to the chi-square values; calculating the usage frequency of the feature words in the preset category, assigning a first weight value to the feature words according to the usage frequency, and assigning a second weight value to the non-feature words, wherein the first weight value is greater than the second weight value; and calculating a weighted mean of all word vectors from the word vectors of the feature words and non-feature words and their corresponding weight values, and taking that mean as the sentence vector of the current text. This increases the weight of the sentence vector in the feature dimensions, reduces mutual interference between the word vectors in the text, and thereby improves the accuracy of text classification.
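The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything the abstract leaves open is an assumption here: the chi-square value is computed from a 2x2 word/category document-count table, the threshold of 3.84 (the 0.05 critical value at one degree of freedom) separates feature words from non-feature words, and the first weight is taken as `base + freq`. The names `chi_square`, `sentence_vector`, `threshold`, `base`, and `second_weight` are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 word/category contingency table.

    a: in-category documents containing the word
    b: out-of-category documents containing the word
    c: in-category documents without the word
    d: out-of-category documents without the word
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def sentence_vector(words, vectors, chi2, freq,
                    threshold=3.84, base=2.0, second_weight=1.0):
    """Weighted mean of word vectors (assumed weighting scheme).

    words:   segmented words with stop words already removed
    vectors: word -> list of floats (the word vector)
    chi2:    word -> chi-square value against the preset category
    freq:    word -> usage frequency in the preset category, in [0, 1]
    """
    total, weight_sum = None, 0.0
    for word in words:
        if word not in vectors:
            continue  # skip out-of-vocabulary words
        if chi2.get(word, 0.0) >= threshold:
            # Feature word: first weight, kept above second_weight by base.
            weight = base + freq.get(word, 0.0)
        else:
            # Non-feature word: the smaller second weight.
            weight = second_weight
        vec = vectors[word]
        if total is None:
            total = [0.0] * len(vec)
        for i, x in enumerate(vec):
            total[i] += weight * x
        weight_sum += weight
    if total is None:
        raise ValueError("no word in the text has a vector")
    return [x / weight_sum for x in total]


if __name__ == "__main__":
    # Toy data, made up purely for illustration.
    vectors = {"goal": [1.0, 0.0], "today": [0.0, 1.0]}
    chi2 = {"goal": chi_square(10, 0, 0, 10), "today": chi_square(5, 5, 5, 5)}
    freq = {"goal": 1.0}
    print(sentence_vector(["goal", "today"], vectors, chi2, freq))
```

In this toy case "goal" is treated as a feature word and receives three times the weight of "today", so the resulting sentence vector leans toward the dimension that "goal" occupies. In practice `base` must exceed `second_weight` so that the first weight stays greater than the second, as the abstract requires.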