Conference: IEEE International Conference on Signal Processing

Discrimination of Chinese quantitative style features based on text clustering



Abstract

The styles of "News Broadcast" and "Qiang Qiang Conversation between Three Individuals" differ: the former is broadcast speech, while the latter is conversational. This paper collects a corpus from both programs and selects sentence length, word length, and the part of speech (POS) of the sentence-initial word as the features used to generate text vectors. The texts are then clustered using Euclidean distance and Ward's algorithm. The analysis shows that sentence length, word length, and sentence-initial word POS can serve as quantitative stylistic features of Chinese.
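The pipeline the abstract describes (text vectors built from the three features, then Ward clustering on Euclidean distance) might be sketched as below. This is a minimal illustration, not the authors' implementation. It assumes the texts are already word-segmented and POS-tagged, that each feature is a per-text mean or proportion, and that no feature scaling is applied. The function names, the two-tag set in `POS_TAGS`, and the four toy texts are all invented for the example.

```python
from itertools import combinations

# Hypothetical tag set for the sentence-initial POS feature (assumption).
POS_TAGS = ["n", "r"]

def text_vector(sentences, pos_tags):
    """Build one text vector; `sentences` is a list of [(word, pos), ...] lists."""
    n_sent = len(sentences)
    words = [w for s in sentences for w, _ in s]
    mean_sent_len = len(words) / n_sent                      # words per sentence
    mean_word_len = sum(len(w) for w in words) / len(words)  # characters per word
    initial = [s[0][1] for s in sentences]                   # POS of first word
    pos_share = [initial.count(t) / n_sent for t in pos_tags]
    return [mean_sent_len, mean_word_len] + pos_share

def ward_cost(a, b):
    """Ward merge cost: size-weighted squared Euclidean distance of centroids."""
    (na, ca), (nb, cb) = a, b
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq_dist

def ward_cluster(vectors, k):
    """Merge the cheapest pair of clusters repeatedly until k clusters remain."""
    # Each cluster maps its member indices to (size, centroid).
    clusters = {(i,): (1, list(v)) for i, v in enumerate(vectors)}
    while len(clusters) > k:
        p, q = min(combinations(clusters, 2),
                   key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]))
        (na, ca), (nb, cb) = clusters.pop(p), clusters.pop(q)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[tuple(sorted(p + q))] = (na + nb, centroid)
    return sorted(clusters)

# Toy data, invented for illustration only (not from the paper's corpus).
news_a = [[("政府", "n"), ("今天", "t"), ("发布", "v"), ("经济", "n"), ("报告", "n")],
          [("代表团", "n"), ("访问", "v"), ("北京", "ns"), ("大学", "n")]]
news_b = [[("会议", "n"), ("讨论", "v"), ("农业", "n"), ("发展", "vn"), ("问题", "n")],
          [("专家", "n"), ("介绍", "v"), ("有关", "v"), ("情况", "n")]]
talk_a = [[("我", "r"), ("觉得", "v"), ("好", "a")],
          [("你", "r"), ("说", "v")]]
talk_b = [[("他", "r"), ("不", "d"), ("去", "v")],
          [("我", "r"), ("知道", "v")]]

vectors = [text_vector(t, POS_TAGS) for t in (news_a, news_b, talk_a, talk_b)]
groups = ward_cluster(vectors, 2)
print(groups)
```

On this toy input the two broadcast-style texts (indices 0 and 1) and the two conversational texts (indices 2 and 3) end up in separate clusters. With real corpora the features sit on different scales, so standardizing them before clustering may be worth considering; the abstract does not say whether the authors did so.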
