Asian Language Processing, 2009. IALP '09

Exploring the Effects of Text Clustering on On-Line Military News Based on Quantitative Association Rule

Abstract

Text clustering automatically groups texts by using feature extraction and term connections to calculate the similarity among their subject contents. Because the properties of terms in Chinese text (e.g. segmentation and annotation) are less clear-cut than in other languages, extracting and distinguishing features from Chinese text is considerably more difficult, which strongly affects clustering quality. Focusing on military news, this paper applies a quantitative association rule together with a hierarchical agglomerative algorithm to cluster Chinese news published in Youth Daily News, and compares the results with those of the traditional vector space model approach and the general association rule approach. The F-measure is used as the evaluation metric. The experimental results show that the quantitative association rule approach clusters texts more accurately than both the vector space model and the general association rule approach.
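
To make the evaluation setup concrete, the following is a minimal sketch of hierarchical agglomerative clustering scored with a clustering F-measure. It is not the paper's method: the abstract gives no details of the quantitative association rule step, so the sketch assumes plain term-frequency vectors with cosine similarity (closer to the vector space model baseline), average-link merging, and the common class-weighted definition of the F-measure. The function names, the toy documents, and the labels are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def agglomerative(docs, k):
    """Average-link agglomerative clustering of token lists down to k clusters.

    Assumes the documents are already segmented into terms. Returns a list of
    clusters, each a sorted list of document indices.
    """
    vectors = [Counter(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > k:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[x] for j in clusters[y]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, x, y)
        _, x, y = best
        # Merge the most similar pair of clusters.
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


def f_measure(clusters, labels):
    """Class-weighted F-measure: best F1 over clusters for each true class."""
    n = len(labels)
    total = 0.0
    for cls in set(labels):
        members = {i for i, label in enumerate(labels) if label == cls}
        best = 0.0
        for c in clusters:
            overlap = len(members & set(c))
            if overlap:
                p = overlap / len(c)        # precision of cluster c for this class
                r = overlap / len(members)  # recall of cluster c for this class
                best = max(best, 2 * p * r / (p + r))
        total += len(members) / n * best
    return total


# Hypothetical, pre-segmented toy documents with made-up topic labels.
docs = [
    ["navy", "fleet", "exercise"],
    ["navy", "fleet", "patrol"],
    ["army", "tank", "drill"],
    ["army", "tank", "exercise"],
]
labels = ["navy", "navy", "army", "army"]
clusters = agglomerative(docs, 2)
score = f_measure(clusters, labels)
```

For Chinese news, a word segmenter would have to produce the token lists first, which is where the difficulty described in the abstract enters. Swapping in a different similarity function is the only change needed to compare approaches under the same F-measure.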
