International Conference on Communication and Electronics Systems

Improved clustering technique using metadata for text mining

Abstract

In many text mining applications, a document carries side information, or metadata, alongside its text. Examples of side information include links to other web pages, the document title, the author name, and the date of publication. Such metadata can hold a great deal of information useful for clustering, but it is sometimes noisy, and using it to produce clusters without filtering can degrade cluster quality. We therefore use an efficient feature selection method during mining to select the side information that is useful for clustering, so as to maximize the benefit of using it. The proposed technique uses two-mode clustering, a data mining technique that forms groups by clustering both the text and the side information.
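
The filtering step can be illustrated with a small sketch. The abstract does not name the feature selection criterion, so the code below assumes a Gini-style purity score computed against an initial text-only clustering: a side attribute is kept only if the documents carrying it concentrate in few clusters. The function names, the `threshold` and `min_docs` parameters, and the toy data are all hypothetical and are not taken from the paper; the two-mode clustering step itself is not shown.

```python
from collections import Counter, defaultdict


def gini_index(cluster_counts):
    """Sum of squared cluster shares for one attribute.

    Returns 1.0 when every document with the attribute falls in one
    cluster, and 1/k when they are spread evenly over k clusters.
    """
    total = sum(cluster_counts.values())
    return sum((count / total) ** 2 for count in cluster_counts.values())


def select_side_attributes(doc_attrs, doc_clusters, threshold=0.8, min_docs=2):
    """Keep side attributes that agree with the current clustering.

    doc_attrs    -- one set of side-information attributes per document
    doc_clusters -- cluster label per document (assumed to come from a
                    text-only clustering pass)
    threshold    -- assumed cutoff on the purity score
    min_docs     -- ignore attributes seen in fewer documents than this
    """
    counts = defaultdict(Counter)
    for attrs, cluster in zip(doc_attrs, doc_clusters):
        for attr in set(attrs):
            counts[attr][cluster] += 1
    return {
        attr
        for attr, per_cluster in counts.items()
        if sum(per_cluster.values()) >= min_docs
        and gini_index(per_cluster) >= threshold
    }


# Toy data: author names line up with the clusters, the year does not.
doc_attrs = [
    {"author:smith", "year:2016"},
    {"author:smith", "year:2016"},
    {"author:lee", "year:2016"},
    {"author:lee", "year:2016"},
]
doc_clusters = [0, 0, 1, 1]

kept = select_side_attributes(doc_attrs, doc_clusters)
print(sorted(kept))  # ['author:lee', 'author:smith']
```

Here `year:2016` is spread evenly over both clusters (score 0.5) and is dropped as noise, while each author attribute falls entirely in one cluster (score 1.0) and is retained for the joint clustering of text and side information.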