
An automatic email mining approach using semantic non-parametric K-Means++ clustering.


Abstract

Email inboxes are now filled with a huge variety of voluminous messages, which worsens the problem of "email overload" and places a financial burden on companies and individuals. Email mining offers a solution to the email overload problem by automatically grouping emails into meaningful groups of similar messages based on email subjects and contents. Existing email mining systems, such as Kernel-Selected clustering and BuzzTrack, do not consider the semantic similarity between email contents; moreover, when a large number of email messages are clustered into a single folder, the problem of email overload remains.

This thesis proposes a system named AEMS for automatic folder and sub-folder creation, indexing of the created folders with a link to each folder and sub-folder, and Apriori-based folder summarization containing important keywords from the folder. The thesis aims to solve the email overload problem through semantic restructuring of emails. In the AEMS model, a novel approach named Semantic Non-parametric K-Means++ clustering is proposed for folder creation, which avoids (1) random seed selection, by selecting the seed according to email weights, and (2) a pre-defined number of clusters, by using the similarity between the email contents. Experiments on large volumes of email data show the effectiveness and efficiency of the proposed techniques.

Keywords: Email Mining, Email Overload, Email Management, Data Mining, Clustering, Feature Selection, Folder Summarization.
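The two ideas named in the abstract (weight-based seed selection and a cluster count that emerges from content similarity) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the weight definition (sum of document frequencies of an email's terms), the plain bag-of-words cosine similarity standing in for semantic similarity, the `threshold` parameter, and the function names `cosine` and `cluster_emails` are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_emails(emails, threshold=0.3):
    """Group tokenized emails without fixing the number of clusters.

    emails: list of token lists. Returns a list of clusters, each a
    list of email indices. The weighting and threshold rule are
    illustrative assumptions, not the method from the thesis.
    """
    vectors = [Counter(tokens) for tokens in emails]

    # Assumed weight: an email scores higher when its terms occur in
    # many emails, so the first seed is chosen deterministically
    # instead of at random.
    doc_freq = Counter(term for vec in vectors for term in vec)
    weights = [sum(doc_freq[t] for t in vec) for vec in vectors]
    order = sorted(range(len(emails)), key=lambda i: -weights[i])

    clusters, centroids = [], []
    for i in order:
        sims = [cosine(vectors[i], c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            # Similar enough: join the closest cluster and update it.
            clusters[best].append(i)
            centroids[best].update(vectors[i])
        else:
            # Nothing similar enough: this email seeds a new cluster.
            clusters.append([i])
            centroids.append(Counter(vectors[i]))
    return clusters


emails = [
    ["meeting", "project", "deadline"],
    ["project", "deadline", "report"],
    ["sale", "discount", "offer"],
    ["discount", "offer", "coupon"],
]
print(cluster_emails(emails))  # [[0, 1], [2, 3]]
```

In this sketch the similarity threshold replaces the usual `k` parameter: raising it yields more, smaller clusters, and lowering it yields fewer, larger ones.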


Bibliographic record

Author: Soni, Gunjan
Author affiliation: University of Windsor (Canada)
Degree-granting institution: University of Windsor (Canada)
Subject: Information Technology; Computer Science
Degree: M.Sc.
Year: 2013
Pages: 104 p.
Total pages: 104
Original format: PDF
Language: English

Similar literature

1. Automatically detecting feature requests from development emails by leveraging semantic sequence mining [J]. Shi Lin, Chen Celia, Wang Qing. Requirements Engineering, 2021, No. 2.
2. Automatic generation of semantically enriched web pages by a text mining approach [J]. Hsin-Chang Yang. Expert Systems with Applications, 2009, No. 6.
3. A method for automatic construction of learning contents in semantic web by a text mining approach [J]. Hsin-Chang Yang. International Journal of Knowledge and Learning, 2006, No. 1a2.
4. An Automatic Email Management Approach Using Data Mining Techniques [C]. Gunjan Soni, C.I. Ezeife. International Conference on Data Warehousing and Knowledge Discovery, 2013.
5. Meaning for the masses: Theory and applications for Semantic Web and Semantic email systems [D]. McDowell, Luke K. 2004.
6. Automatic extraction of semantic relations between medical entities: a rule based approach [O]. Asma Ben Abacha, Pierre Zweigenbaum. 2011.
7. An automatic email mining approach using semantic non-parametric K-Means++ clustering [O]. Soni Gunjan. 2013.
8. Science and Technology Text Mining: Origins of Database Tomography and Multi-Word Phrase Clustering [R]. Kostoff, R. N. 2003.
