Venue: IEEE International Conference on Data Mining Workshops

'The 100 Most Influential Persons in History': A Data Mining Perspective

Abstract

Data mining has been widely applied in various domains; however, few studies have sought to discover hidden knowledge from factual data about selected groups of people with special characteristics. It is important to mine data about such groups of individuals to extract insights that could lead to a better understanding of their personalities and support further sociological conclusions. This paper presents the application and outcome of two data mining techniques, data clustering and association rule extraction, to find common features and relations among social, environmental, and socioeconomic factors in the lives of known influential individuals in history. First, a dataset was constructed by defining, extracting, and retrieving important known facts about these individuals from selected, reliable sources. Second, association rule discovery algorithms were applied to reveal interesting patterns and highlight relations between attributes. Finally, the data were clustered into groups, and each cluster was further analyzed to identify its most strongly defining attributes. The extracted association rules showed how some factors are related, such as the effect of environment type and birth order in the family on the age at which the individual first engaged with their domain of influence. The clustering exercise demonstrated that influential people who grew up in families of similar size and financial status share many characteristics.
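
To make the association rule step concrete, the following is a minimal sketch of how support and confidence could be computed over categorical records. The attribute names, values, thresholds, and the `mine_rules` helper are all hypothetical and chosen only for illustration; they are not the paper's dataset, algorithm, or results, and the clustering step is not covered here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: attribute names and values are invented for
# illustration and are NOT taken from the paper's dataset.
RECORDS = [
    {"environment": "urban", "birth_order": "first", "start_age": "early"},
    {"environment": "urban", "birth_order": "first", "start_age": "early"},
    {"environment": "rural", "birth_order": "later", "start_age": "late"},
    {"environment": "urban", "birth_order": "later", "start_age": "early"},
    {"environment": "rural", "birth_order": "first", "start_age": "late"},
]


def mine_rules(records, min_support=0.4, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) tuples.

    Only single-item antecedents and consequents are considered, which is
    a simplification of full Apriori-style rule mining.
    """
    n = len(records)
    item_counts = Counter()
    pair_counts = Counter()
    for rec in records:
        # Each (attribute, value) pair is treated as one item.
        items = sorted(rec.items())
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of records containing both items
        if support < min_support:
            continue
        # Try the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


if __name__ == "__main__":
    for lhs, rhs, support, confidence in mine_rules(RECORDS):
        print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

A rule is kept only when the two items co-occur often enough (support) and the consequent follows the antecedent reliably enough (confidence). Real studies typically use a dedicated implementation that also handles multi-item antecedents.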