Published in: IEEE Global Communications Conference

DNS Typo-Squatting Domain Detection: A Data Analytics Machine Learning Based Approach

Abstract

The Domain Name System (DNS) is a crucial component of current IP-based networks, as it is the standard mechanism for name-to-IP resolution. However, because it lacks data integrity and origin authentication processes, it is vulnerable to a variety of attacks. One such attack is typosquatting. Detecting this attack is particularly important, as it can threaten corporate secrets and can be used to steal information or commit fraud. In this paper, a machine learning-based approach is proposed to tackle the typosquatting vulnerability. To that end, exploratory data analytics is first used to better understand the trends observed in eight features extracted from domain names. Furthermore, a majority voting-based ensemble learning classifier built from five classification algorithms is proposed that can detect suspicious domains with high accuracy. Moreover, the observed trends are validated by studying the same features in an unlabeled dataset, both with the K-means clustering algorithm and by applying the developed ensemble learning classifier. Results show that legitimate domains have shorter domain names and fewer unique characters. Moreover, the developed ensemble learning classifier performs better in terms of accuracy, precision, and F-score. Furthermore, similar trends are observed when clustering is used; however, the number of domains that clustering identifies as potentially suspicious is high. Hence, the ensemble learning classifier is applied, and the results show that the number of domains identified as potentially suspicious is reduced by almost a factor of five while the same trends in the features' statistics are maintained.
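To make the pipeline in the abstract concrete, the following is a minimal sketch of name-based feature extraction followed by hard majority voting. It is not the authors' implementation. The abstract names only two of the eight features (domain name length and number of unique characters) and does not name the five classification algorithms, so the five threshold rules below, their cut-off values, and the names `extract_features`, `majority_vote`, and `PLACEHOLDER_CLASSIFIERS` are all assumptions made purely for illustration.

```python
from collections import Counter
from typing import Callable, Dict, List

Features = Dict[str, int]
# A classifier maps features to a label: 1 = suspicious, 0 = legitimate.
Classifier = Callable[[Features], int]


def extract_features(domain: str) -> Features:
    """Compute the two name-based features that the abstract mentions.

    The paper uses eight features; the other six are not listed in the
    abstract, so they are omitted here rather than guessed.
    """
    name = domain.lower().rstrip(".")
    return {
        "length": len(name),             # total characters, dots included
        "unique_chars": len(set(name)),  # number of distinct characters
    }


def majority_vote(classifiers: List[Classifier], features: Features) -> int:
    """Return the label predicted by most classifiers (hard voting).

    With binary labels and an odd number of classifiers there are no ties.
    """
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for five trained models. The thresholds are
# invented for this sketch and carry no empirical meaning.
PLACEHOLDER_CLASSIFIERS: List[Classifier] = [
    lambda f: int(f["length"] > 15),
    lambda f: int(f["unique_chars"] > 10),
    lambda f: int(f["length"] > 20),
    lambda f: int(f["unique_chars"] > 12),
    lambda f: int(f["length"] + f["unique_chars"] > 28),
]


def is_suspicious(domain: str) -> bool:
    """Classify a domain with the placeholder ensemble."""
    return majority_vote(PLACEHOLDER_CLASSIFIERS, extract_features(domain)) == 1


if __name__ == "__main__":
    for d in ("example.com", "examp1e-login-secure.com"):
        print(d, extract_features(d), is_suspicious(d))
```

In practice each placeholder rule would be replaced by a model trained on labeled domains. The voting step stays the same: every model casts one vote and the most common label wins.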
