Conference: IEEE conference on computer communications

Do we need a perfect ground-truth for benchmarking Internet traffic classifiers?

Abstract

The classification of Internet traffic using supervised or semi-supervised statistical learning techniques, both for anomaly detection and identification of Internet applications, has been impaired by difficulties in obtaining a reliable ground-truth, required both to train the classifier and to evaluate its performance. A perfect ground-truth is increasingly difficult, or sometimes impossible, to obtain due to the growing percentage of cyphered traffic, the sophistication of network attacks, and the constant updates of Internet applications. In this paper, we study the impact of the ground-truth on training the classifier and estimating its performance measures. We show both theoretically and through simulation that ground-truth imperfections can severely bias the performance estimates. We then propose a latent class model that overcomes this problem by combining estimates of several classifiers over the same dataset. The model is evaluated using a high-quality dataset that includes the most representative Internet applications and network attacks. The results show that our latent class model produces very good performance estimates under mild levels of ground-truth imperfection, and can thus be used to correctly benchmark Internet traffic classifiers when only an imperfect ground-truth is available.
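The claim that an imperfect ground-truth biases performance estimates can be illustrated with a small sketch. This is not the paper's latent class model or its simulation setup; it is a minimal example under simplifying assumptions chosen here for illustration: two balanced classes, a classifier with a fixed true accuracy, and a reference labelling whose errors are independent of the classifier's errors. The function names and parameter values are hypothetical. Under these assumptions, the expected agreement between classifier and reference is `true_acc * (1 - gt_error) + (1 - true_acc) * gt_error`, which differs from the true accuracy whenever `gt_error > 0`.

```python
import random


def apparent_accuracy(true_acc, gt_error):
    """Expected agreement between a binary classifier and a noisy reference.

    Assumes (for illustration only) that classifier errors and reference
    labelling errors are independent of each other.
    """
    return true_acc * (1 - gt_error) + (1 - true_acc) * gt_error


def simulate(true_acc, gt_error, n=100_000, seed=0):
    """Monte Carlo estimate of the accuracy measured against a noisy reference."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n):
        truth = rng.random() < 0.5                 # hidden true class
        # Classifier is right with probability true_acc.
        pred = truth if rng.random() < true_acc else not truth
        # Reference label is wrong with probability gt_error.
        ref = truth if rng.random() >= gt_error else not truth
        agree += pred == ref                       # what a benchmark would count
    return agree / n


if __name__ == "__main__":
    for err in (0.0, 0.05, 0.10, 0.20):
        print(f"gt_error={err:.2f}  "
              f"expected={apparent_accuracy(0.95, err):.3f}  "
              f"simulated={simulate(0.95, err):.3f}")
```

In this toy setting, a classifier with 95% true accuracy scored against a reference with 10% label errors has an expected measured accuracy of 0.86, so the benchmark understates its performance even though the classifier itself is unchanged.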