
Boundary sampling to boost mutation testing for deep learning models


Abstract

Context: The widespread application of Deep Learning (DL) models has raised concerns about their reliability. Because DL follows a data-driven programming paradigm, the quality of the test dataset is critical to assessing a DL model accurately. Researchers have recently introduced mutation testing into DL testing: mutation operators are applied to a DL model to generate mutants, and the quality of the test dataset is judged by whether the test data can identify those mutants. However, several factors (e.g., heavy labeling effort and high running cost) still hinder the adoption of mutation testing for DL models.

Objective: We seek an approach that selects a smaller, sensitive, representative and efficient subset of the whole test dataset in order to make current mutation testing for DL models more practical (e.g., by reducing labeling and running cost).

Method: We propose boundary sample selection (BSS), which uses the distance of samples to the decision boundary of a DL model as the indicator for constructing an appropriate subset. To evaluate BSS, we conduct an extensive empirical study with two widely used datasets, three popular DL models, and 14 up-to-date DL mutation operators.

Results: We observe that (1) the subsets generated by BSS are much smaller than the whole test set (about 3%-20% of its size); (2) under most mutation operators, the subsets outperform the whole test sets in observing mutation effects (by about 9.94-21.63); (3) in terms of mutation score, the subsets can replace the whole test sets to a very high degree (higher than 97%); and (4) the MRR values of the proposed subsets are clearly better (about 2.28-13.19 times higher) than those of the whole test sets.

Conclusions: The results show that BSS can help testers save labeling cost, run mutation testing quickly, and identify killed mutants early.
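The abstract does not say how the distance to the decision boundary is computed, so the following is only a minimal sketch of the selection idea under stated assumptions. It uses the gap between the two highest predicted class probabilities as a stand-in for boundary distance (a common proxy, not necessarily the paper's metric), and a simplified "killed mutant" check in which a mutant counts as killed if it changes the prediction on at least one selected sample that the original model classified correctly. The function names `boundary_margin`, `select_boundary_subset` and `killed_fraction`, and the `fraction` parameter, are hypothetical.

```python
import numpy as np


def boundary_margin(probs):
    """Top-1 minus top-2 class probability per sample.

    ASSUMPTION: a small margin is used as a proxy for "close to the
    decision boundary"; the paper's actual distance measure may differ.
    """
    top2 = np.sort(probs, axis=1)[:, -2:]  # two largest values per row, ascending
    return top2[:, 1] - top2[:, 0]


def select_boundary_subset(probs, fraction=0.1):
    """Indices of the `fraction` of samples with the smallest margin."""
    margins = boundary_margin(probs)
    k = max(1, int(round(fraction * len(margins))))
    return np.argsort(margins, kind="stable")[:k]


def killed_fraction(original_pred, mutant_preds, labels, subset):
    """Fraction of mutants killed by the samples in `subset`.

    ASSUMPTION (simplified criterion): a mutant is killed if it changes the
    prediction on at least one subset sample that the original model
    classified correctly.
    """
    correct = original_pred[subset] == labels[subset]
    killed = [
        bool(np.any(correct & (m[subset] != original_pred[subset])))
        for m in mutant_preds
    ]
    return float(np.mean(killed))


if __name__ == "__main__":
    # Toy softmax outputs for four samples and three classes.
    probs = np.array([
        [0.90, 0.05, 0.05],
        [0.50, 0.40, 0.10],
        [0.60, 0.30, 0.10],
        [0.35, 0.34, 0.31],
    ])
    subset = select_boundary_subset(probs, fraction=0.5)
    print("selected sample indices:", subset.tolist())
```

Only the samples returned by `select_boundary_subset` would need to be labeled and run against each mutant, which is where the labeling and running-cost savings described in the abstract would come from.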

Bibliographic record

  • Source
    Information and Software Technology, 2021, Issue 2, pp. 106413.1-106413.16 (16 pages)
  • Author affiliations

    Nanjing Univ State Key Lab Novel Software Technol Nanjing 210023 Peoples R China|Nanjing Univ Software Inst Nanjing 210023 Peoples R China;

    Nanjing Univ State Key Lab Novel Software Technol Nanjing 210023 Peoples R China|Nanjing Univ Dept Comp Sci & Technol Nanjing 210023 Peoples R China;

    Nanjing Univ State Key Lab Novel Software Technol Nanjing 210023 Peoples R China|Nanjing Univ Dept Comp Sci & Technol Nanjing 210023 Peoples R China;

    Nanjing Univ State Key Lab Novel Software Technol Nanjing 210023 Peoples R China|Nanjing Univ Dept Comp Sci & Technol Nanjing 210023 Peoples R China;

    Momenta Nantiancheng Rd Suzhou Peoples R China;

    Nanjing Univ State Key Lab Novel Software Technol Nanjing 210023 Peoples R China|Nanjing Univ Dept Comp Sci & Technol Nanjing 210023 Peoples R China;

    Nanjing Univ State Key Lab Novel Software Technol Nanjing 210023 Peoples R China|Nanjing Univ Dept Comp Sci & Technol Nanjing 210023 Peoples R China;

  • Original format: PDF
  • Language of text: English
  • Keywords

    Software testing; Deep learning; Mutation testing; Boundary; Neural network;

