Journal: Empirical Software Engineering

Automated demarcation of requirements in textual specifications: a machine learning-based approach


Abstract

A simple but important task during the analysis of a textual requirements specification is to determine which statements in the specification represent requirements. In principle, by following suitable writing and markup conventions, one can provide an immediate and unequivocal demarcation of requirements at the time a specification is being developed. However, neither the presence nor a fully accurate enforcement of such conventions is guaranteed. The result is that, in many practical situations, analysts end up resorting to after-the-fact reviews for sifting requirements from other material in a requirements specification. This is both tedious and time-consuming. We propose an automated approach for demarcating requirements in free-form requirements specifications. The approach, which is based on machine learning, can be applied to a wide variety of specifications in different domains and with different writing styles. We train and evaluate our approach over an independently labeled dataset comprising 33 industrial requirements specifications. Over this dataset, our approach yields an average precision of 81.2% and an average recall of 95.7%. Compared to simple baselines that demarcate requirements based on the presence of modal verbs and identifiers, our approach leads to an average gain of 16.4% in precision and 25.5% in recall. We collect and analyze expert feedback on the demarcations produced by our approach for industrial requirements specifications. The results indicate that experts find our approach useful and efficient in practice. We developed a prototype tool, named DemaRQ, in support of our approach. To facilitate replication, we make available to the research community this prototype tool alongside the non-proprietary portion of our training data.
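The abstract mentions simple baselines that flag a statement as a requirement when it contains a modal verb or an identifier, and it reports results as precision and recall. The sketch below is one possible reading of such a baseline, not the authors' implementation and not DemaRQ. The modal-verb list, the identifier pattern, the function names, and the sample statements are all assumptions made for illustration; none of them come from the paper.

```python
import re

# Assumed modal verbs; the abstract does not list the ones its baselines use.
MODALS = {"shall", "must", "should", "will", "may"}

# Assumed identifier shape such as "REQ-7" or "R3.1"; purely illustrative.
ID_PATTERN = re.compile(r"^\s*(?:REQ|R)[-_ ]?\d+(?:\.\d+)*\b", re.IGNORECASE)


def baseline_is_requirement(statement: str) -> bool:
    """Flag a statement if it contains a modal verb or starts with an identifier."""
    words = set(re.findall(r"[a-z]+", statement.lower()))
    return bool(words & MODALS) or ID_PATTERN.match(statement) is not None


def precision_recall(predicted, actual):
    """Compute precision and recall for boolean predictions against boolean labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # flagged, and truly a requirement
    fp = sum(1 for p, a in pairs if p and not a)    # flagged, but not a requirement
    fn = sum(1 for p, a in pairs if a and not p)    # missed requirement
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical statements with hand-assigned labels (True = requirement).
SAMPLE = [
    ("The system shall log every login attempt.", True),
    ("REQ-7 Passwords are stored as salted hashes.", True),
    ("This section may be skipped by readers familiar with the domain.", False),
    ("The operator is notified when the battery level drops below 10%.", True),
    ("Chapter 2 gives an overview of the architecture.", False),
]

predictions = [baseline_is_requirement(text) for text, _ in SAMPLE]
labels = [label for _, label in SAMPLE]
precision, recall = precision_recall(predictions, labels)
```

On these made-up statements the heuristic wrongly flags the sentence containing "may" and misses the requirement written without a modal verb or identifier. Those are the two kinds of error, false positives and false negatives, that precision and recall measure.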
