Structural information based term weighting in text retrieval for feature location

Abstract

Many recent feature location techniques (FLTs) apply text retrieval (TR) techniques to corpora built from text embedded in source code. Term weighting is a standard preprocessing step in TR and is used to adjust the importance of a term within a document or corpus. Common term weighting schemes such as tf-idf may not be optimal for use with source code, because they originate from a natural language context and were designed for use with unstructured documents. In this paper we propose a new approach to term weighting in which term weights are assigned using the structural information from the source code. We then evaluate the proposed approach by conducting an empirical study of a TR-based FLT. In all, we study over 400 bugs and features from five open source Java systems and find that structural term weighting can cause a statistically significant improvement in the accuracy of the FLT.
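
The abstract does not say which structural locations are distinguished or how their weights are chosen. As a minimal sketch of the general idea only, the Python below scales each term occurrence by a multiplier tied to where the term appears in a method. The location names (`method_name`, `parameter`, `comment`, `body_identifier`), the multiplier values, and the helper `weighted_term_counts` are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical multipliers per structural location; the abstract does not
# specify which locations or values the authors use.
STRUCTURAL_WEIGHTS = {
    "method_name": 4,
    "parameter": 2,
    "comment": 1,
    "body_identifier": 1,
}


def weighted_term_counts(terms_by_location, weights=STRUCTURAL_WEIGHTS):
    """Return term counts where each occurrence is scaled by the weight
    of the structural location it came from."""
    counts = Counter()
    for location, terms in terms_by_location.items():
        factor = weights.get(location, 1)  # unknown locations keep weight 1
        for term in terms:
            counts[term.lower()] += factor
    return counts


# One method, already split into terms by where they occur in the code.
method_doc = {
    "method_name": ["save", "file"],
    "parameter": ["file", "path"],
    "comment": ["write", "buffer", "to", "disk"],
    "body_identifier": ["buffer", "stream", "file"],
}

counts = weighted_term_counts(method_doc)
print(counts.most_common(3))
```

Counts weighted this way could stand in for raw term frequencies when building the corpus for a TR model, so that a term in a method name contributes more than the same term in the method body. Whether that matches the scheme evaluated in the paper is an assumption.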

Bibliographic details

Source: International Conference on Program Comprehension, 2013, 9 pages
Authors: Bassett, Blake; Kraft, Nicholas A.
Original format: PDF
Chinese Library Classification: Software engineering
Keywords: Program comprehension; feature location; latent Dirichlet allocation; static analysis; text retrieval
