Mucommander text overlap

#Mucommander text overlap manual
#Mucommander text overlap software

Sixty-nine percent of these novel FLTs are evaluated through standard empirical methods but, of those, only 9% use baseline technique(s) in their evaluations to allow cross comparison with other techniques. Results of the systematic review indicate that 95% of the articles studied are directed towards novelty, in that they propose a novel FLT.

This article presents a systematic review of 170 FLT articles, published between the years 20. As FLTs evolve and more novel FLTs are introduced, it is important to perform comparison studies to investigate “Which are the best FLTs?” However, an initial reading of the literature suggests that performing such comparisons would be an arduous process, based on the large number of techniques to be compared, the heterogeneous nature of the empirical designs, and the lack of transparency in the literature.

#Mucommander text overlap software

It plays a key role in many software maintenance tasks and a wide variety of Feature Location Techniques (FLTs), which rely on source code structure or textual analysis, have been proposed by researchers. Finally, the results suggest that the performance of FLTs partially depends on system/benchmark characteristics, in addition to the FLTs themselves.įeature location (FL) is the task of finding the source code that implements a specific, user-observable functionality in a software system.

By presenting the relative performances of baseline techniques this paper facilitates empirical cross-comparison of existing and future FLTs. Results of the case studies suggest that different baseline techniques perform differently and that VSM-Lucene and LSI-Matlab performed better than other implementations. These baseline techniques are assessed in twelve case studies to rank their performance. This paper moves towards standardizing FLT comparability by assessing eight baseline techniques in an empirical design that addresses these confounding factors. But evaluation across FLTs is confounded by empirical designs that incorporate different FL goals and evaluation criteria. In order to relate the performance of FLTs compared against different baseline techniques, these compare-to techniques should be evaluated against each other. To compare FLTs, an open, standard set of non-subjective, reproducible “compare-to” FLT techniques (baseline techniques) should be used for evaluation. Considering its key role in software maintenance, a vast array of automated and semi-automated Feature Location Techniques (FLTs) have been proposed.

#Mucommander text overlap manual

We found that (1) LDA and LDA-based techniques are the most frequent topic modeling techniques, (2) developer communication and bug reports have been modelled most, (3) data pre-processing and modeling parameters vary quite a bit and are often vaguely reported, and (4) manual topic naming (such as deducting names based on frequent words in a topic) is common.įeature Location (FL) aims to locate observable functionalities in source code.

We analyzed topic modeling as applied in 111 papers from ten highly-ranked software engineering venues (five journals and five conferences) published between 20. Our study aims at describing how topic modeling has been applied in software engineering research with a focus on four aspects: (1) which topic models and modeling techniques have been applied, (2) which textual inputs have been used for topic modeling, (3) how textual data was “prepared” (i.e., pre-processed) for topic modeling, and (4) how generated topics (i.e., word clusters) were named to give them a human-understandable meaning. Topic modeling needs to be applied carefully (e.g., depending on the type of textual data analyzed and modeling parameters). In software engineering, topic modeling has been used to analyze textual data in empirical studies (e.g., to find out what developers talk about online), but also to build new techniques to support software engineering tasks (e.g., to support source code comprehension). Topic modeling using models such as Latent Dirichlet Allocation (LDA) is a text mining technique to extract human-readable semantic “topics” (i.e., word clusters) from a corpus of textual documents.