On the Reduction of Verbose Queries in Text Retrieval Based Software Maintenance

Oscar Chaparro and Andrian Marcus

Proceedings of the 38th IEEE/ACM International Conference on Software Engineering (ICSE'16), NIER track, pp. 716–718, 2016

[PDF] [Replication package]

Abstract: We argue that verbose queries used for software retrieval contain many terms that follow specific discourse rules, yet hinder retrieval. We report the results of an empirical study on the effect of removing such terms from verbose queries in the context of Text Retrieval-based concept location. In the study, we remove terms from 424 queries, generated from bug reports of nine open source systems. Removing the terms leads to substantial improvement in retrieval: 73% of the queries are improved, leading to gains of 21.8% and 13.4% in terms of MRR and MAP, respectively. This improvement is larger than that of many more sophisticated state-of-the-art approaches. The results show promise, and the future challenge lies in automatically identifying the terms to be removed from the verbose queries.
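The abstract reports gains in MRR (mean reciprocal rank) and MAP (mean average precision). As a rough illustration of how such a before/after comparison could be computed, the following is a minimal sketch using the standard textbook definitions of both metrics. The query IDs, file names, rankings, and the `relative_gain` helper are all hypothetical, and the sketch assumes the reported gains are relative improvements; it is not the authors' evaluation code or data.

```python
from typing import Dict, List, Set


def reciprocal_rank(ranking: List[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant document, or 0.0 if none is retrieved."""
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def evaluate(rankings: Dict[str, List[str]], gold: Dict[str, Set[str]]) -> Dict[str, float]:
    """Compute MRR and MAP over all queries in `rankings`."""
    rr = [reciprocal_rank(rankings[q], gold[q]) for q in rankings]
    ap = [average_precision(rankings[q], gold[q]) for q in rankings]
    count = len(rankings)
    return {
        "MRR": sum(rr) / count if count else 0.0,
        "MAP": sum(ap) / count if count else 0.0,
    }


def relative_gain(before: float, after: float) -> float:
    """Relative improvement in percent (assumed interpretation of 'gain')."""
    return (after - before) / before * 100.0


# Hypothetical toy data: two bug-report queries and the files that fix each bug.
gold = {"bug-1": {"A.java"}, "bug-2": {"B.java", "C.java"}}

# Hypothetical rankings returned for the full (verbose) queries.
verbose = {
    "bug-1": ["X.java", "A.java", "Y.java"],
    "bug-2": ["X.java", "B.java", "Y.java", "C.java"],
}

# Hypothetical rankings returned after removing the hindering terms.
reduced = {
    "bug-1": ["A.java", "X.java", "Y.java"],
    "bug-2": ["B.java", "X.java", "C.java", "Y.java"],
}

before = evaluate(verbose, gold)
after = evaluate(reduced, gold)
for metric in ("MRR", "MAP"):
    print(metric, before[metric], after[metric], relative_gain(before[metric], after[metric]))
```

In this toy setup a query counts as improved when its relevant files move up the ranking; the study itself may use a different criterion for deciding whether a reduced query is an improvement.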