Paper Title
Data Mining Behavioral Approach To Reduce The Data Set For Debugging

Software companies spend a large share of their costs on dealing with software bugs. Bugs are unavoidable, and fixing them is expensive. Automatic bug triage applies text classification techniques to reduce the time cost of manual work. In this paper, we address the problem of data reduction for bug triage, i.e., how to reduce the scale of bug report data and improve its quality. To reduce the data along both the bug dimension and the word dimension, we combine instance selection with feature selection. To decide the order in which to apply instance selection and feature selection, we extract attributes from historical bug data sets and build a predictive model on the resulting data set. We evaluate the data reduction on bug reports from two large open source projects, Eclipse and Mozilla. The results show that data reduction effectively reduces the data scale and improves the accuracy of bug triage.

Index Terms: Bug triage, bug data reduction, instance selection, feature selection, software repositories.
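As an illustration only, the idea of reducing bug data along both the word dimension and the bug dimension can be sketched as below. The abstract does not name the concrete algorithms, so everything here is an assumption: feature selection is stood in for by a simple document-frequency filter, instance selection by dropping empty and duplicate reports, and the function names, the `keep` parameter, and the order labels `"FS->IS"` and `"IS->FS"` are hypothetical.

```python
from collections import Counter


def select_features(reports, keep):
    """Feature selection stand-in (assumption): keep the `keep` words
    that appear in the largest number of bug reports."""
    doc_freq = Counter(word for words, _ in reports for word in set(words))
    # Rank by descending document frequency; break ties alphabetically.
    ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))
    vocab = set(ranked[:keep])
    return [([w for w in words if w in vocab], dev) for words, dev in reports]


def select_instances(reports):
    """Instance selection stand-in (assumption): drop reports that are
    empty or that duplicate an earlier report with the same developer."""
    seen = set()
    kept = []
    for words, dev in reports:
        key = (frozenset(words), dev)
        if words and key not in seen:
            seen.add(key)
            kept.append((words, dev))
    return kept


def reduce_bug_data(reports, keep, order="FS->IS"):
    """Apply both reductions in the given order (labels are hypothetical)."""
    if order == "FS->IS":
        return select_instances(select_features(reports, keep))
    if order == "IS->FS":
        return select_features(select_instances(reports), keep)
    raise ValueError(f"unknown reduction order: {order!r}")


# Toy data: each report is (list of words, assigned developer).
reports = [
    (["crash", "on", "startup"], "alice"),
    (["crash", "startup"], "alice"),
    (["ui", "button", "misaligned"], "bob"),
    (["crash", "on", "save"], "alice"),
]

fs_then_is = reduce_bug_data(reports, keep=2, order="FS->IS")
is_then_fs = reduce_bug_data(reports, keep=2, order="IS->FS")
```

On this toy data the two orders give different reduced sets: applying feature selection first leaves two reports, while applying instance selection first leaves four. This is why the choice of reduction order is worth predicting rather than fixing in advance.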