Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach

Balogun, Abdullateef; Basri, Shuib; Abdulkadir, Said Jadid; Sobri, Ahmad Hashim

dc.contributor.author	Balogun, Abdullateef
dc.contributor.author	Basri, Shuib
dc.contributor.author	Abdulkadir, Said Jadid
dc.contributor.author	Sobri, Ahmad Hashim
dc.date.accessioned	2020-01-30T10:41:43Z
dc.date.available	2020-01-30T10:41:43Z
dc.date.issued	2019-07-09
dc.description.abstract	Software Defect Prediction (SDP) models are built using software metrics derived from software systems. The quality of SDP models depends largely on the quality of the software metrics (dataset) used to build them. High dimensionality is one of the data quality problems that affect the performance of SDP models. Feature selection (FS) is a proven method for addressing the dimensionality problem. However, the choice of FS method for SDP is still a problem, as most of the empirical studies on FS methods for SDP produce contradictory and inconsistent quality outcomes. These FS methods behave differently because of their different underlying computational characteristics. This could be due to the choice of search method used in FS, because the impact of FS depends on the choice of search method. It is hence imperative to comparatively analyze the performance of FS methods based on different search methods in SDP. In this paper, four filter feature ranking (FFR) and fourteen filter feature subset selection (FSS) methods were evaluated using four different classifiers over five software defect datasets obtained from the National Aeronautics and Space Administration (NASA) repository. The experimental analysis showed that the application of FS improves the predictive performance of classifiers and that the performance of FS methods can vary across datasets and classifiers. Among the FFR methods, Information Gain demonstrated the greatest improvements in the performance of the prediction models. Among the FSS methods, Consistency Feature Subset Selection based on Best First Search had the best influence on the prediction models. However, prediction models based on FFR proved to be more stable than those based on FSS methods. Hence, we conclude that FS methods improve the performance of SDP models and that there is no single best FS method, as their performance varied according to the dataset and the choice of prediction model. However, we recommend the use of FFR methods, as the prediction models based on FFR are more stable in terms of predictive performance.	en_US
dc.identifier.citation	Balogun, A.O.; Basri, S.; Abdulkadir, S.J.; Hashim, A.S. Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach. Appl. Sci. 2019, 9, 2764.	en_US
dc.identifier.issn	2076-3417
dc.identifier.uri	http://hdl.handle.net/123456789/3588
dc.language.iso	en	en_US
dc.publisher	Multidisciplinary Digital Publishing Institute (MDPI)	en_US
dc.relation.ispartofseries	9;13
dc.subject	Machine Learning	en_US
dc.subject	Software Defect Prediction	en_US
dc.subject	Feature Selection	en_US
dc.subject	Software Quality Assurance	en_US
dc.title	Performance Analysis of Feature Selection Methods in Software Defect Prediction: A Search Method Approach	en_US
dc.type	Article	en_US
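
Illustrative note: the abstract reports that, among the filter feature ranking methods, Information Gain gave the greatest improvement. The sketch below shows the general idea of ranking discrete features by information gain, i.e. the reduction in class-label entropy obtained by conditioning on a feature. It is a minimal illustration under stated assumptions and not the authors' implementation: the function names, the feature names and the toy data are all hypothetical, and real software metrics would first need to be discretized.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """H(labels) - H(labels | feature) for one discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Toy, made-up data (assumption): each row is one module described by
# three already-discretized metrics; labels mark defective modules.
names = ["loc_bin", "complexity_bin", "comment_bin"]
rows = [
    ("high", "high", "low"),
    ("high", "high", "high"),
    ("low", "high", "low"),
    ("low", "low", "high"),
    ("high", "low", "low"),
    ("low", "low", "high"),
]
labels = ["defective", "defective", "defective", "clean", "clean", "clean"]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

A filter feature ranking method of this kind scores each feature independently of any classifier; the top-ranked features would then be passed to the prediction model. In this toy data the complexity feature separates the two classes perfectly, so it is ranked first.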

Files

Original bundle

Name: applsci-09-02764.pdf
Size: 2.81 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.69 KB
Format: Item-specific license agreed upon to submission