Parameter tuning in KNN for software defect prediction: an empirical analysis

Date

2019-10-31

Publisher

Department of Computer Engineering, Universitas Diponegoro, Indonesia.

Abstract

Software Defect Prediction (SDP) provides insights that can help software teams allocate their limited resources when developing software systems. It predicts likely defective modules and helps avoid the pitfalls associated with such modules. However, these insights may be inaccurate and unreliable if the parameters of SDP models are not taken into consideration. In this study, the effect of parameter tuning on the k-nearest neighbor (k-NN) classifier in SDP was investigated. More specifically, the study examined the impact of varying and selecting an optimal k value, the influence of distance weighting, and the impact of distance functions on k-NN. An experiment was designed to investigate this problem in SDP over 6 software defect datasets. The experimental results revealed that the k value should be greater than 1 (the default), as the average RMSE of k-NN when k > 1 (0.2727) is lower than when k = 1 (0.3296). In addition, the predictive performance of k-NN with distance weighting improved by 8.82% and 1.7% based on AUC and accuracy, respectively. In terms of the distance function, k-NN models based on the Dilca distance function performed better than those based on the Euclidean distance function (the default). Hence, we conclude that parameter tuning has a positive effect on the predictive performance of k-NN in SDP.
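
The three parameters named in the abstract (the k value, distance weighting, and the distance function) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation or tooling, which the abstract does not describe. The function names, the inverse-distance weighting scheme, the Manhattan alternative (used here only as a stand-in, since Dilca is not implemented), and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan distance, used here only as an example alternative metric."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=1, weighted=False, distance=euclidean):
    """Classify `query` by a vote among its k nearest training points.

    weighted=False gives each neighbor one vote; weighted=True gives each
    neighbor a vote of 1/distance (an assumed, commonly used scheme).
    """
    # Pair each training label with its distance to the query, keep the k closest.
    neighbors = sorted(
        ((distance(x, query), label) for x, label in zip(train_X, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbors:
        # The small constant avoids division by zero for exact matches.
        votes[label] += 1.0 / (dist + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Made-up toy data: two hypothetical, already-scaled module metrics per row.
train_X = [(0.1, 0.0), (1.0, 0.0), (0.0, 1.1), (5.0, 5.0)]
train_y = ["defective", "clean", "clean", "defective"]
query = (0.0, 0.0)

pred_k1 = knn_predict(train_X, train_y, query, k=1)
pred_k3 = knn_predict(train_X, train_y, query, k=3)
pred_k3_weighted = knn_predict(train_X, train_y, query, k=3, weighted=True)
pred_k3_manhattan = knn_predict(train_X, train_y, query, k=3, distance=manhattan)

print(pred_k1, pred_k3, pred_k3_weighted, pred_k3_manhattan)
```

On this toy data the prediction changes with the settings: a single very close neighbor decides the outcome at k = 1, is outvoted at k = 3 without weighting, and dominates again once votes are weighted by inverse distance. This only shows that the parameters can change the prediction; it says nothing about which setting performs better on real defect datasets.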

Keywords

Software Defect Prediction, Machine Learning

Citation

M. A. Mabayoje, A. O. Balogun, H. A. Jibril, J. O. Atoyebi, H. A. Mojeed, and V. E. Adeyemo, "Parameter tuning in KNN for software defect prediction: an empirical analysis," Jurnal Teknologi dan Sistem Komputer, vol. 7, no. 4, pp. 121-126, 2019. doi: 10.14710/jtsiskom.7.4.2019.121-126
