
Severity-based Software Quality Prediction using Class Imbalanced Data

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2016, 21(4), pp.73-80
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science

Hong Euyseok 1 박미경 1

1 Sungshin Women's University

Accredited

ABSTRACT

Most fault prediction models suffer from a class imbalance problem because training data usually contains many more non-fault modules than fault modules. This imbalanced distribution makes it difficult for the models to learn the minority class. The imbalance is even greater in severity-based fault prediction, because high-severity fault modules are only a subset of the fault modules. In this paper, we propose severity-based models that address these problems using three sampling methods: Resample, SpreadSubSample, and SMOTE. Empirical results show that the Resample method exhibits typical overfitting problems, and the SpreadSubSample method does not improve the prediction performance of the models. Unlike these two methods, SMOTE shows good performance in terms of AUC and FNR values. In particular, the J48 decision tree model using SMOTE outperforms the other prediction models.
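
As a rough illustration of the oversampling idea behind SMOTE and of the FNR metric mentioned above, the following Python sketch generates synthetic minority-class points by interpolating between a sample and one of its nearest minority neighbours. It is a simplified sketch, not the implementation used in the paper: the function names `smote_sketch` and `false_negative_rate`, the parameter defaults, and the example metric vectors are all hypothetical and chosen only for illustration.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic

def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP), with 1 marking the fault (positive) class."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return fn / (fn + tp) if (fn + tp) else 0.0

# Hypothetical metric vectors for three high-severity fault modules.
high_severity = [(1.0, 2.0), (2.0, 3.0), (3.0, 1.0)]
new_points = smote_sketch(high_severity, n_new=4)
```

Because each synthetic point lies on a segment between two existing minority samples, the new data stays inside the region already occupied by the minority class rather than duplicating samples exactly, which is one common explanation for why interpolation-based oversampling is less prone to overfitting than plain resampling with replacement.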

This paper was written with support from the National Research Foundation of Korea.