Implementation of Git's Commit Message Classification Model Using GPT-Linked Source Change Data

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2023, 28(10), pp.123-132
  • DOI : 10.9708/jksci.2023.28.10.123
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : September 1, 2023
  • Accepted : October 12, 2023
  • Published : October 31, 2023

Ji-Hoon Choi 1 Kim Jae Woong 2 Seong-Hyun Park 3

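1 Department of Computer Engineering, Kongju National University
2 Kongju National University
3 Doctoral Program, Department of Computer Engineering, Kongju National University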

Accredited

ABSTRACT

Git commit messages record the history of source changes made while a project is being developed or operated. This history can be used to identify project risks and project status, which reduces cost and saves time. Much related research is under way, including work that classifies commit messages by type of software maintenance. Among published studies, the highest reported classification accuracy is 95%. This study set out to build practical solutions on a commit classification model and to remove a limitation of the most accurate existing model, which can only be applied to programs written in Java. To this end, we designed and implemented an additional step that uses GPT to standardize source change data into natural language. This paper describes the process of extracting commit messages and source change data from Git, standardizing the source change data with GPT, and training a DistilBERT model. In verification, the model achieved an accuracy of 91%. The proposed model was implemented and verified to maintain accuracy while classifying commits without depending on a specific programming language. In future work, we plan to study a classification model that uses Bard and a project management tool model built on the proposed classification model.
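
The first two stages named in the abstract (extracting commit messages with their source changes from Git, then preparing the changes for GPT) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the `CommitRecord` type, the helper names, the separator bytes, and the prompt wording are all assumptions made for this example, and the GPT call and the DistilBERT training step are left out.

```python
import subprocess
from dataclasses import dataclass
from typing import List

# Separator bytes chosen for this sketch (ASCII record/unit separators);
# they are an assumption of the example, not something the paper specifies.
RECORD_SEP = "\x1e"
FIELD_SEP = "\x1f"


@dataclass
class CommitRecord:
    sha: str
    message: str
    diff: str


def read_git_log(repo_path: str, max_count: int = 100) -> str:
    """Run `git log -p`, emitting a machine-parseable header before each patch."""
    # %x1e and %x1f ask git to print those bytes; %H is the hash, %s the subject.
    fmt = "%x1e%H%x1f%s%x1f"
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "-p",
         f"--max-count={max_count}", f"--pretty=format:{fmt}"],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    )
    return result.stdout


def parse_git_log(raw: str) -> List[CommitRecord]:
    """Split the raw log text into one record per commit."""
    records = []
    for chunk in raw.split(RECORD_SEP):
        if not chunk.strip():
            continue  # text before the first record separator
        parts = chunk.split(FIELD_SEP, 2)
        if len(parts) != 3:
            continue  # skip anything that does not match the header layout
        sha, message, diff = parts
        records.append(CommitRecord(sha.strip(), message.strip(), diff.strip()))
    return records


def build_prompt(diff: str, max_chars: int = 4000) -> str:
    """Build a hypothetical prompt asking a GPT model to describe a diff in prose."""
    snippet = diff[:max_chars]  # truncate long diffs to keep the prompt bounded
    return (
        "Describe the following source code change in one or two "
        "plain-English sentences, independent of the programming language.\n\n"
        + snippet
    )
```

Calling `parse_git_log(read_git_log("."))` inside a repository would yield one record per commit, and `build_prompt(record.diff)` would give the text to send to a GPT model. Under the approach the abstract outlines, the returned natural-language description, together with the commit message, would then serve as input for training the DistilBERT classifier.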
