A Comparative Study of Ensemble Learning Models for Predicting Attendance in the KBO League

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2025, 30(4), pp.11~18
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : February 25, 2025
  • Accepted : April 4, 2025
  • Published : April 30, 2025

Tai-Sung Hur 1, Minsuk Oh 1

1 Inha Technical College

Accredited

ABSTRACT

This study developed and analyzed ensemble learning-based models for predicting attendance in the KBO League. Using KBO League data from 2022 to 2024, we collected variables such as team rankings, winning rates, consecutive wins/losses, search volume, stadiums, and home/away games, and set the ratio of attendance to stadium capacity as the target variable. In the data preprocessing phase, Monday games were excluded, and the home/away attendance ratio was set to 7:3 to make the model more realistic. Of the four models compared (Linear Regression, Random Forest, XGBoost, and LightGBM), LightGBM performed best, with an RMSE of 8.39 and an R² score of 0.783. Feature importance analysis showed that online search volume (28.17%) and winning rate (25.17%) had the largest impact on attendance, while team (10.57%) and day of the week (9.73%) also had a meaningful influence. In addition, SHAP (SHapley Additive exPlanations) analysis showed the direction in which each variable moved the predictions, and in particular revealed that the home/away factor had a stronger influence than expected through its interactions with other variables. The study contributes a practical prediction model that can help KBO teams set attendance strategies and make marketing decisions.
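The target variable and the two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the game figures, and the predictions are made up for illustration, and expressing the attendance ratio as a percentage of capacity is an assumption (the abstract does not state the scale).

```python
import math


def attendance_ratio(attendance, capacity):
    """Attendance as a percentage of stadium capacity (scale is an assumption)."""
    return 100.0 * attendance / capacity


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical (attendance, capacity) pairs, not data from the study.
games = [(12000, 24000), (18000, 24000), (23000, 23000)]
y_true = [attendance_ratio(a, c) for a, c in games]

# Hypothetical model predictions on the same percentage scale.
y_pred = [55.0, 70.0, 95.0]

model_rmse = rmse(y_true, y_pred)
model_r2 = r2_score(y_true, y_pred)
print(f"RMSE = {model_rmse:.2f}, R^2 = {model_r2:.3f}")
```

Under the percentage assumption, an RMSE is read in percentage points of capacity, and an R² score is the share of variance in the attendance ratio that the model explains.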
