A Real-time Head Pose Estimation via YOLO-based Facial Landmark Detection using High-Fidelity Synthetic Data

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2026, 31(1), pp.77~89
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : November 11, 2025
  • Accepted : December 31, 2025
  • Published : January 30, 2026

Un-Yong Kim 1, Sungkuk Chun 2, Jeongrok Yun 2, Ju-Yeong Park 2, Sung-Hoon Hong 3, Hoe-Min Kim 2

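1 Korea Photonics Technology Institute, Chonnam National University
2 Korea Photonics Technology Institute
3 Chonnam National University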

Accredited

ABSTRACT

Data collection constraints and the Sim2Real gap are primary challenges in developing head pose estimation systems. To bridge this gap, this study adopts geometric landmark information, which is robust against visual noise, as its core feature. The methodology is a two-stage pipeline. First, a large-scale synthetic dataset is constructed using Unreal Engine and MetaHuman. Second, the YOLOv11-pose model is trained as a facial landmark detector on a mixture of the synthetic data and the real-world BIWI dataset. The system then estimates the three rotation angles (roll, pitch, and yaw) in real time from the detected landmark coordinates. In evaluations on the BIWI dataset, the model achieved a Mean Absolute Error (MAE) of 1.00° in the near-frontal region. The final system ran at 21.2 FPS in a webcam environment. In conclusion, combining synthetic and real data with a landmark-based approach demonstrates the feasibility of precise, real-time head pose estimation.
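
The MAE figure quoted in the abstract is an average of absolute angle errors. The abstract does not say how the errors are aggregated, so the following is only a minimal sketch of one common convention: a per-axis MAE in degrees, plus the mean over the three axes. The function names `angular_error` and `pose_mae` and the wrap-around handling at ±180° are assumptions made for illustration, not the authors' evaluation code.

```python
from typing import Sequence, Tuple

# (roll, pitch, yaw) in degrees; this ordering is an assumption for the sketch.
Angles = Tuple[float, float, float]


def angular_error(pred: float, true: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    # Map the raw difference into [-180, 180) so that -179 vs 179 counts as 2.
    diff = (pred - true + 180.0) % 360.0 - 180.0
    return abs(diff)


def pose_mae(
    preds: Sequence[Angles], truths: Sequence[Angles]
) -> Tuple[Angles, float]:
    """Return the per-axis MAE and its mean over the three axes."""
    if len(preds) != len(truths) or len(preds) == 0:
        raise ValueError("preds and truths must be non-empty and equal length")
    n = len(preds)
    per_axis = tuple(
        sum(angular_error(p[i], t[i]) for p, t in zip(preds, truths)) / n
        for i in range(3)
    )
    return per_axis, sum(per_axis) / 3.0
```

To restrict the metric to a region such as near-frontal poses, one would filter the sample pairs by ground-truth angle before calling `pose_mae`; the threshold that defines "near-frontal" is not given in the abstract.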
