Implementation of a Multimodal LLM-based Psychological Counseling AI System

  • Journal of Internet of Things and Convergence
  • Abbr : JKIOTS
  • 2026, 12(2), pp.173~181
  • Publisher : The Korea Internet of Things Society
  • Research Area : Engineering > Computer Science > Internet Information Processing
  • Received : February 1, 2026
  • Accepted : February 23, 2026
  • Published : April 30, 2026

Jun-Yong Park 1 Gyun-Ho Kim 1 Kang-Rae Jo 1 KIM, TAEKOOK 1

1 Pukyong National University

Accredited

ABSTRACT

This study proposes and empirically evaluates an AI psychological counseling system based on multimodal emotion recognition that integrates speech, text, and facial expressions. It is motivated by the rapidly growing demand for mental health services and by the inability of conventional text-based counseling chatbots to capture nonverbal cues. Each modality has its own backbone model: Wav2Vec2 for Korean speech recognition, KoBERT for textual context analysis, and ResNet18 for real-time facial expression recognition. The emotion probability distributions produced by these models are combined using a decision-level late fusion approach. Experimental results show that the proposed system recognizes emotional states more consistently and accurately than single-modality approaches, and that it significantly improves the understanding of counseling context. In particular, the integrated analysis of speech, text, and facial expression data reflects users’ emotional changes during the counseling process more precisely, confirming its potential as a supportive tool that assists counselors in decision-making.
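
The decision-level late fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, the emotion classes, or any modality weights, so everything below is an assumption made for illustration: the function name `late_fusion`, the four-label set `EMOTIONS`, and the weighted-average rule are hypothetical and are not taken from the paper.

```python
from typing import Dict, Mapping, Optional

# Hypothetical label set; the abstract does not list the emotion classes used.
EMOTIONS = ("angry", "happy", "neutral", "sad")


def late_fusion(
    modality_probs: Mapping[str, Mapping[str, float]],
    weights: Optional[Mapping[str, float]] = None,
) -> Dict[str, float]:
    """Fuse per-modality emotion distributions by weighted averaging.

    This is one common form of decision-level late fusion, assumed here
    for illustration; the paper's actual rule may differ.
    """
    if not modality_probs:
        raise ValueError("at least one modality is required")
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = {name: 1.0 for name in modality_probs}

    total_weight = sum(weights[name] for name in modality_probs)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    fused = {label: 0.0 for label in EMOTIONS}
    for name, probs in modality_probs.items():
        w = weights[name] / total_weight
        for label in EMOTIONS:
            fused[label] += w * probs.get(label, 0.0)

    # Renormalize so the result is a proper distribution even if the
    # inputs were not exactly normalized.
    norm = sum(fused.values())
    if norm <= 0:
        raise ValueError("input distributions contain no probability mass")
    return {label: p / norm for label, p in fused.items()}


# Illustrative (made-up) outputs of the three backbone models for one utterance.
example = {
    "speech": {"angry": 0.10, "happy": 0.10, "neutral": 0.30, "sad": 0.50},
    "text": {"angry": 0.05, "happy": 0.05, "neutral": 0.60, "sad": 0.30},
    "face": {"angry": 0.10, "happy": 0.10, "neutral": 0.20, "sad": 0.60},
}
fused_example = late_fusion(example)
predicted = max(fused_example, key=fused_example.get)
```

In this made-up example the text model alone favors `neutral`, while the fused distribution favors `sad` because the speech and face models agree. This is the kind of nonverbal evidence a text-only chatbot would not see.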
