Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups (디지털 취약 계층의 디지털 사이니지 사용성 향상을 위한 음성-영상 기반 UI 파이프라인 설계)

Jeong-Hyun Kim (김정현); Jong-Seob Park (박종섭); Chol Yong Soo (최용수)

@article{ART003277371,
author={Jeong-Hyun Kim and Jong-Seob Park and Chol Yong Soo},
title={Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups},
journal={Journal of Software Forensics},
issn={2092-8114},
year={2025},
volume={21},
number={4},
pages={197-209}
}

TY - JOUR
AU - Jeong-Hyun Kim
AU - Jong-Seob Park
AU - Chol Yong Soo
TI - Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups
JO - Journal of Software Forensics
PY - 2025
VL - 21
IS - 4
PB - Korea Software Assessment and Valuation Society
SP - 197
EP - 209
SN - 2092-8114
AB - This paper proposes a speech-image UI pipeline to improve the usability of digital signage for the mobility-challenged. The pipeline consists of two main stages: visual speech detection and auditory signal enhancement. First, the image processing module uses Google MediaPipe Holistic and a MobileNetV2-GRU hybrid model to analyze the user's lip movements and capture "speech intent" in real time. Specifically, data augmentation and a circular buffer prevent the loss of early speech, achieving 100% speech detection accuracy. Second, the speech processing module adopts an adaptive noise removal algorithm that is driven by the signal-to-noise ratio (SNR) and based on the "Do No Harm" principle. To address the problem of a deep learning model (Sepformer) distorting Korean speech signals, the SNR threshold is set to 6 dB: the module removes noise in low-SNR environments and skips processing in high-SNR environments to prevent speech loss. In particular, when speech processing is omitted for 79.4% of the total speech data in a low-SNR environment, a 1.07% decrease in WER is achieved. Furthermore, applying data augmentation in visual speech detection achieves an accuracy of 1.0 (100%) and a loss of 0.0004.
KW - Digital Signage;Speech-Image;Noise Cancellation;UI Pipeline;WER
DO -
UR -
ER -
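
The two mechanisms named in the abstract, a circular buffer that keeps frames from just before speech is detected and a noise-removal step that runs only below a 6 dB SNR threshold, can be illustrated with a minimal Python sketch. Only the 6 dB value comes from the abstract. The names `PreRollBuffer`, `estimate_snr_db`, and `should_denoise`, the power-ratio SNR estimate, and the choice to leave audio at exactly 6 dB unprocessed are assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import deque

SNR_THRESHOLD_DB = 6.0  # threshold stated in the abstract


def estimate_snr_db(signal_power: float, noise_power: float) -> float:
    """Return the SNR in dB from mean signal and noise power (both > 0).

    Assumption: the paper's actual SNR estimator is not given in the abstract.
    """
    return 10.0 * math.log10(signal_power / noise_power)


def should_denoise(snr_db: float, threshold_db: float = SNR_THRESHOLD_DB) -> bool:
    """Denoise only below the threshold; otherwise leave the audio untouched.

    Assumption: audio exactly at the threshold is treated as clean enough.
    """
    return snr_db < threshold_db


class PreRollBuffer:
    """Fixed-size circular buffer that keeps only the most recent frames."""

    def __init__(self, max_frames: int) -> None:
        # deque with maxlen drops the oldest frame automatically when full.
        self._frames = deque(maxlen=max_frames)

    def push(self, frame) -> None:
        """Store a frame, evicting the oldest one if the buffer is full."""
        self._frames.append(frame)

    def flush(self) -> list:
        """Return buffered frames oldest-first and empty the buffer."""
        frames = list(self._frames)
        self._frames.clear()
        return frames
```

In such a design, frames would be pushed continuously, and `flush()` would be called once the visual detector reports speech intent, so that the first syllables captured before detection are not lost. The SNR check would then decide whether the flushed audio passes through a denoiser or goes to recognition unchanged.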

Jeong-Hyun Kim, Jong-Seob Park and Chol Yong Soo. (2025). Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups. Journal of Software Forensics, 21(4), 197-209.

Jeong-Hyun Kim, Jong-Seob Park and Chol Yong Soo. 2025, "Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups", Journal of Software Forensics, vol.21, no.4 pp.197-209.

Jeong-Hyun Kim, Jong-Seob Park, Chol Yong Soo "Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups" Journal of Software Forensics 21.4 pp.197-209 (2025) : 197.

Jeong-Hyun Kim, Jong-Seob Park, Chol Yong Soo. Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups. 2025; 21(4), 197-209.

Jeong-Hyun Kim, Jong-Seob Park and Chol Yong Soo. "Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups" Journal of Software Forensics 21, no.4 (2025) : 197-209.

Jeong-Hyun Kim; Jong-Seob Park; Chol Yong Soo. Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups. Journal of Software Forensics, 21(4), 197-209.

Jeong-Hyun Kim; Jong-Seob Park; Chol Yong Soo. Designing a Voice-Video Based UI Pipeline to Improve Digital Signage Usability for Digitally Vulnerable Groups. Journal of Software Forensics. 2025; 21(4) 197-209.

Journal of Software Forensics 2024 KCI Impact Factor : 0.32
