
Byte-Level Processing Limits in KSL Translation: A Study of the KoBART–ByT5 Performance Gap

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2026, 31(4), pp.33~44
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : January 26, 2026
  • Accepted : March 31, 2026
  • Published : April 30, 2026

Dong-Hyuk Kim 1 Kyu-Cheol Cho 1

1 Inha Technical College


ABSTRACT

This study investigates how input representation granularity affects performance and training behavior in Korean-to-Korean Sign Language (KSL) gloss translation. Using the National Institute of Korean Language Korean–KSL parallel corpus (2022–2024), we compare a subword-based pretrained Seq2Seq model (KoBART) with a byte-level model (ByT5). Quantitative results show a decisive advantage for KoBART, which achieves 0.447 METEOR versus 0.1192 for ByT5 (a ≈275% relative improvement). Analyses indicate that ByT5 is constrained by substantially longer effective input sequences, which degrade sentence-level generation, whereas KoBART benefits from subword segmentation that aligns structurally with KSL glosses and supports more faithful information reconstruction. These findings provide empirical evidence for the critical role of input-granularity design in low-resource KSL gloss translation and establish a robust baseline for future KSL machine translation studies.
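Two of the abstract's quantitative points can be checked directly: the relative METEOR improvement follows from the two reported scores, and the sequence-length penalty for byte-level models follows from UTF-8 encoding, where each Hangul syllable occupies 3 bytes. The sketch below is illustrative only; the example sentence is not drawn from the corpus.

```python
# 1) Relative improvement of KoBART (0.447 METEOR) over ByT5 (0.1192 METEOR):
#    (0.447 - 0.1192) / 0.1192 = 2.75, i.e. the ≈275% reported in the abstract.
kobart, byt5 = 0.447, 0.1192
rel_improvement = (kobart - byt5) / byt5
print(f"{rel_improvement:.0%}")  # → 275%

# 2) Sequence-length inflation for byte-level input: a byte-level model
#    such as ByT5 consumes one step per UTF-8 byte, so Korean text yields
#    roughly 3x more input steps than its character count, before any
#    subword compression at all.
sentence = "수어 번역"  # illustrative text (4 Hangul syllables + 1 space)
print(len(sentence), len(sentence.encode("utf-8")))  # → 5 13
```

This is why the effective sequences seen by ByT5 are far longer than those seen by KoBART, whose subword vocabulary typically maps several syllables to a single token.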
