A Quantitative Linguistic Analysis of Japanese Vocabulary Usage in the Diaries of the Colonially Educated Generation During Their L2 Acquisition Period

  • The Japanese Language Association of Korea
  • Abbr : JLAK
  • 2026, (88), pp.241~259
  • Publisher : The Japanese Language Association Of Korea
  • Research Area : Humanities > Japanese Language and Literature
  • Received : March 19, 2026
  • Accepted : May 15, 2026
  • Published : June 20, 2026

HWANG, YOUNG HEE 1

1 Hanyang Cyber University

Accredited

ABSTRACT

This study investigates the vocabulary system of the colonially educated Japanese-speaking generation during their L2 acquisition period under Japanese colonial rule in Korea. Previous research on this population has focused predominantly on L2 attrition in later life, leaving the acquisition stage largely unexamined. To address this gap, a diary corpus of approximately 54,000 tokens was constructed from four diaries written between 1932 and 1941 by Korean learners representing three educational strata: elementary school students (P), secondary school students (M), and employed workers (W). Morphological analysis was conducted using MeCab with ipadic, following orthographic normalization from historical to modern Kana spelling. Quantitative indices, including type-token ratio (TTR), lexical category distribution, Kanji ratio, readability level (RL), and logicality scores, were computed and compared across strata. The results indicate that TTR increased progressively from 7.0% (P) to 19.2% (W), and the proportion of Kanji vocabulary rose from 8% to approximately 43–56%, suggesting that the public vocabulary system of Imperial Japanese was increasingly internalized through colonial schooling and workplace exposure. Multivariate analysis (MANOVA, Pillai's V = 0.74–0.84, p < .001) confirmed statistically significant differences across all 15 lexical variables, with the Japanese word ratio (η² = .60) showing the largest effect size. Readability levels ranged from Level 4 (elementary male) to Level 7 (secondary female), corresponding roughly to proficiency levels in Kanji usage. These findings should be interpreted with caution, given the small number of diaries analyzed and the influence of orthographic preprocessing on quantitative indices. Nevertheless, this study enables a quantitative reconstruction of L2 Japanese proficiency at the acquisition stage, extending the longitudinal perspective on the L2 life cycle beyond attrition. Future research should expand the corpus and incorporate comparative data from contemporaneous native speaker diaries.
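
As an illustration of two of the indices named in the abstract, the following is a minimal sketch of how a type-token ratio and a Kanji ratio could be computed from a list of tokens. It is not the author's procedure: the function names, the sample tokens, and the definition of the Kanji ratio (share of tokens containing at least one Kanji character) are assumptions made for this example, and the tokens are assumed to have already been segmented by a morphological analyzer such as MeCab.

```python
import re

# Assumption: "Kanji" means the basic CJK Unified Ideographs block
# plus the iteration mark. The study's own definition may differ.
KANJI = re.compile(r"[\u4e00-\u9fff々]")


def type_token_ratio(tokens):
    """Number of distinct types divided by number of tokens (0.0 if empty)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def kanji_token_ratio(tokens):
    """Share of tokens containing at least one Kanji (hypothetical definition)."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if KANJI.search(t)) / len(tokens)


# Hypothetical, hand-segmented sample; not taken from the diary corpus.
tokens = ["今日", "は", "学校", "へ", "行く", "。", "今日", "は", "雨", "。"]

ttr = type_token_ratio(tokens)      # 7 distinct types over 10 tokens
kanji = kanji_token_ratio(tokens)   # 5 Kanji-bearing tokens over 10
print(f"TTR = {ttr:.1%}, Kanji ratio = {kanji:.1%}")
```

Raw TTR tends to fall as a text grows longer, so values of this kind are most comparable when the samples being compared are of similar size.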
