A Study on the Contents of the Japanese Language Proficiency Test Using Text Mining Analysis - Focusing on JLPT N3 Vocabulary -

  • Journal of Japanese Culture
  • 2019, (82), pp.5-25
  • DOI : 10.21481/jbunka..82.201908.5
  • Publisher : The Japanese Culture Association Of Korea (Jcak)
  • Research Area : Humanities > Japanese Language and Literature
  • Received : June 30, 2019
  • Accepted : August 5, 2019
  • Published : August 31, 2019

Leeyouhee 1

1 Daejeon University

Accredited

ABSTRACT

The objectives of this study were to (1) extract keywords from the characters and vocabulary section of past N3 tests (2010-2018), and (2) analyze exam patterns and trends through text mining, using Python 3.7, Jupyter Notebook, and the Janome (MeCab) morphological analysis engine. A total of 131 words appeared two or more times in N3. The top keywords were ‘すぎる’ (5 times) and ‘的’ (4 times), followed by ‘疲れる’, ‘断る’, ‘怒る’, ‘出張’, ‘規則’, and ‘そっくりだ’ (3 times each), which recurred across the exams. By section, the most frequent words in Question 1 were ‘得意’, ‘改札’, ‘卒業’, ‘過去’, ‘到着’, ‘努力’, and ‘表す’ (2 times each). In Question 2, seven words appeared twice each: ‘帰宅’, ‘週刊紙’, ‘現在’, ‘楽器’, ‘成績’, ‘記録’, and ‘逃げる’. In Question 3, ‘うっかり’ appeared 3 times, and ‘想像’, ‘目標’, ‘期待’, ‘自動的’, ‘迷う’, ‘しっかり’, ‘そっくり’, ‘うわさ’, ‘キャンセル’, and ‘リサイクル’ appeared 2 times each. In Question 4, ‘すぎる’ appeared with particular frequency, and the synonym pairs ‘(すぐ)怒る=短気だ’, ‘くたびれる=疲れる’, and ‘大変だ=きつい’ were each set 2 times. In Question 5, the frequent words were ‘空’, ‘活動’, ‘性格’, ‘募集’, ‘修理’, ‘断る’, and ‘身につける’ (2 times each). These keywords suit the question type of each part and follow recognizable patterns, so they are presumed likely to be asked again in future exams. By part of speech, the frequency ratios were, in descending order: noun (49.7%), verb (26.9%), adjective (6.1%), adjectival verb (5.3%), adverb (5.3%), katakana word (3.9%), suffix (2.0%), prenominal adjective (0.5%), and prefix (0.3%). Based on these results, the study proposes the following order of priority for learners: (1) the 131 high-frequency words, (2) two-kanji nouns, (3) verbs, (4) adjectives, adjectival verbs, and adverbs, and (5) the remaining katakana words, suffixes, prenominal adjectives, and prefixes.
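
The two counts reported in the abstract (words appearing two or more times, and frequency ratios by part of speech) can be illustrated with a minimal sketch. This is not the author's code: it assumes the exam text has already been tokenized into (base form, part of speech) pairs, for example by a morphological analyzer such as Janome, and the sample tokens, the function names `frequent_words` and `pos_ratios`, and the `min_count` parameter are all hypothetical.

```python
from collections import Counter

# Hypothetical pre-tokenized input: (base form, part of speech) pairs.
# In practice these would come from a morphological analyzer; the sample
# below is illustrative only and is not the study's data.
tokens = [
    ("すぎる", "Verb"), ("疲れる", "Verb"), ("出張", "Noun"),
    ("すぎる", "Verb"), ("規則", "Noun"), ("出張", "Noun"),
    ("得意", "Noun"), ("うっかり", "Adverb"),
]


def frequent_words(tokens, min_count=2):
    """Return (word, count) pairs with count >= min_count, most frequent first."""
    counts = Counter(word for word, _ in tokens)
    return [(w, c) for w, c in counts.most_common() if c >= min_count]


def pos_ratios(tokens):
    """Return each part of speech's share of all tokens, as a percentage."""
    counts = Counter(pos for _, pos in tokens)
    total = sum(counts.values())
    return {pos: round(100 * n / total, 1) for pos, n in counts.items()}


print(frequent_words(tokens))
print(pos_ratios(tokens))
```

The `min_count=2` default mirrors the "two or more times" threshold described above. Counting base forms rather than surface forms is an assumption made here so that inflected variants of a word are grouped together.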
