
Analysis of vocabulary and formulaic expression of Chinese textbooks using corpus analysis tools

  • Journal of Chinese Language and Literature
  • 2020, (83), pp.295-344
  • DOI : 10.15792/clsyn..83.202004.295
  • Publisher : Chinese Literary Society Of Yeong Nam
  • Research Area : Humanities > Chinese Language and Literature
  • Received : March 10, 2020
  • Accepted : April 19, 2020
  • Published : April 30, 2020

Byeong Kwu Kang 1

1 Sogang University

Accredited

ABSTRACT

This study examined the linguistic features of Chinese textbooks using quantitative methods from corpus linguistics. In particular, we focused on identifying the linguistic features of textbooks compiled in Korea in comparison with textbooks published in China. We collected data from 35 Chinese textbooks published in Korea and 263 textbooks published in China. The textbooks were divided into three categories: texts compiled by Korean authors (type A), Chinese textbooks translated in Korea (type B), and textbooks published in China (type C). First, we investigated the frequency of words and sentences used in Chinese textbooks. The token frequency and type frequency of words follow the order 'type A < type B < type C'. The average sentence length varies greatly depending on textbook type and level. Second, the standardized type-token ratio (STTR) of each Chinese textbook was examined to compare how varied the vocabulary is. The STTR value is lowest for type A textbooks, higher for type B textbooks, and highest for type C textbooks. Third, to understand the level of vocabulary used in each textbook, we compared it with the HSK Vocabulary Rating Table. Overall, the proportion of vocabulary grades did not differ significantly by textbook type, but it did differ by level. Fourth, keyword analysis was conducted to identify which vocabulary categories are frequently used in Chinese textbooks. The analysis showed that the keyness of personal pronouns and interrogatives in Chinese textbooks was very high. Fifth, formulaic expressions frequently used in Chinese textbooks were extracted. The results show that certain combinations of words are used repeatedly in Chinese textbook texts. Formulaic expressions merit study because they provide basic data for Chinese language education.
Finally, cluster analysis was performed based on the linguistic characteristics of Chinese textbooks. According to the results of the statistical cluster analysis, Chinese textbooks fall into five cluster types. These results can serve as a reference for future studies of the texts in Chinese textbooks.
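
The STTR comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sttr`, the default chunk size of 1,000 tokens, and the fallback to a plain type-token ratio for short texts are all assumptions, since the abstract does not state which tool settings were used. The input is assumed to be a list of tokens that has already been word-segmented, which Chinese text requires.

```python
from typing import Sequence


def sttr(tokens: Sequence[str], chunk_size: int = 1000) -> float:
    """Standardized type-token ratio: mean TTR over consecutive fixed-size chunks.

    chunk_size=1000 is an assumed default, not a value taken from the paper.
    Tokens left over after the last full chunk are ignored.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not tokens:
        return 0.0

    n_chunks = len(tokens) // chunk_size
    if n_chunks == 0:
        # Assumption: for a text shorter than one chunk, fall back to plain TTR.
        return len(set(tokens)) / len(tokens)

    ratios = []
    for i in range(n_chunks):
        chunk = tokens[i * chunk_size:(i + 1) * chunk_size]
        # TTR of one chunk = distinct word forms / tokens in the chunk.
        ratios.append(len(set(chunk)) / chunk_size)
    return sum(ratios) / len(ratios)
```

Because every chunk has the same length, the resulting value is comparable across textbooks of different sizes, which is why STTR rather than a raw type-token ratio suits a comparison across types A, B, and C.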


This paper was written with support from the National Research Foundation of Korea.