A study on BCCWJ as a corpus of written Japanese:

  • The Japanese Language Association of Korea
  • Abbr : JLAK
  • 2020, (65), pp.81-96
  • DOI : 10.14817/jlak.2020.65.81
  • Publisher : The Japanese Language Association Of Korea
  • Research Area : Humanities > Japanese Language and Literature
  • Received : June 26, 2020
  • Accepted : August 21, 2020
  • Published : September 20, 2020

EO SOOJEONG 1

1 Keimyung University

Accredited

ABSTRACT

The Balanced Corpus of Contemporary Written Japanese (BCCWJ), provided by the National Institute for Japanese Language and Linguistics, is one of the most frequently used corpora for studying written Japanese. However, whether all 13 registers of BCCWJ can be treated together as a general corpus of written Japanese is open to question. This paper gives an overview of BCCWJ and clarifies the characteristics of each register by analyzing the frequency of ten words and expressions, each classified as either spoken Japanese or written Japanese. The results are as follows. First, BCCWJ as a whole has a strong written-Japanese character: written forms appear more frequently than spoken forms, at a ratio of 2:1. Second, the proportions of spoken and written Japanese vary greatly by register, and, unlike the other registers, the blog registers use more spoken Japanese than written Japanese. Third, the ratio of spoken to written Japanese varies greatly from word to word, regardless of the word's part of speech (adverb, adjective, conjunction). Based on these results, this paper argues that it is somewhat problematic to regard BCCWJ as a whole as a corpus of written Japanese without considering the characteristics of each register. The results also show that sufficient care is needed in selecting the material to be analyzed when using BCCWJ to study written Japanese.
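The kind of comparison described in the abstract, the share of written versus spoken forms per register and the overall written-to-spoken ratio, can be sketched roughly as follows. This is only an illustrative sketch and not the paper's procedure: the function names, the record layout, and the register labels are assumptions, and the counts are invented toy numbers, not BCCWJ figures.

```python
from collections import defaultdict

# Hypothetical toy counts, invented for illustration only (NOT BCCWJ data).
# Each record is (register, style, count); style is "spoken" or "written".
TOY_COUNTS = [
    ("newspaper", "written", 80),
    ("newspaper", "spoken", 20),
    ("blog", "written", 30),
    ("blog", "spoken", 70),
]


def written_share_by_register(records):
    """Return {register: written / (written + spoken)} for each register."""
    totals = defaultdict(lambda: {"spoken": 0, "written": 0})
    for register, style, count in records:
        totals[register][style] += count
    shares = {}
    for register, t in totals.items():
        total = t["spoken"] + t["written"]
        # None marks a register with no occurrences at all.
        shares[register] = t["written"] / total if total else None
    return shares


def overall_ratio(records):
    """Return the written-to-spoken ratio over all registers combined."""
    written = sum(c for _, s, c in records if s == "written")
    spoken = sum(c for _, s, c in records if s == "spoken")
    return written / spoken if spoken else None


if __name__ == "__main__":
    for register, share in written_share_by_register(TOY_COUNTS).items():
        print(f"{register}: written share = {share:.2f}")
    print(f"overall written:spoken = {overall_ratio(TOY_COUNTS):.2f}")
```

Keeping the per-register shares separate from the pooled ratio mirrors the abstract's point: a pooled figure can look strongly "written" even when an individual register, such as the toy blog register here, leans the other way.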
