Application of Automatic Evaluation to Human Translation

  • The Journal of Translation Studies
  • Abbr: JTS
  • 2020, 21(1), pp. 9-29
  • DOI: 10.15749/jts.2020.21.1.001
  • Publisher: The Korean Association for Translation Studies
  • Research Area: Humanities > Interpretation and Translation Studies
  • Received: February 10, 2020
  • Accepted: March 11, 2020
  • Published: March 31, 2020

Hyeyeon Chung¹, Kim Bo-young¹, Kim Yeon-joo¹, Seo Seung-hee¹, Song Shin-ae¹, Lee Jin-Hyun¹, Jeon Kyoung-ah¹, Choi Jisoo¹, Hong Seung-bin¹, Heo TakSung²

¹Hankuk University of Foreign Studies
²Hallym University

ABSTRACT

This paper addresses two questions. The first is whether BLEU and METEOR, which were developed to evaluate machine translation, can also be used to evaluate human translations. The second is how these systems can be adapted to evaluate human translations in a more valid way. These questions can be subdivided as follows: (1) Which is the more valid evaluation system, BLEU or METEOR? (2) Is the current precision-recall ratio appropriate? (3) Which correlates better with human evaluation, the grades or the ranks produced by automatic evaluation? Five translator trainees in each of Arabic, German, English, Japanese, and Spanish (25 students in total) translated four texts into Korean (100 translations in total). The translations were evaluated by two professional translators for each language, and their assessments were compared with the outcome of the automatic evaluation. The results showed that METEOR, recall, and ranks correlated better with the human ratings than BLEU, precision, and scores. This and other findings from the experiment suggest that, with a minimum of about 12 translations, METEOR can be used at least to determine the rank order of student performance.
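
As a rough illustration of the kind of comparison described above (not the authors' actual pipeline), the sketch below scores a few hypothetical candidate translations against a single reference with NLTK's sentence-level BLEU and METEOR, then rank-correlates the automatic scores with made-up human ratings using Spearman's rho from SciPy. The example sentences, ratings, and scores are invented for illustration; the study itself used Korean target texts and professional raters.

```python
# Minimal sketch, assuming a recent NLTK (pre-tokenized METEOR input) and SciPy.
# It is NOT the study's evaluation setup, only an illustration of correlating
# automatic metric scores with human ratings by rank.
import nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from nltk.translate.meteor_score import meteor_score
from scipy.stats import spearmanr

nltk.download("wordnet", quiet=True)  # METEOR's synonym matching needs WordNet

# Hypothetical data: one reference translation, three student translations,
# and invented human ratings on a 0-100 scale.
reference = "the committee approved the budget for the next fiscal year".split()
candidates = [
    "the committee approved the budget for the coming fiscal year".split(),
    "the committee has passed next year s budget".split(),
    "budget approval was given by the panel for the following year".split(),
]
human_ratings = [92, 78, 65]

smooth = SmoothingFunction().method1  # avoid zero BLEU on short sentences
bleu_scores = [sentence_bleu([reference], c, smoothing_function=smooth)
               for c in candidates]
meteor_scores = [meteor_score([reference], c) for c in candidates]

# Spearman's rho compares rank orders rather than raw score magnitudes,
# mirroring the paper's finding that ranks track human judgments more closely.
rho_bleu, _ = spearmanr(bleu_scores, human_ratings)
rho_meteor, _ = spearmanr(meteor_scores, human_ratings)
print(f"BLEU   rho vs. human ratings: {rho_bleu:.2f}")
print(f"METEOR rho vs. human ratings: {rho_meteor:.2f}")
```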
