@article{ART002663003,
author={Jo Kyungsun and KANG EUNJIN},
title={A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus},
journal={Korean Semantics},
issn={1226-7198},
year={2020},
volume={70},
pages={221-245},
doi={10.19033/sks.2020.12.70.221}
}
TY - JOUR
AU - Jo Kyungsun
AU - KANG EUNJIN
TI - A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus
JO - Korean Semantics
PY - 2020
VL - 70
PB - The Society Of Korean Semantics
SP - 221
EP - 245
SN - 1226-7198
AB - This paper analyzes the lexical and speech act characteristics of an interactive corpus built for artificial intelligence learning. The corpus was classified by situation into search and reservation. For lexical characteristics, lexicon density and lexicon diversity were examined; for speech act characteristics, the frequency of direct and indirect speech acts was analyzed. The analysis yielded three results. First, for lexicon density, the hypothesis that the corpus type (search or reservation) is related to the distribution of content words and function words was accepted, that is, not rejected, according to the chi-square test. Second, the type-token ratio (TTR) and the Guiraud index (GI) were calculated to measure lexicon diversity; the GI of the search situation was higher than that of the reservation situation, indicating that more diverse vocabulary was used in the search situation. Third, the search and reservation corpora differed significantly in the frequency of direct and indirect speech acts. The study can reveal characteristics of the language expressions that humans use to communicate with artificial intelligence. In addition, its results could contribute to principles and guidelines for building an efficient and balanced corpus for artificial intelligence learning.
KW - artificial intelligence learning corpus;interactive corpus;lexical characteristics;speech acts characteristics;lexicon density;lexicon diversity;direct speech acts;indirect speech acts;type-token ratio;Guiraud index;Chi test
DO - 10.19033/sks.2020.12.70.221
ER -
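The abstract in the record above reports two lexicon diversity measures, the type-token ratio (TTR) and the Guiraud index (GI). As a minimal sketch, assuming the standard definitions (TTR = types / tokens, GI = types / sqrt(tokens)) and simple whitespace tokenization, the calculation could look like the following. The function name `lexical_diversity` and the sample utterance are hypothetical and are not taken from the study, which may have used a different tokenization or morphological analysis.

```python
import math


def lexical_diversity(tokens):
    """Return (TTR, GI) for a list of word tokens.

    Assumes the standard definitions:
      TTR = number of types / number of tokens
      GI  = number of types / sqrt(number of tokens)
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    n_tokens = len(tokens)       # total running words
    n_types = len(set(tokens))   # distinct word forms
    ttr = n_types / n_tokens
    gi = n_types / math.sqrt(n_tokens)
    return ttr, gi


# Hypothetical toy utterance, not data from the study.
sample = "find a room and book a room".split()
ttr, gi = lexical_diversity(sample)
print(f"TTR={ttr:.3f}, GI={gi:.3f}")
```

Dividing by the square root of the token count makes GI less sensitive to text length than TTR, which is one plausible reason for comparing the two situations by GI.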
Jo Kyungsun and KANG EUNJIN. (2020). A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus. Korean Semantics, 70, 221-245.
Jo Kyungsun and KANG EUNJIN. 2020, "A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus", Korean Semantics, vol.70, pp.221-245. Available from: doi:10.19033/sks.2020.12.70.221
Jo Kyungsun, KANG EUNJIN "A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus" Korean Semantics 70 pp.221-245 (2020) : 221.
Jo Kyungsun, KANG EUNJIN. A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus. 2020; 70 221-245. Available from: doi:10.19033/sks.2020.12.70.221
Jo Kyungsun and KANG EUNJIN. "A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus" Korean Semantics 70 (2020): 221-245. doi: 10.19033/sks.2020.12.70.221
Jo Kyungsun; KANG EUNJIN. A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus. Korean Semantics, 70, 221-245. doi: 10.19033/sks.2020.12.70.221
Jo Kyungsun; KANG EUNJIN. A quantitative study on lexicon and speech acts characteristics in dialogue corpus for the application of artificial intelligence learning corpus. Korean Semantics. 2020; 70 221-245. doi: 10.19033/sks.2020.12.70.221