This study builds a methodological pipeline for comparing lexical differences between South Korean and North Korean data using recent natural language processing techniques. Treating the Chosun-ilbo and the Rodong-shinmun as counterparts of each other, we built word embedding models to compare their distributional properties. We collected texts published in the two newspapers from 2015 to 2017, preprocessed them, and ran Word2Vec on the preprocessed data. Classifying the patterns of lexical difference into four subtypes, we examined how the two varieties of Korean have diverged lexically.
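The comparison step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps Word2Vec for plain co-occurrence count vectors so that it runs with the Python standard library alone, and the two toy corpora, the function names, and the neighbour-overlap score are all hypothetical choices made for illustration. The idea shown is only that a word's nearest neighbours can be computed separately in each corpus and then compared.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            context = tokens[start:i] + tokens[i + 1:end]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_neighbours(vectors, target, k=3):
    """Return the k words most similar to `target`, ties broken alphabetically."""
    scored = [(cosine(vectors[target], vec), word)
              for word, vec in vectors.items() if word != target]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in scored[:k]]


def neighbour_overlap(vectors_a, vectors_b, target, k=3):
    """Jaccard overlap of the target's neighbour sets in the two corpora."""
    a = set(nearest_neighbours(vectors_a, target, k))
    b = set(nearest_neighbours(vectors_b, target, k))
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical, already-tokenised toy corpora (English placeholders, not real data).
south_sentences = [
    ["labour", "union", "demands", "higher", "wages"],
    ["labour", "market", "reform", "bill", "passed"],
    ["union", "leaders", "discuss", "wages", "today"],
]
north_sentences = [
    ["labour", "heroes", "praised", "by", "party"],
    ["party", "honours", "labour", "and", "loyalty"],
    ["heroes", "of", "labour", "build", "factories"],
]

south_vecs = cooccurrence_vectors(south_sentences)
north_vecs = cooccurrence_vectors(north_sentences)
overlap = neighbour_overlap(south_vecs, north_vecs, "labour")

print("South neighbours:", nearest_neighbours(south_vecs, "labour"))
print("North neighbours:", nearest_neighbours(north_vecs, "labour"))
print("Neighbour overlap:", overlap)
```

Comparing neighbour sets rather than raw vectors avoids having to align the two vector spaces, which are built independently. A low overlap for a shared word would be a candidate signal of divergent usage, to be checked by hand.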
Bojanowski, P. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5, 135-146.
Boleda, G. (2020). Distributional Semantics and Linguistic Theory. Annual Review of Linguistics, 6, 213-234.
Mikolov, T. (2013). Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781.
@article{ART002631073,
  author={Cheong, Yunam and Wang, Guehyun and Song, Sanghoun},
  title={A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun},
  journal={Korean Semantics},
  issn={1226-7198},
  year={2020},
  volume={69},
  pages={253-281},
  doi={10.19033/sks.2020.9.69.253}
}
TY  - JOUR
AU  - Cheong, Yunam
AU  - Wang, Guehyun
AU  - Song, Sanghoun
TI  - A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun
JO  - Korean Semantics
PY  - 2020
VL  - 69
PB  - The Society Of Korean Semantics
SP  - 253
EP  - 281
SN  - 1226-7198
AB  - This study builds a methodological pipeline for comparing lexical differences between South Korean and North Korean data using recent natural language processing techniques. Treating the Chosun-ilbo and the Rodong-shinmun as counterparts of each other, we built word embedding models to compare their distributional properties. We collected texts published in the two newspapers from 2015 to 2017, preprocessed them, and ran Word2Vec on the preprocessed data. Classifying the patterns of lexical difference into four subtypes, we examined how the two varieties of Korean have diverged lexically.
KW  - Chosun-ilbo
KW  - Rodong-shinmun
KW  - lexical meaning
KW  - word embedding
KW  - Word2Vec
KW  - word cloud
KW  - North-South lexical meaning comparison
KW  - semantic variety
KW  - semantic prosody
DO  - 10.19033/sks.2020.9.69.253
ER  -
Cheong, Yunam, Wang, Guehyun and Sanghoun Song. (2020). A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun. Korean Semantics, 69, 253-281.
Cheong, Yunam, Wang, Guehyun and Sanghoun Song. 2020, "A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun", Korean Semantics, vol.69, pp.253-281. Available from: doi:10.19033/sks.2020.9.69.253
Cheong, Yunam, Wang, Guehyun, Sanghoun Song "A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun" Korean Semantics 69 pp.253-281 (2020) : 253.
Cheong, Yunam, Wang, Guehyun, Sanghoun Song. A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun. 2020; 69 253-281. Available from: doi:10.19033/sks.2020.9.69.253
Cheong, Yunam, Wang, Guehyun and Sanghoun Song. "A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun" Korean Semantics 69(2020) : 253-281. doi: 10.19033/sks.2020.9.69.253
Cheong, Yunam; Wang, Guehyun; Sanghoun Song. A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun. Korean Semantics, 69, 253-281. doi: 10.19033/sks.2020.9.69.253
Cheong, Yunam; Wang, Guehyun; Sanghoun Song. A Comparative Study of Lexical Meanings in Chosun-ilbo and Rodong-shinmun. Korean Semantics. 2020; 69 253-281. doi: 10.19033/sks.2020.9.69.253