@article{ART002708961,
author={SungJin Kim and NakJin Choi and Lee Jun Dong},
title={A Study on the Classification of Unstructured Data through Morpheme Analysis},
journal={Journal of The Korea Society of Computer and Information},
issn={1598-849X},
year={2021},
volume={26},
number={4},
pages={105-112},
doi={10.9708/jksci.2021.26.04.105}
}
TY - JOUR
AU - SungJin Kim
AU - NakJin Choi
AU - Lee Jun Dong
TI - A Study on the Classification of Unstructured Data through Morpheme Analysis
JO - Journal of The Korea Society of Computer and Information
PY - 2021
VL - 26
IS - 4
PB - The Korean Society Of Computer And Information
SP - 105
EP - 112
SN - 1598-849X
AB - In the era of big data, interest in data is growing rapidly. In particular, the development of the Internet and social media has generated new kinds of data, ushering in the era of big data and artificial intelligence and opening a new chapter in convergence technology. There is also growing demand for analyzing data that programs could not handle in the past.
In this paper, an analysis model was designed and verified for the classification of unstructured data, a task often required in the era of big data. The data were crawled from DBPia's thesis summaries, main keywords, and sub-keywords; a database was built using KoNLP's data dictionary, and words were tokenized through morpheme analysis. Nouns were then extracted using KAIST's 9 part-of-speech classification system, TF-IDF values were computed, and an analysis dataset was created by combining the training data with the Y values. Finally, the adequacy of the classification was measured by applying three analysis algorithms (random forest, SVM, decision tree) to the generated dataset.
The classification technique proposed in this paper can be applied in various fields beyond thesis classification, such as civil complaint classification and other text-related analysis.
KW - Big Data;Data Analysis;Visualization;Textmining;Modeling
DO - 10.9708/jksci.2021.26.04.105
ER -
SungJin Kim, NakJin Choi and Lee Jun Dong. (2021). A Study on the Classification of Unstructured Data through Morpheme Analysis. Journal of The Korea Society of Computer and Information, 26(4), 105-112.
SungJin Kim, NakJin Choi and Lee Jun Dong. 2021, "A Study on the Classification of Unstructured Data through Morpheme Analysis", Journal of The Korea Society of Computer and Information, vol.26, no.4, pp.105-112. Available from: doi:10.9708/jksci.2021.26.04.105
SungJin Kim, NakJin Choi, Lee Jun Dong "A Study on the Classification of Unstructured Data through Morpheme Analysis" Journal of The Korea Society of Computer and Information 26.4 pp.105-112 (2021) : 105.
SungJin Kim, NakJin Choi, Lee Jun Dong. A Study on the Classification of Unstructured Data through Morpheme Analysis. Journal of The Korea Society of Computer and Information. 2021; 26(4), 105-112. Available from: doi:10.9708/jksci.2021.26.04.105
SungJin Kim, NakJin Choi and Lee Jun Dong. "A Study on the Classification of Unstructured Data through Morpheme Analysis" Journal of The Korea Society of Computer and Information 26, no.4 (2021) : 105-112. doi: 10.9708/jksci.2021.26.04.105
SungJin Kim; NakJin Choi; Lee Jun Dong. A Study on the Classification of Unstructured Data through Morpheme Analysis. Journal of The Korea Society of Computer and Information, 2021, 26(4), 105-112. doi: 10.9708/jksci.2021.26.04.105
SungJin Kim; NakJin Choi; Lee Jun Dong. A Study on the Classification of Unstructured Data through Morpheme Analysis. Journal of The Korea Society of Computer and Information. 2021; 26(4) 105-112. doi: 10.9708/jksci.2021.26.04.105