@article{ART003028647,
author={Ha-Na Jeong and Kim Jae Woong and Young-Suk Chung},
title={A Study on the Domain Discrimination Model of CSV Format Public Open Data},
journal={Journal of The Korea Society of Computer and Information},
issn={1598-849X},
year={2023},
volume={28},
number={12},
pages={129-136},
doi={10.9708/jksci.2023.28.12.129}
}
TY - JOUR
AU - Ha-Na Jeong
AU - Kim Jae Woong
AU - Young-Suk Chung
TI - A Study on the Domain Discrimination Model of CSV Format Public Open Data
JO - Journal of The Korea Society of Computer and Information
PY - 2023
VL - 28
IS - 12
PB - The Korean Society Of Computer And Information
SP - 129
EP - 136
SN - 1598-849X
AB - The government of the Republic of Korea manages the quality of public open data through a public data quality management level evaluation. Public open data is provided in various open formats such as XML, JSON, and CSV, with CSV accounting for the majority. When diagnosing the quality of public open data in CSV format, the quality diagnosis manager determines the domain of each field from the field name and the data within that field. However, this takes a long time because quality diagnosis is performed on large numbers of open data files. Additionally, for fields whose meaning is difficult to understand, the accuracy of the diagnosis depends on the diagnostician's ability to understand the data. This paper proposes a domain discrimination model for public open data in CSV format that uses field names and data distribution statistics, so that diagnosis results are consistent and accurate regardless of the capabilities of the person in charge, and so that diagnosis time is shortened. Applying the proposed model yielded a correct answer rate of about 77%, which is 2.8% higher than that of the file format open data diagnostic tool provided by the Ministry of Public Administration and Security. We therefore expect the proposed model to improve accuracy when it is applied to diagnosing and evaluating the quality management level of public data.
KW - Open data
KW - Data quality
KW - Quality improvement
KW - Data quality diagnosis
KW - Data distribution
DO - 10.9708/jksci.2023.28.12.129
ER -
Ha-Na Jeong, Kim Jae Woong and Young-Suk Chung. (2023). A Study on the Domain Discrimination Model of CSV Format Public Open Data. Journal of The Korea Society of Computer and Information, 28(12), 129-136.
Ha-Na Jeong, Kim Jae Woong and Young-Suk Chung. 2023, "A Study on the Domain Discrimination Model of CSV Format Public Open Data", Journal of The Korea Society of Computer and Information, vol.28, no.12, pp.129-136. Available from: doi:10.9708/jksci.2023.28.12.129
Ha-Na Jeong, Kim Jae Woong, Young-Suk Chung "A Study on the Domain Discrimination Model of CSV Format Public Open Data" Journal of The Korea Society of Computer and Information 28.12 pp.129-136 (2023) : 129.
Ha-Na Jeong, Kim Jae Woong, Young-Suk Chung. A Study on the Domain Discrimination Model of CSV Format Public Open Data. 2023; 28(12), 129-136. Available from: doi:10.9708/jksci.2023.28.12.129
Ha-Na Jeong, Kim Jae Woong and Young-Suk Chung. "A Study on the Domain Discrimination Model of CSV Format Public Open Data" Journal of The Korea Society of Computer and Information 28, no.12 (2023) : 129-136. doi: 10.9708/jksci.2023.28.12.129
Ha-Na Jeong; Kim Jae Woong; Young-Suk Chung. A Study on the Domain Discrimination Model of CSV Format Public Open Data. Journal of The Korea Society of Computer and Information, 28(12), 129-136. doi: 10.9708/jksci.2023.28.12.129
Ha-Na Jeong; Kim Jae Woong; Young-Suk Chung. A Study on the Domain Discrimination Model of CSV Format Public Open Data. Journal of The Korea Society of Computer and Information. 2023; 28(12) 129-136. doi: 10.9708/jksci.2023.28.12.129