@article{ART002406727,
author={Tae-Su Kim and Kim Jong Wook},
title={Efficient K-Anonymization Implementation with Apache Spark},
journal={Journal of The Korea Society of Computer and Information},
issn={1598-849X},
year={2018},
volume={23},
number={11},
pages={17-24},
doi={10.9708/jksci.2018.23.11.017}
}
TY - JOUR
AU - Tae-Su Kim
AU - Kim Jong Wook
TI - Efficient K-Anonymization Implementation with Apache Spark
JO - Journal of The Korea Society of Computer and Information
PY - 2018
VL - 23
IS - 11
PB - The Korean Society Of Computer And Information
SP - 17
EP - 24
SN - 1598-849X
AB - Today, we are living in the era of data and information. With the advent of the Internet of Things (IoT), the popularity of social networking sites, and the development of mobile devices, a large amount of data is being produced in diverse areas. The collection of such data generated in various areas is called big data. As the importance of big data grows, there has been a growing need to share big data containing information about individual entities. Because big data contains sensitive information about individuals, directly releasing it for public use may violate existing privacy requirements. Thus, privacy-preserving data publishing (PPDP) has been actively studied as a way to share big data containing personal information for public use while preserving the privacy of individuals.
K-anonymity, which is the most popular method in the area of PPDP, transforms each record in a table such that at least k records have the same values for the given quasi-identifier attributes, and thus each record is indistinguishable from the other records in the same class. As big data continues to grow in size, there is a growing demand for methods that can efficiently anonymize vast amounts of data. Thus, in this paper, we develop an efficient k-anonymity method using the Spark distributed framework. Experimental results show that significant gains in processing time can be achieved with the developed method.
KW - K-anonymity;Spark;Hadoop;Distributed system;Data privacy
DO - 10.9708/jksci.2018.23.11.017
ER -
Tae-Su Kim and Kim Jong Wook. (2018). Efficient K-Anonymization Implementation with Apache Spark. Journal of The Korea Society of Computer and Information, 23(11), 17-24.
Tae-Su Kim and Kim Jong Wook. 2018, "Efficient K-Anonymization Implementation with Apache Spark", Journal of The Korea Society of Computer and Information, vol.23, no.11 pp.17-24. Available from: doi:10.9708/jksci.2018.23.11.017
Tae-Su Kim, Kim Jong Wook "Efficient K-Anonymization Implementation with Apache Spark" Journal of The Korea Society of Computer and Information 23.11 pp.17-24 (2018) : 17.
Tae-Su Kim, Kim Jong Wook. Efficient K-Anonymization Implementation with Apache Spark. 2018; 23(11), 17-24. Available from: doi:10.9708/jksci.2018.23.11.017
Tae-Su Kim and Kim Jong Wook. "Efficient K-Anonymization Implementation with Apache Spark" Journal of The Korea Society of Computer and Information 23, no.11 (2018) : 17-24. doi: 10.9708/jksci.2018.23.11.017
Tae-Su Kim; Kim Jong Wook. Efficient K-Anonymization Implementation with Apache Spark. Journal of The Korea Society of Computer and Information, 23(11), 17-24. doi: 10.9708/jksci.2018.23.11.017
Tae-Su Kim; Kim Jong Wook. Efficient K-Anonymization Implementation with Apache Spark. Journal of The Korea Society of Computer and Information. 2018; 23(11) 17-24. doi: 10.9708/jksci.2018.23.11.017