With recent advances in deep learning, large models with excellent performance have been developed by pre-training on massive amounts of text data. However, for such models to be deployed in real-world services, inference must be fast and computational cost low, so model compression techniques have attracted attention. Knowledge distillation, a representative model compression method, transfers the knowledge already learned by a teacher model to a relatively small student model and can be applied in a variety of ways. Knowledge distillation has a limitation, however: because the teacher model learns only the knowledge required to solve the given task and distills it to the student model from that same perspective, the student struggles with problems that have low similarity to the previously learned data. We therefore propose a heterogeneous knowledge distillation method in which the teacher model learns a higher-level concept rather than the knowledge required for the task the student model must solve, and then distills this knowledge to the student model. In classification experiments on about 18,000 documents, we confirmed that the heterogeneous knowledge distillation method outperformed traditional knowledge distillation in both learning efficiency and accuracy.
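The standard (homogeneous) knowledge distillation referred to above is usually implemented as a temperature-scaled soft-target loss combined with the ordinary cross-entropy on hard labels. The following is a minimal PyTorch-style sketch of that generic objective, not the authors' implementation; the temperature `T`, mixing weight `alpha`, and the teacher/student models in the usage comment are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Generic knowledge-distillation loss (illustrative settings, not the paper's).

    Combines cross-entropy on the hard labels with a KL term that pushes the
    student's temperature-softened distribution toward the teacher's.
    """
    # Hard-label loss for the student's own classification task.
    hard_loss = F.cross_entropy(student_logits, labels)

    # Soft-target loss on temperature-scaled distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients to offset the 1/T^2 factor from softening

    return alpha * hard_loss + (1.0 - alpha) * soft_loss


# Illustrative usage: the teacher is frozen; only the student is updated.
# student_logits = student_model(batch)        # e.g. a small text classifier
# with torch.no_grad():
#     teacher_logits = teacher_model(batch)    # e.g. a large pre-trained model
# loss = distillation_loss(student_logits, teacher_logits, batch_labels)
```

In the heterogeneous variant proposed in the paper, the teacher is trained on a higher-level concept rather than on the student's own task, so the teacher and student targets differ; the sketch above only illustrates the conventional distillation loss against which that method is compared.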
@article{ART002889736,
  author    = {Yerin Yu and Namgyu Kim},
  title     = {Text Classification Using Heterogeneous Knowledge Distillation},
  journal   = {Journal of The Korea Society of Computer and Information},
  publisher = {The Korean Society Of Computer And Information},
  issn      = {1598-849X},
  year      = {2022},
  volume    = {27},
  number    = {10},
  pages     = {29--41},
  doi       = {10.9708/jksci.2022.27.10.029},
  keywords  = {Deep Learning; Knowledge Distillation; Text Classification; Model Compression}
}