
Bit-width Aware Generator and Intermediate Layer Knowledge Distillation using Channel-wise Attention for Generative Data-Free Quantization

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2024, 29(7), pp.11-20
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : May 23, 2024
  • Accepted : June 26, 2024
  • Published : July 31, 2024

Jae-Yong Baek 1, Du-Hwan Hur 1, Deok-Woong Kim 1, Yong-Sang Yoo 1, Hyuk-Jin Shin 2, Dae-Hyeon Park 1, Seung-Hwan Bae 1

1 Inha University
2 AI Convergence Research Center, Inha University

Accredited

ABSTRACT

In this paper, we propose the BAG (Bit-width Aware Generator) and Intermediate Layer Knowledge Distillation using Channel-wise Attention to reduce the knowledge gaps among the quantized network, the full-precision network, and the generator in GDFQ (Generative Data-Free Quantization). Since the generator in GDFQ is trained only by feedback from the full-precision network, the capability loss caused by the low bit-width of the quantized network has no effect on training the generator. To alleviate this problem, BAG is quantized to the same bit-width as the quantized network, so it can generate synthetic images that are more effective for training the quantized network. The knowledge gap between the quantized network and the full-precision network is also important. To reduce it, we compute the channel-wise attention of the outputs of corresponding convolutional layers and minimize a loss function defined as the distance between them. As a result, the quantized network learns which channels to focus on by mimicking the full-precision network. To demonstrate the effectiveness of the proposed methods, we quantize a network trained on CIFAR-100 to 3-bit weights and activations and train it together with the generator using our method. As a result, we achieve 56.14% Top-1 accuracy, 3.4% higher than our baseline AdaDFQ.
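The channel-wise attention distillation described in the abstract can be illustrated with a short sketch. Below is a minimal PyTorch example, assuming global average pooling over the spatial dimensions followed by a channel softmax as the attention, and an MSE distance between the student's and teacher's attention vectors; the names channel_wise_attention and attention_kd_loss are illustrative and not taken from the paper.

    import torch
    import torch.nn.functional as F

    def channel_wise_attention(feat: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W) intermediate conv output.
        # Assumption: global average pooling over H, W gives a per-channel
        # response, and a softmax over channels normalizes it into attention.
        pooled = feat.mean(dim=(2, 3))      # (N, C)
        return F.softmax(pooled, dim=1)

    def attention_kd_loss(feat_q: torch.Tensor, feat_fp: torch.Tensor) -> torch.Tensor:
        # Distance between the quantized network's attention (student) and
        # the full-precision network's attention (teacher, kept frozen).
        att_q = channel_wise_attention(feat_q)
        att_fp = channel_wise_attention(feat_fp).detach()
        return F.mse_loss(att_q, att_fp)

    # Usage sketch: feats_q / feats_fp would be lists of intermediate outputs
    # captured (e.g., via forward hooks) from matching convolutional layers.
    # total_kd = sum(attention_kd_loss(q, fp) for q, fp in zip(feats_q, feats_fp))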


This paper was written with support from the National Research Foundation of Korea.