MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure

Jaeyong Jung (정재용); Nak Hyun Jung (정낙현)

@article{ART003306372,
author={Jaeyong Jung and Nak Hyun Jung},
title={MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure},
journal={Journal of Internet of Things and Convergence},
issn={2466-0078},
year={2026},
volume={12},
number={1},
pages={75-101}
}

TY - JOUR
AU - Jaeyong Jung
AU - Nak Hyun Jung
TI - MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure
JO - Journal of Internet of Things and Convergence
PY - 2026
VL - 12
IS - 1
PB - The Korea Internet of Things Society
SP - 75
EP - 101
SN - 2466-0078
AB - This study proposes the Multi-Scale Hierarchical Embedding Compression (MSHC) framework for on-device deployment of large language model (LLM) embeddings. Unlike previous studies focused on the trade-off between compression rate and performance preservation, this research introduces a new perspective: ‘designability of semantic structure after compression’. MSHC consists of a four-stage pipeline: Matryoshka Representation Learning (MRL), Soft-to-Hard Vector Quantization (VQ), Product Quantization (PQ), and Sparse Autoencoder (SAE), preserving semantic hierarchical structure at each stage. Experiments on Korean benchmarks (KLUE-STS, KLUE-NLI, NSMC) showed that applying MRL alone maintained 96.8% performance at 8x compression. Task-aware fine-tuning significantly restored the performance of the MSHC pipeline. Furthermore, experiments with hierarchical weighted search and coarse-to-fine search demonstrated that MSHC effectively preserves the appropriate semantic structure for multi-stage search pipelines. This study proposes shifting the evaluation criterion for embedding compression from ‘immediate performance’ to ‘semantic manipulability’.
KW - Embedding compression;Matryoshka representation learning;Semantic structure preservation;On-device inference;Multi-stage retrieval
DO -
UR -
ER -

Jaeyong Jung and Nak Hyun Jung. (2026). MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure. Journal of Internet of Things and Convergence, 12(1), 75-101.

Jaeyong Jung and Nak Hyun Jung. 2026, "MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure", Journal of Internet of Things and Convergence, vol.12, no.1 pp.75-101.

Jaeyong Jung, Nak Hyun Jung "MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure" Journal of Internet of Things and Convergence 12.1 pp.75-101 (2026) : 75.

Jaeyong Jung, Nak Hyun Jung. MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure. 2026; 12(1), 75-101.

Jaeyong Jung and Nak Hyun Jung. "MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure" Journal of Internet of Things and Convergence 12, no.1 (2026) : 75-101.

Jaeyong Jung; Nak Hyun Jung. MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure. Journal of Internet of Things and Convergence, 12(1), 75-101.

Jaeyong Jung; Nak Hyun Jung. MSHC: A Multi-Scale Hierarchical Embedding Compression Framework Preserving Semantic Structure. Journal of Internet of Things and Convergence. 2026; 12(1) 75-101.

Journal of Internet of Things and Convergence 2025 KCI Impact Factor : 0.75