SafeDP-Rewrite: Differentially Private Text Rewriting with Black-Box Access to Large Language Models

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2025, 30(11), pp. 91-98
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : September 22, 2025
  • Accepted : November 7, 2025
  • Published : November 28, 2025

Jong Wook Kim 1

1 Sangmyung University

Accredited

ABSTRACT

Text data is a critical resource in modern machine learning applications, but it often contains sensitive information, which creates a risk of privacy leakage when the data is shared. Differential privacy (DP) offers a formal guarantee against such leakage during data sharing, and recent work has explored its application to text rewriting with large language models (LLMs). However, most existing approaches assume a white-box setting with access to the LLM's internal structure, which makes them impractical in real-world scenarios where only black-box API access is available. To address this limitation, we propose SafeDP-Rewrite, a DP-based text rewriting method that operates entirely in the black-box setting. The proposed method generates diverse candidate sentences through random masking and applies the exponential mechanism to select the final output, thereby ensuring DP. SafeDP-Rewrite requires neither additional training nor access to internal model information, making it simple and practical to deploy. Experiments on real-world datasets demonstrate that the proposed method preserves semantic fidelity and fluency while providing privacy protection, balancing privacy and utility.
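
The two steps named in the abstract, random masking and selection by the exponential mechanism, can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names, the `[MASK]` token, the mask rate, the utility scores, and the sensitivity value are all assumptions made for illustration, and the black-box LLM call that would fill the masked positions is omitted. The sketch only shows the standard exponential mechanism, which samples a candidate with probability proportional to exp(epsilon * u / (2 * sensitivity)). Whether the overall pipeline satisfies DP depends on how the candidate set, utility function, and sensitivity are defined, which the abstract does not specify.

```python
import math
import random


def mask_tokens(tokens, mask_rate, rng, mask_token="[MASK]"):
    """Replace each token with mask_token independently with probability mask_rate.

    In a full pipeline, a black-box LLM would then fill the masked positions
    to produce one candidate rewrite (that call is not shown here).
    """
    return [mask_token if rng.random() < mask_rate else t for t in tokens]


def selection_probabilities(utilities, epsilon, sensitivity):
    """Exponential-mechanism probabilities for a list of utility scores.

    Each candidate i gets weight exp(epsilon * u_i / (2 * sensitivity)).
    Subtracting the maximum utility first avoids overflow and does not
    change the normalized probabilities.
    """
    scale = epsilon / (2.0 * sensitivity)
    max_u = max(utilities)
    weights = [math.exp(scale * (u - max_u)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(candidates, utilities, epsilon, sensitivity, rng):
    """Sample one candidate according to the exponential mechanism."""
    probs = selection_probabilities(utilities, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Illustrative usage with made-up candidates and utility scores.
rng = random.Random(0)
masked = mask_tokens("alice visited the clinic on monday".split(), 0.3, rng)
candidates = ["rewrite A", "rewrite B", "rewrite C"]  # hypothetical LLM outputs
utilities = [0.9, 0.5, 0.1]  # hypothetical scores, assumed to lie in [0, 1]
chosen = exponential_mechanism(candidates, utilities, epsilon=2.0,
                               sensitivity=1.0, rng=rng)
```

A smaller epsilon flattens the selection distribution toward uniform (stronger privacy, lower expected utility), while a larger epsilon concentrates it on the highest-scoring candidate.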
