ProbCert: Probabilistic Certification Framework for Black-box LLM Outputs

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2026, 31(2), pp.65~73
  • DOI : 10.9708/jksci.2026.31.02.065
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : December 29, 2025
  • Accepted : February 19, 2026
  • Published : February 27, 2026

Jong Wook Kim 1

1 Sangmyung University

Accredited

ABSTRACT

Large language models (LLMs) are widely used in natural language processing applications such as classification, summarization, and question answering. However, commercial LLMs are typically provided as black-box APIs, making it difficult to interpret the causes of their outputs or to quantitatively assess their reliability. In particular, existing approaches fail to provide a probabilistic characterization of how often output changes occur in response to input variations, or how consistently such changes arise. To address this limitation, this paper proposes ProbCert, a framework for estimating and certifying output change probabilities under input perturbations in black-box LLM settings. ProbCert repeatedly generates semantically valid input variations, observes whether output changes occur, and estimates the corresponding change probability, while continuing queries until a user-specified confidence level and error tolerance are satisfied. The framework integrates multiple confidence interval estimation methods, including the Wilson score interval, the empirical Bernstein bound, and the Clopper–Pearson interval, enabling systematic comparison of estimation accuracy and query efficiency under a unified procedure. Experimental results on both classification and generation tasks demonstrate that all variants of ProbCert reliably satisfy the specified confidence and error requirements. In particular, the Wilson score–based variant achieves certification with the fewest LLM queries, highlighting its practical efficiency in commercial LLM environments.
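The query loop described in the abstract can be illustrated with a minimal Python sketch of the Wilson score variant only. This is not the authors' implementation. The names `wilson_interval`, `certify_change_probability`, `make_simulated_oracle`, and `Certificate` are hypothetical, and the stopping rule (interval half-width at or below the error tolerance) is an assumption, since the abstract does not state the exact criterion. The black-box LLM and the perturbation generator are replaced by a simulated Bernoulli oracle. Re-checking a fixed-level interval after every query is also a simplification that does not by itself correct for optional stopping.

```python
import random
from dataclasses import dataclass
from statistics import NormalDist
from typing import Callable


@dataclass
class Certificate:
    estimate: float   # observed fraction of perturbations that changed the output
    lower: float      # lower end of the Wilson interval
    upper: float      # upper end of the Wilson interval
    queries: int      # number of oracle calls used
    certified: bool   # True if the half-width target was met within the budget


def wilson_interval(changes: int, n: int, confidence: float) -> tuple[float, float]:
    """Two-sided Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    p_hat = changes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2.0 * n)) / denom
    half = (z / denom) * (p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) ** 0.5
    return max(0.0, center - half), min(1.0, center + half)


def certify_change_probability(
    output_changed: Callable[[], bool],
    confidence: float = 0.95,
    epsilon: float = 0.05,
    max_queries: int = 10_000,
) -> Certificate:
    """Query until the interval half-width is <= epsilon (assumed stopping rule)."""
    if max_queries < 1:
        raise ValueError("max_queries must be at least 1")
    changes = 0
    lower, upper = 0.0, 1.0
    for n in range(1, max_queries + 1):
        # One call stands in for: perturb the input, query the LLM, compare outputs.
        if output_changed():
            changes += 1
        lower, upper = wilson_interval(changes, n, confidence)
        if (upper - lower) / 2.0 <= epsilon:
            return Certificate(changes / n, lower, upper, n, True)
    return Certificate(changes / max_queries, lower, upper, max_queries, False)


def make_simulated_oracle(p_change: float, seed: int) -> Callable[[], bool]:
    """Hypothetical stand-in for the LLM: reports a change with probability p_change."""
    rng = random.Random(seed)
    return lambda: rng.random() < p_change


if __name__ == "__main__":
    print(certify_change_probability(make_simulated_oracle(0.2, seed=0)))
```

In a real deployment, `output_changed` would wrap the perturbation generator and the API call. An empirical Bernstein or Clopper–Pearson variant would replace only `wilson_interval`, leaving the loop unchanged, which is one way to read the abstract's "unified procedure".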
