
Optimization of a Hybrid RAG System for Korean Legal QA

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2025, 30(8), pp.53~63
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : July 15, 2025
  • Accepted : August 11, 2025
  • Published : August 29, 2025

Jun-Won Seo 1, Junghye Min 1

1 Inha Technical College

Accredited

ABSTRACT

Legal question-answering systems demand high reliability and accuracy, and large language models (LLMs) have been actively explored as a way to meet these requirements. However, pretrained LLMs often fail to reflect the most recent case law or specific legal provisions, which can lead to so-called “hallucination,” the generation of factually incorrect information. To address this issue, Retrieval-Augmented Generation (RAG), which grounds responses in external documents, has received growing attention. This study develops a RAG system tailored to the Korean legal domain by optimizing its key components: document chunking, embedding models, and retrieval strategies. Experimental results show that the best performance comes from combining BM25 with an embedding model fine-tuned on Korean legal data, applied to semantically chunked documents. The proposed hybrid retrieval approach outperformed baseline methods in both retrieval accuracy and the factual consistency of the generated answers.
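To make the idea of hybrid retrieval concrete, the following is a minimal Python sketch that fuses a sparse BM25 score with a dense cosine-similarity score. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two signals are combined, so the min-max normalization, the weighted sum, and the `alpha` parameter are assumptions, as are the function names, the toy tokens, and the two-dimensional vectors that stand in for the output of a fine-tuned embedding model.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the query."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in values]


def hybrid_rank(query_tokens, docs_tokens, query_vec, doc_vecs, alpha=0.5):
    """Rank documents by alpha * BM25 + (1 - alpha) * cosine (assumed fusion)."""
    sparse = min_max(bm25_scores(query_tokens, docs_tokens))
    dense = min_max([cosine(query_vec, v) for v in doc_vecs])
    fused = [alpha * s + (1 - alpha) * d for s, d in zip(sparse, dense)]
    order = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    return order, fused


# Toy corpus: each chunk is already tokenized and has a placeholder embedding.
docs_tokens = [
    ["lease", "deposit", "return"],
    ["criminal", "theft", "penalty"],
    ["lease", "contract", "termination"],
]
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.2]]  # hypothetical embeddings

order, fused = hybrid_rank(["lease", "deposit"], docs_tokens, [1.0, 0.1], doc_vecs)
print(order)
```

In this sketch, `alpha` controls how much weight exact term overlap receives relative to semantic similarity. Other fusion schemes, such as reciprocal rank fusion, would fit the same interface, and in practice the token lists would come from a Korean tokenizer applied to the semantically chunked documents.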
