KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks

Seungjae Park (박승재); Sung-Bae Cho (조성배); Ha Young Kim (김하영)

doi:10.9708/jksci.2025.30.08.029

@article{ART003234176,
author={Seungjae Park and Sung-Bae Cho and Ha Young Kim},
title={KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks},
journal={Journal of The Korea Society of Computer and Information},
issn={1598-849X},
year={2025},
volume={30},
number={8},
pages={29-39},
doi={10.9708/jksci.2025.30.08.029}
}

TY - JOUR
AU - Seungjae Park
AU - Sung-Bae Cho
AU - Ha Young Kim
TI - KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks
JO - Journal of The Korea Society of Computer and Information
PY - 2025
VL - 30
IS - 8
PB - The Korean Society Of Computer And Information
SP - 29
EP - 39
SN - 1598-849X
AB - Periodic corporate filings are structured documents combining text and tables. Practical use of these documents requires comprehensive reasoning to integrate and interpret information across multiple sections. However, current large language models (LLMs) struggle with such reasoning, and existing financial benchmarks are insufficient for evaluating practical skills like tool usage. To address this gap, we develop KRAFT³-QA, a new benchmark based on Korean corporate filings. KRAFT³-QA consists of multiple-choice tasks that require integrating information across various sections. Model performance is evaluated using both accuracy and valid response rate. Experiments with major open LLMs demonstrate that model scale and reasoning architecture can affect performance. This study presents a real-document-based, tool-augmented QA benchmark and an evaluation framework, establishing a technical foundation for quantitatively assessing the real-world problem-solving capabilities of LLM agents.
KW - Large Language Model;Benchmark Dataset;Text-Table Question Answering;Tool-augmented Agent;Financial Domain
DO - 10.9708/jksci.2025.30.08.029
ER -

Seungjae Park, Sung-Bae Cho and Ha Young Kim. (2025). KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks. Journal of The Korea Society of Computer and Information, 30(8), 29-39.

Seungjae Park, Sung-Bae Cho and Ha Young Kim. 2025, "KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks", Journal of The Korea Society of Computer and Information, vol.30, no.8 pp.29-39. Available from: doi:10.9708/jksci.2025.30.08.029

Seungjae Park, Sung-Bae Cho, Ha Young Kim "KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks" Journal of The Korea Society of Computer and Information 30.8 pp.29-39 (2025) : 29.

Seungjae Park, Sung-Bae Cho, Ha Young Kim. KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks. 2025; 30(8), 29-39. Available from: doi:10.9708/jksci.2025.30.08.029

Seungjae Park, Sung-Bae Cho and Ha Young Kim. "KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks" Journal of The Korea Society of Computer and Information 30, no.8 (2025) : 29-39. doi: 10.9708/jksci.2025.30.08.029

Seungjae Park; Sung-Bae Cho; Ha Young Kim. KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks. Journal of The Korea Society of Computer and Information, 30(8), 29-39. doi: 10.9708/jksci.2025.30.08.029

Seungjae Park; Sung-Bae Cho; Ha Young Kim. KRAFT³-QA: Korean financial text-table benchmark for evaluating tool-augmented agents on QA tasks. Journal of The Korea Society of Computer and Information. 2025; 30(8) 29-39. doi: 10.9708/jksci.2025.30.08.029

Journal of The Korea Society of Computer and Information 2024 KCI Impact Factor : 0.81