AI responses to unethical directive speech acts: The case of indirect and evasive strategies (비윤리적 지시화행에 대한 인공지능의 응대 양상: 우회적 지시 전략을 중심으로)

Kim Jae-Hee (김재희); Hansaem Kim (김한샘)

@article{ART003282709,
author={Kim Jae-Hee and Hansaem Kim},
title={AI responses to unethical directive speech acts: The case of indirect and evasive strategies},
journal={The Sociolinguistic Journal of Korea},
issn={1226-4822},
year={2025},
volume={33},
number={4},
pages={43--80}
}

TY - JOUR
AU - Kim Jae-Hee
AU - Hansaem Kim
TI - AI responses to unethical directive speech acts: The case of indirect and evasive strategies
JO - The Sociolinguistic Journal of Korea
PY - 2025
VL - 33
IS - 4
PB - The Sociolinguistic Society Of Korea
SP - 43
EP - 80
SN - 1226-4822
AB - This study examines how Large Language Models (LLMs) recognize and refuse unethical directive speech acts by analyzing their responses to indirect and evasive user requests. Based on the Cross-Cultural Speech Act Realization Project (CCSARP), directive prompts were constructed by varying degrees of indirectness to evaluate the models’ pragmatic inference abilities. The study was conducted in two stages. First, a high rate of information leakage was observed for indirect directives using ChatGPT-4o (February 2025 version). Second, newer models—GPT-5, Claude Sonnet 3.7 and 4, and Gemini 2.5 Flash—were tested across four categories of unethical directives through multiturn dialogues. Logistic regression with Benjamini–Hochberg FDR correction revealed that although newer models displayed improved refusal performance overall, they remained vulnerable to highly indirect and non-conventional directives, particularly those related to discrimination and harmful behaviors. These results suggest that current AI safety systems rely heavily on surface-level keyword filtering, indicating the need for models to better learn diverse directive strategies and expressions in Korean. Moving beyond technology-centered safety evaluation, this study experimentally analyzes AI pragmatic response mechanisms and proposes directions for fostering ethical communication in future human–AI interactions.
KW - AI language ability evaluation;unethical directive speech acts;CCSARP;safety evaluation;ethical response
DO -
UR -
ER -
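
The abstract above reports logistic regression with Benjamini–Hochberg FDR correction. As a rough illustration of the correction step only, the Python sketch below applies the standard Benjamini–Hochberg step-up adjustment to a list of p-values. The function name `benjamini_hochberg`, the 0.05 threshold, and the example p-values are assumptions made for illustration; they are not taken from the paper and do not reproduce its analysis.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted_p_values, reject_flags) in the original input order.

    Standard Benjamini-Hochberg step-up procedure: the i-th smallest
    p-value (1-based rank i out of m tests) is scaled by m / i, and the
    adjusted values are made monotone by taking a running minimum from
    the largest p-value downward.
    """
    m = len(p_values)
    if m == 0:
        return [], []

    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min

    # A hypothesis is rejected when its adjusted p-value is at most alpha.
    reject = [q <= alpha for q in adjusted]
    return adjusted, reject


# Hypothetical p-values for illustration only, NOT results from the study.
example_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted_p, rejected = benjamini_hochberg(example_p, alpha=0.05)
```

In a setting like the one the abstract describes, each p-value would presumably come from one coefficient or contrast in the fitted logistic regression, and the adjustment controls the expected share of false discoveries across all of those tests rather than the error rate of each test separately.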

Kim Jae-Hee and Hansaem Kim. (2025). AI responses to unethical directive speech acts: The case of indirect and evasive strategies. The Sociolinguistic Journal of Korea, 33(4), 43-80.

Kim Jae-Hee and Hansaem Kim. 2025, "AI responses to unethical directive speech acts: The case of indirect and evasive strategies", The Sociolinguistic Journal of Korea, vol.33, no.4, pp.43-80.

Kim Jae-Hee, Hansaem Kim. "AI responses to unethical directive speech acts: The case of indirect and evasive strategies." The Sociolinguistic Journal of Korea 33.4 (2025): 43-80.

Kim Jae-Hee, Hansaem Kim. AI responses to unethical directive speech acts: The case of indirect and evasive strategies. The Sociolinguistic Journal of Korea. 2025; 33(4), 43-80.

Kim Jae-Hee and Hansaem Kim. "AI responses to unethical directive speech acts: The case of indirect and evasive strategies" The Sociolinguistic Journal of Korea 33, no.4 (2025) : 43-80.

Kim Jae-Hee; Hansaem Kim. AI responses to unethical directive speech acts: The case of indirect and evasive strategies. The Sociolinguistic Journal of Korea, 2025, 33(4), 43-80.

Kim Jae-Hee; Hansaem Kim. AI responses to unethical directive speech acts: The case of indirect and evasive strategies. The Sociolinguistic Journal of Korea. 2025; 33(4) 43-80.



The Sociolinguistic Journal of Korea, 2024 KCI Impact Factor: 0.55
