Text-Controlled 4D Human Generation

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2026, 31(4), pp.55~66
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : February 20, 2026
  • Accepted : April 1, 2026
  • Published : April 30, 2026

Chanwoo Kim¹, Sanghun Kim², Hwasup Lim¹

¹University of Science and Technology (UST)
²Broz Co., Ltd.

Accredited

ABSTRACT

Generating 4D humans from textual descriptions has become an important problem for applications such as the metaverse and virtual reality. However, previous Text-to-4D generation methods typically generate appearance and motion jointly, which limits controllability and incurs high computational cost. In this paper, we propose a novel text-driven 4D human generation pipeline that integrates separately generated appearance and motion. First, from the given appearance and motion descriptions, a human appearance image and a motion sequence are generated using Stable Diffusion and the Motion Diffusion Model (MDM), respectively. Next, MusePose combines the generated appearance and motion into a frontal-view video, which SV4D then extends into multi-view videos. Finally, Grid4D is employed to learn a 4D representation from the synthesized multi-view videos. To validate the proposed pipeline, we construct a dataset for 4D human generation and conduct quantitative and qualitative evaluations on the rendered videos. Experimental results show that the proposed method achieves 77.5% in Dynamic Degree, 58.3% in Aesthetic Quality, and 24.8% in Overall Consistency, indicating that, although trade-offs exist among the metrics, the method maintains a balance between dynamic expressiveness and visual quality.
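For readers who want the pipeline at a glance, the four stages described above can be summarized in pseudocode. The sketch below is a minimal conceptual outline under stated assumptions, not the authors' implementation: every function in it is a hypothetical stand-in for the named model, since the real interfaces of Stable Diffusion, MDM, MusePose, SV4D, and Grid4D differ from these signatures.

```python
# Conceptual sketch of the proposed text-to-4D human pipeline.
# All functions here are hypothetical stand-ins for the named models;
# the actual Stable Diffusion / MDM / MusePose / SV4D / Grid4D APIs differ.

def generate_appearance_image(prompt: str):
    """Hypothetical wrapper: appearance text -> human appearance image (Stable Diffusion)."""
    raise NotImplementedError("stand-in for a Stable Diffusion call")

def generate_motion_sequence(prompt: str):
    """Hypothetical wrapper: motion text -> skeletal motion sequence (Motion Diffusion Model)."""
    raise NotImplementedError("stand-in for an MDM call")

def animate_with_musepose(image, motion):
    """Hypothetical wrapper: appearance image + motion -> frontal-view video (MusePose)."""
    raise NotImplementedError("stand-in for a MusePose call")

def lift_to_multiview(video):
    """Hypothetical wrapper: frontal-view video -> multi-view videos (SV4D)."""
    raise NotImplementedError("stand-in for an SV4D call")

def fit_grid4d(multiview_videos):
    """Hypothetical wrapper: multi-view videos -> learned 4D representation (Grid4D)."""
    raise NotImplementedError("stand-in for Grid4D optimization")

def text_to_4d_human(appearance_prompt: str, motion_prompt: str):
    # Step 1: generate appearance and motion independently from their prompts,
    # which is what gives the pipeline its separate controllability.
    image = generate_appearance_image(appearance_prompt)
    motion = generate_motion_sequence(motion_prompt)
    # Step 2: combine them into a single frontal-view human video.
    frontal_video = animate_with_musepose(image, motion)
    # Step 3: extend the frontal view into multi-view videos.
    multiview_videos = lift_to_multiview(frontal_video)
    # Step 4: learn the 4D representation from the synthesized multi-view videos.
    return fit_grid4d(multiview_videos)
```

The key design choice reflected in the sketch is that appearance and motion are generated by independent models and only fused in Step 2, so either prompt can be changed without regenerating the other component.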
