Principles and Status of Text-to-Image Generation Artificial Intelligence

  • The Journal of Aesthetics and Science of Art
  • Abbr : JASA
  • 2025, 75(), 9
  • Publisher : The Korean Society of Aesthetics and Science of Art (한국미학예술학회)
  • Research Area : Arts and Kinesiology > Other Arts and Kinesiology
  • Received : April 28, 2025
  • Accepted : May 18, 2025
  • Published : June 30, 2025

Pyung-Jong Park 1

1 Chung-Ang University

Accredited

ABSTRACT

This article outlines the principles of text-to-image generation models and examines key issues raised by this new method of image generation. Text-to-image generation models were created by combining large language models, which revolutionized natural language processing, with image generation models. Computer vision reached a turning point thanks to the language understanding gained through pre-training and the advanced image generation ability of diffusion models. OpenAI and Google, as if in competition, released successive new models and drove advances in image generation through technological improvements. As a result, it is now possible to generate highly realistic, high-quality images simply by entering text prompts. These models can also generate elements that do not appear in the training data, which increases the diversity of images. The success of a text-to-image generation model depends on how well the generated image matches the text and on the quality of both, so writing an appropriate prompt is important. Artificial intelligence generates content automatically, but text-to-image generation still requires appropriate human intervention.

This paper was written with support from the National Research Foundation of Korea.