
A Study on the Deduction of Social Issues Applying Word Embedding: With an Emphasis on News Articles related to the Disabled

  • Journal of the Korean Society for Information Management
  • Abbr : JKOSIM
  • 2018, 35(1), pp.231~250
  • DOI : 10.3743/KOSIM.2018.35.1.231
  • Publisher : Korean Society for Information Management
  • Research Area : Interdisciplinary Studies > Library and Information Science
  • Received : March 8, 2018
  • Accepted : March 22, 2018
  • Published : March 30, 2018

Garam Choi 1 Sung-Pil Choi 1

1 Kyonggi University

Accredited

ABSTRACT

In this paper, we propose a new methodology for extracting and formalizing the subjective topics discussed at a given point in time, using a set of keywords extracted automatically from online news articles. We first extracted the keywords with TF-IDF, which we selected through a series of comparative experiments on statistical weighting schemes that measure the importance of individual words in a large text collection. To calculate the semantic relations between the extracted keywords effectively, we constructed a set of word embedding vectors from about 1,000,000 separately collected news articles. Each extracted keyword was then represented as a numerical vector and clustered with the K-means algorithm. A qualitative in-depth analysis of the resulting keyword clusters showed that most of them were appropriate topics, with enough semantic coherence that we could easily assign labels to them.
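
The pipeline described in the abstract (TF-IDF keyword extraction, vector representation of keywords, K-means clustering) can be illustrated with a minimal pure-Python sketch. Everything concrete here is an assumption made for illustration: the three toy documents, the hand-made two-dimensional `embeddings` table standing in for vectors trained on news articles, the particular TF-IDF variant (relative term frequency times log inverse document frequency), the number of clusters, and the function names `tfidf_keywords` and `kmeans`. It is not the authors' implementation.

```python
import math
from collections import Counter


def tfidf_keywords(docs, top_k=2):
    """Return the top_k highest-scoring TF-IDF terms of each tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    results = []
    for doc in docs:
        tf = Counter(doc)
        # Assumed variant: relative term frequency * log(N / df).
        scores = {
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending score; break ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, initial_centroids, n_iter=20):
    """Plain Lloyd's algorithm; returns one cluster index per input vector."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda j: squared_distance(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy, pre-tokenized "articles" (hypothetical data).
docs = [
    ["welfare", "policy", "budget", "welfare", "news"],
    ["employment", "quota", "company", "employment", "news"],
    ["subway", "elevator", "access", "subway", "news"],
]

# Hand-made 2-D vectors standing in for trained word embeddings (hypothetical).
embeddings = {
    "welfare": (0.9, 0.1), "budget": (0.8, 0.2),
    "employment": (0.1, 0.9), "company": (0.2, 0.8),
    "subway": (-0.9, -0.1), "access": (-0.8, -0.2),
}

keywords = [kw for doc_kws in tfidf_keywords(docs, top_k=2) for kw in doc_kws]
vectors = [embeddings[kw] for kw in keywords]
# Seed one centroid per presumed topic so the run is deterministic.
seeds = [embeddings["welfare"], embeddings["employment"], embeddings["subway"]]
labels = kmeans(vectors, seeds)

clusters = {}
for kw, lab in zip(keywords, labels):
    clusters.setdefault(lab, []).append(kw)
print(clusters)
```

In this toy setting the word "news" occurs in every document, so its inverse document frequency is zero and it never surfaces as a keyword; the remaining keywords group by topic because their assumed vectors lie close together. With real data, the embedding table would come from a model trained on a large news corpus, and the number of clusters would have to be chosen empirically.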

This paper was written with support from the National Research Foundation of Korea.