Study on Extraction of Keywords Using TF-IDF and Text Structure of Novels

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2015, 20(2), pp.121-129
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science

YOU Eun-Soon 1 최건희 1 Seung-Hoon Kim 1

1 Dankook University

Accredited

ABSTRACT

With the explosive growth of information about books, a growing number of customers find it difficult to choose a book. Against this backdrop, book recommendation systems are becoming more important, since they can offer appropriate information about books and ultimately encourage customers to buy. However, existing recommendation systems based on bibliographic information or user data suffer from unreliable recommendation results. For this reason, a recommendation system needs to reflect semantic information extracted from the main body text of a book. Accordingly, as preliminary research, this paper proposes a method for extracting keywords from the main body of novels using the TF-IDF method together with the text structure. To this end, the texts of 100 novels were collected and divided into four structural elements: preface, dialogue, non-dialogue, and closing. The TF-IDF weight of each keyword was then calculated. The results show that keyword extraction accuracy improves by 42.1% when more weight is given to dialogue and the preface and closing are included, compared with using the main body alone.
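
The approach summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the section weights nor the exact TF-IDF variant. The weight values in `SECTION_WEIGHTS`, the function names, and the toy token lists are therefore assumptions for illustration only. Tokenization (for Korean text, morphological analysis) is assumed to have been done beforehand.

```python
import math
from collections import Counter

# Hypothetical weights: the abstract only says dialogue receives "more weight".
SECTION_WEIGHTS = {"preface": 1.0, "dialogue": 2.0, "non_dialogue": 1.0, "closing": 1.0}


def weighted_tf(sections, weights=SECTION_WEIGHTS):
    """Sum term counts over a novel's sections, scaling each by its section weight.

    `sections` maps a section name to a list of already-tokenized terms.
    """
    tf = Counter()
    for name, tokens in sections.items():
        w = weights.get(name, 1.0)
        for tok in tokens:
            tf[tok] += w
    return tf


def tfidf(corpus, weights=SECTION_WEIGHTS):
    """Return one {term: score} dict per novel, using idf = ln(N / df)."""
    n_docs = len(corpus)
    tfs = [weighted_tf(doc, weights) for doc in corpus]
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())  # each novel contributes at most 1 per term
    return [{t: f * math.log(n_docs / df[t]) for t, f in tf.items()} for tf in tfs]


def top_keywords(scores, k=3):
    """Top-k terms by score; ties are broken alphabetically for determinism."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Toy data, invented for illustration (not from the paper's 100-novel corpus).
novel_a = {"preface": ["sea"], "dialogue": ["whale", "whale", "captain"],
           "non_dialogue": ["sea", "ship"], "closing": ["whale"]}
novel_b = {"preface": ["city"], "dialogue": ["detective", "murder"],
           "non_dialogue": ["city", "ship"], "closing": ["murder"]}

scores = tfidf([novel_a, novel_b])
print(top_keywords(scores[0], k=1))
```

In this toy corpus, "ship" appears in both novels and so receives an IDF of zero, while terms that occur in dialogue are counted twice. Changing `SECTION_WEIGHTS` is the only step needed to compare different weighting schemes.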
