Extraction of similar XML data based on XML structure and processing unit

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2017, 22(4), pp.59-65
  • DOI : 10.9708/jksci.2017.22.04.059
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science
  • Received : March 14, 2017
  • Accepted : April 14, 2017
  • Published : April 28, 2017

Jong-Hyun Park 1

1 Chungnam National University

Accredited

ABSTRACT

XML has established itself as the format for data exchange on the Internet, and the volume of XML instances is large. Extracting similar information from XML instances is therefore a research topic, but existing work on it is insufficient. In this paper, we extract similar information from various kinds of XML instances that serve the same goal. We use only the structural information of an XML instance for extraction, because some XML instances are described without a schema. To extract similar information efficiently, we propose a minimum unit of processing and two approaches for finding that unit. The first is a structure-based method, which uses only the structural information of the XML instance; the second is a measure-based method, which finds the unit by a numerical formula. Both approaches can be applied to any application that needs to extract similar information from XML data, and they can also be used for HTML instances.
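
The abstract does not spell out either method, so the following is only a minimal sketch of the general idea of finding a repeated unit from structure alone, not the paper's algorithm. It assumes that a candidate unit is a set of sibling elements sharing the same tag structure; the helper names `signature` and `candidate_units` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def signature(elem):
    """Structure-only signature: the tag plus the sorted, de-duplicated
    signatures of its children. Text and attributes are ignored."""
    children = sorted({signature(child) for child in elem})
    return elem.tag + "(" + ",".join(children) + ")"


def candidate_units(root):
    """Return (parent tag, signature, count) for every group of two or
    more siblings that share a signature. This grouping rule is an
    assumption made for illustration, not the paper's definition."""
    groups = []
    for parent in root.iter():
        by_signature = defaultdict(list)
        for child in parent:
            by_signature[signature(child)].append(child)
        for sig, elems in by_signature.items():
            if len(elems) >= 2:
                groups.append((parent.tag, sig, len(elems)))
    return groups


# Hypothetical sample instance with no schema attached.
SAMPLE = """<catalog>
  <book><title>A</title><price>10</price></book>
  <book><title>B</title><price>12</price></book>
  <note>misc</note>
</catalog>"""

units = candidate_units(ET.fromstring(SAMPLE))
print(units)
```

In this sketch the two `book` elements are grouped because their child tags match, while the single `note` element is not. De-duplicating child signatures means that a record with one `title` and a record with several still compare as equal, which is one possible design choice among many.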

This paper was written with support from the National Research Foundation of Korea.