Comparison of Web Crawler Performance for Web Record Management

  • The Korean Journal of Archival Studies
  • 2022, (74), pp.155-186
  • DOI : 10.20923/kjas.2022.74.155
  • Publisher : Korean Society Of Archival Studies
  • Research Area : Interdisciplinary Studies > Library and Information Science
  • Received : September 30, 2022
  • Accepted : October 22, 2022
  • Published : October 31, 2022

Chang Jinho 1 Kwon Hyuksang 1 Lee Kyumo 1 CHOI,DONG-JOON 2

1 ECPlaza (이씨플라자)
2 Korea Trade Network, KTNET ((주)한국무역정보통신)

Accredited

ABSTRACT

As of 2022, 17,000 Internet sites of public institutions are registered on the ‘Government 24’ website (www.gov.kr) of the Ministry of the Interior and Safety. Transferring these websites directly from the records-producing institution to the records-management institution, which manages websites as records, requires considerable human and material resources and time. In addition, it is practically difficult for records-management institutions to migrate and operate the various software and application technologies required to run each website. To overcome these practical limitations, websites are collected automatically from a remote location using web crawler software, both domestically and abroad. This study compared the performance of web crawlers needed to remotely collect and manage public Internet websites as records. The most suitable web crawler was selected through a step-by-step review of several web crawlers drawn from previous studies and other literature. In the evaluation process, the crawlers were applied to several public agency websites to compare their actual performance. The study provides empirical and specific performance comparison information for organizations that need to choose a web crawler.
