A Study on an Algorithm for Verifying Duplicate Records in a Union Catalog

  • Journal of the Korean Society for Library and Information Science
  • 2003, 37(4), pp.69-88
  • Publisher : Korean Society for Library and Information Science (한국문헌정보학회)
  • Research Area : Interdisciplinary Studies > Library and Information Science

조순영 1

1 Korea Education and Research Information Service (한국교육학술정보원)

Accredited

ABSTRACT

This study develops a new duplicate detection algorithm to improve database quality. The algorithm varies its analysis by language and bibliographic type, and it checks individual elements within the bibliographic data, not just MARC fields. It computes a degree of similarity and applies weight values so that simple input errors do not cause records to be eliminated. The algorithm was tested on 7,649 records newly uploaded over the preceding year against a sample master database of 210,000 records. The findings show that the new algorithm improved the duplicate recall rate by 36.2%.
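
The idea of scoring records by weighted element similarity can be illustrated with a minimal sketch. It is not the algorithm from the paper: the element names, the weight values, the threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration, since the abstract does not give these details.

```python
from difflib import SequenceMatcher

# Hypothetical element weights; the abstract does not state the actual values.
WEIGHTS = {"title": 0.5, "author": 0.2, "publisher": 0.1, "year": 0.2}
# Hypothetical cutoff above which two records are flagged as duplicates.
THRESHOLD = 0.85


def normalize(value: str) -> str:
    """Lowercase and keep only letters and digits, ignoring spacing and punctuation."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def element_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two element values after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def record_similarity(rec_a: dict, rec_b: dict) -> float:
    """Weighted average of element similarities over elements present in both records."""
    score = 0.0
    total = 0.0
    for key, weight in WEIGHTS.items():
        a, b = rec_a.get(key), rec_b.get(key)
        if not a or not b:
            continue  # skip elements missing from either record
        score += weight * element_similarity(a, b)
        total += weight
    return score / total if total else 0.0


def is_duplicate(rec_a: dict, rec_b: dict, threshold: float = THRESHOLD) -> bool:
    """Flag a pair as duplicates when the weighted similarity reaches the threshold."""
    return record_similarity(rec_a, rec_b) >= threshold


# Made-up records: the second has a typing error in the title.
master = {"title": "Library Automation Handbook", "author": "Kim, Minsu",
          "publisher": "Example Press", "year": "2001"}
typo = {"title": "Libary Automation Handbook", "author": "Kim, Minsu",
        "publisher": "Example Press", "year": "2001"}
other = {"title": "Cataloging Rules Explained", "author": "Lee, Jiwoo",
         "publisher": "Sample Books", "year": "1999"}

print(is_duplicate(master, typo), is_duplicate(master, other))
```

Because the score is graded rather than an exact match on a key, a record with a single typing error in one element can still be recognized as a duplicate, which is the behavior the abstract attributes to the similarity and weight values.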
