A Study on the Semiautomatic Construction of Domain-Specific Relation Extraction Datasets from Biomedical Abstracts: Mainly Focusing on a Genic Interaction Dataset in Alzheimer’s Disease Domain

Sung-Pil Choi 1 Seok Jong Yu 2 Hyun Yang Cho 1

1 Kyonggi University
2 Korea Institute of Science and Technology Information

Accredited

ABSTRACT

This paper introduces a software system and process model for semi-automatically constructing domain-specific relation extraction datasets. The system takes a set of terms, such as genes, proteins, and diseases, as input and, by exploiting a massive biological interaction database, generates a set of term pairs. These pairs are then used as queries to retrieve sentences containing both terms from scientific databases. To assess the usefulness of the proposed system, this paper applies it to the construction of a genic interaction dataset for the Alzheimer’s disease domain, extracting 3,510 interaction-related sentences from 140 gene names in the area. The results of this case study indicate that the system and process could substantially improve the efficiency of dataset construction in various subfields of biomedical research.
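
The pipeline described in the abstract (input terms, then term pairs drawn from an interaction database, then sentence retrieval) can be illustrated with a minimal sketch. This is not the authors' system: the function names, the in-memory interaction list, the toy sentences, and the case-insensitive whole-word matching are all assumptions made for illustration, and a real deployment would query an actual interaction database and a literature search service instead.

```python
import re

def candidate_pairs(terms, known_interactions):
    """Keep only interactions whose two participants are both input terms."""
    term_set = set(terms)
    pairs = set()
    for a, b in known_interactions:
        if a in term_set and b in term_set and a != b:
            # Sort so that (A, B) and (B, A) collapse into one pair.
            pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)

def mentions(term, sentence):
    """True if the term occurs as a whole word (case-insensitive; an assumption)."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None

def retrieve_sentences(pairs, sentences):
    """Use each pair as a query: return sentences that mention both terms."""
    hits = []
    for a, b in pairs:
        for s in sentences:
            if mentions(a, s) and mentions(b, s):
                hits.append((a, b, s))
    return hits

# Toy inputs (hypothetical; they stand in for the gene list, the
# interaction database, and the sentences of retrieved abstracts).
terms = ["APP", "PSEN1", "APOE", "MAPT"]
known_interactions = [("PSEN1", "APP"), ("APOE", "LRP1"), ("MAPT", "GSK3B")]
sentences = [
    "PSEN1 mutations alter the processing of APP.",
    "APOE is a major genetic risk factor.",
    "Nothing happens to PSEN1 expression here.",
]

pairs = candidate_pairs(terms, known_interactions)
hits = retrieve_sentences(pairs, sentences)
for a, b, s in hits:
    print(f"{a} / {b}: {s}")
```

Restricting queries to pairs already recorded in an interaction database, rather than all combinations of the input terms, keeps the number of queries small and biases retrieval toward sentences likely to describe an interaction. The retrieved sentences would still need review before being used as labeled data, which is consistent with the semi-automatic framing of the abstract.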
