Searching Sequential Patterns by Approximation Algorithm

  • Journal of The Korea Society of Computer and Information
  • Abbr : JKSCI
  • 2009, 14(5), pp.29-36
  • Publisher : The Korean Society Of Computer And Information
  • Research Area : Engineering > Computer Science

Sansarbold Garamragchaa 1 황영섭 1

1 Sun Moon University

Accredited

ABSTRACT

Sequential pattern mining, which discovers frequent subsequences as patterns in a sequence database, is an important data mining problem with broad applications. Because a sequential pattern in a DNA sequence can be a motif, we studied how to find sequential patterns in DNA sequences. Most previously proposed mining algorithms match exactly against a sequential pattern definition, so in practice they cannot cope with noisy environments or inaccurate data. These problems occur frequently in DNA sequences, which are biological data. We investigated an approximate matching method to deal with such cases. Our idea is based on the observation that all occurrences of a frequent pattern can be classified into groups, which we call approximated patterns. The existing PrefixSpan algorithm can successfully find sequential patterns in a long sequence. We improved the PrefixSpan algorithm to find approximate sequential patterns. In our experiments, the proposed method found five times as many repeats as PrefixSpan when the pattern length was 4.
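
To make the contrast between exact and approximate matching concrete, the following is a minimal illustrative sketch and not the authors' algorithm. The abstract does not specify how approximation is defined, so the sketch assumes contiguous windows and a Hamming-distance threshold; the helper names `hamming` and `count_occurrences`, the parameter `max_mismatches`, and the toy DNA string are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def count_occurrences(sequence: str, pattern: str, max_mismatches: int = 0) -> int:
    """Count windows of `sequence` that differ from `pattern` in at most
    `max_mismatches` positions (assumed similarity measure, not the paper's)."""
    k = len(pattern)
    return sum(
        hamming(sequence[i:i + k], pattern) <= max_mismatches
        for i in range(len(sequence) - k + 1)
    )


# Toy sequence, made up for illustration only.
dna = "ACGTACGAACGTTCGT"
exact = count_occurrences(dna, "ACGT")                     # exact matching
approx = count_occurrences(dna, "ACGT", max_mismatches=1)  # one mismatch allowed
print(exact, approx)  # 2 4
```

On this toy input, exact matching finds the two literal copies of `ACGT`, while allowing one mismatch also groups the noisy variants `ACGA` and `TCGT` with the same pattern. The counts here say nothing about the ratio reported in the paper; they only show why tolerating mismatches can raise the number of repeats found.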

This paper was written with support from the National Research Foundation of Korea.