Selectivity Estimation Using Frequent Itemset Mining

  • Journal of Knowledge Information Technology and Systems
  • Abbr : JKITS
  • 2015, 10(1), pp.69-78
  • Publisher : Korea Knowledge Information Technology Society
  • Research Area : Interdisciplinary Studies > Interdisciplinary Research
  • Published : February 28, 2015

엄보윤 1 Christopher Jermaine 2 이춘화 3

1Electronics and Telecommunications Research Institute (ETRI)
2Rice University
3Hanyang University

Accredited

ABSTRACT

Query optimization is an important function of a database management system, because the quality of the plan chosen by the query optimizer can significantly affect overall query execution time. Under cost-based optimization, the optimizer estimates the cost of each candidate query plan from the underlying data distribution, as summarized in synopses of the database relations. The most common synopses in commercial databases have been histograms. However, when there is correlation in the data, one-dimensional histograms can yield poor estimates. Motivated by this, we propose a new approach that performs more accurate selectivity estimation, even for correlated data. To deal with the correlation that may exist in the data, we adopt well-known data mining techniques and use frequent itemset mining to extract attribute values that frequently occur together. Through experimentation, we found that our approach is effective in modeling correlations and that it approximates intermediate relations more accurately. In particular, it gives precise estimates for correlated data.
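
The contrast drawn in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy relation, the brute-force miner, and every name below (`ROWS`, `single_column_freqs`, `frequent_itemsets`, `estimate_independent`, `estimate_with_itemsets`) are assumptions made for illustration only. The sketch compares an estimate built from per-attribute frequencies under an independence assumption, which is roughly what one-dimensional histograms support, with an estimate that looks up the support of a mined itemset.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy relation: city and zip prefix are perfectly correlated.
ROWS = (
    [{"city": "Seoul", "zip": "0"}] * 50
    + [{"city": "Busan", "zip": "4"}] * 50
)


def single_column_freqs(rows):
    """Relative frequency of each (attribute, value) pair, one attribute at a time."""
    counts = Counter()
    for row in rows:
        for item in row.items():
            counts[item] += 1
    return {item: c / len(rows) for item, c in counts.items()}


def frequent_itemsets(rows, min_support, max_size=2):
    """Brute-force miner: support of every item combination of size 2..max_size.

    A real system would use Apriori or FP-growth; this is only a stand-in.
    """
    counts = Counter()
    for row in rows:
        items = sorted(row.items())
        for k in range(2, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(rows)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def estimate_independent(freqs, predicate):
    """Selectivity of a conjunction of equality predicates, assuming independence."""
    selectivity = 1.0
    for item in predicate.items():
        selectivity *= freqs.get(item, 0.0)
    return selectivity


def estimate_with_itemsets(freqs, itemsets, predicate):
    """Use the mined support when the predicate matches a frequent itemset,
    otherwise fall back to the independence estimate."""
    key = frozenset(predicate.items())
    if key in itemsets:
        return itemsets[key]
    return estimate_independent(freqs, predicate)


FREQS = single_column_freqs(ROWS)
ITEMSETS = frequent_itemsets(ROWS, min_support=0.1)
QUERY = {"city": "Seoul", "zip": "0"}

INDEPENDENT = estimate_independent(FREQS, QUERY)
MINED = estimate_with_itemsets(FREQS, ITEMSETS, QUERY)
print(f"independence estimate: {INDEPENDENT:.2f}")
print(f"itemset-based estimate: {MINED:.2f}")
```

On this toy relation, half of the rows satisfy the query. The independence estimate multiplies the two per-attribute frequencies and underestimates the result, while the itemset lookup returns the observed joint support. How the paper handles predicates that do not match a mined itemset is not stated in the abstract, so the fallback here is only one plausible choice.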

This paper was written with support from the National Research Foundation of Korea.