Database Clustering using Intelligent Techniques

Abstract

Owing to the huge amounts of data collected in databases, cluster analysis has recently become a highly active topic in data mining research. In data mining, efforts have focused on finding methods for efficient and effective cluster analysis in large databases. This paper proposes two new partitioning clustering methods: the first is a modified k-means clustering algorithm that uses Variable Neighborhood Search as a metaheuristic search, and the second is a modified k-means clustering algorithm that uses cuckoo search as a swarm intelligence technique. The proposed algorithms do not require the number of clusters as input; instead, they determine it automatically, using a cluster validity measure to select the best clustering. This is their fundamental characteristic. The experiments were carried out on databases of many different sizes, some of them obtained from the University of California, Irvine (UCI) Machine Learning Repository, which maintains 246 data sets as a service to the machine learning community. From these experiments, it is concluded that the proposed methods cut the time needed to reach the best solution to about half of the time otherwise needed to perform the same operations, and that they also reduced the number of iterations needed to reach the best solution. In addition, the proposed clustering methods give the best clustering quality among the methods compared; performance improved by 10% to 20% relative to the original k-means clustering method.
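The idea of choosing the number of clusters by a validity measure, rather than asking the user for it, can be illustrated with a minimal sketch. It is not the proposed algorithm: it scans candidate values of k exhaustively instead of using Variable Neighborhood Search or cuckoo search, and it assumes the Davies-Bouldin index (lower is better) as the validity measure, since the abstract does not name one. All function names, the farthest-point initialization, and the toy data are hypothetical choices made for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start from the first point,
    then repeatedly add the point farthest from all chosen centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, iters=100):
    """Plain k-means (Lloyd's algorithm) with the seeding above."""
    centers = farthest_point_init(points, k)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)

def davies_bouldin(points, centers, labels):
    """Davies-Bouldin index: mean over clusters of the worst
    (scatter_i + scatter_j) / distance(center_i, center_j) ratio."""
    k = len(centers)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        scatter.append(
            sum(math.sqrt(sq_dist(p, centers[j])) for p in members) / len(members)
            if members else 0.0)
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if i != j:
                d = math.sqrt(sq_dist(centers[i], centers[j]))
                if d == 0.0:
                    return math.inf  # coincident centers: treat as invalid
                worst = max(worst, (scatter[i] + scatter[j]) / d)
        total += worst
    return total / k

def choose_k(points, k_min=2, k_max=5):
    """Return (k, index value) for the k with the lowest Davies-Bouldin index."""
    best = None
    for k in range(k_min, k_max + 1):
        centers, labels = kmeans(points, k)
        score = davies_bouldin(points, centers, labels)
        if best is None or score < best[1]:
            best = (k, score)
    return best

# Toy data: three small, well-separated groups of five points each.
offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx, dy in offsets]
best_k, best_score = choose_k(points)
print(best_k, best_score)
```

On this toy data the scan selects three clusters. Scanning every candidate k costs one full k-means run per candidate, which is the kind of expense a guided search over k is meant to reduce.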