An Empirical Evaluation of Density-Based Clustering Techniques
Glory H. Shah1, C. K. Bhensdadia2, Amit P. Ganatra3

1Glory H. Shah, Computer Engineering at Dharmsinh Desai University, Nadiad and Assistant Professor at Charotar University of Science and Technology (CHARUSAT), Education Campus, Changa, Gujarat, India.
2C. K. Bhensdadia, Professor & Head at Department of Computer Engineering, Faculty of Technology, Dharmsinh Desai University, Nadiad, Gujarat, India.
3Amit P. Ganatra, Associate Professor at Charotar University of Science and Technology (CHARUSAT), Education Campus, Changa, Gujarat, India.

Manuscript received on February 15, 2012. | Revised Manuscript received on February 20, 2012. | Manuscript published on March 05, 2012. | PP: 216-223 | Volume-2 Issue-1, March 2012. | Retrieval Number: A0418022112 /2012©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: The emergence of modern techniques for scientific data collection has resulted in large-scale accumulation of data pertaining to diverse fields. Conventional database querying methods are inadequate for extracting useful information from such huge data banks. Cluster analysis is one of the major data analysis methods. It is the art of detecting groups of similar objects in large data sets without specifying the groups in advance by means of explicit features. Detecting clusters of points is challenging when the clusters differ in size, density and shape. The development of clustering algorithms has received a lot of attention in the last few years, and many new clustering algorithms have been proposed. This paper gives a survey of density-based clustering algorithms. DBSCAN [15] is the base algorithm for density-based clustering techniques. One advantage of these techniques is that they do not require the number of clusters to be given a priori, nor do they make any assumption about the density or the variance within the clusters that may exist in the data set. They can detect clusters of different shapes and sizes in large amounts of data containing noise and outliers. OPTICS [14], on the other hand, does not produce a clustering of a data set explicitly, but instead creates an augmented ordering of the database that represents its density-based clustering structure. This paper compares two density-based clustering methods, DBSCAN [15] and OPTICS [14], on parameters essential to a good clustering algorithm: distance type, noise ratio, run time of the simulations performed, and number of clusters formed. We analyze the algorithms in terms of the parameters essential for creating meaningful clusters. Both algorithms are tested on synthetic data sets of low as well as high dimensionality.

Keywords: DBSCAN, OPTICS, DENCLUE, Spatial Data, Intra Cluster, Inter Cluster.
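
To make the property stated in the abstract concrete (no cluster count is supplied in advance, and sparse points are reported as noise), the following is a minimal pure-Python sketch of the DBSCAN idea. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names `region_query` and `dbscan`, the parameter names `eps` and `min_pts`, the Euclidean distance, and the toy points are all chosen here for the example.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i], including i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Assign each point a cluster id (0, 1, ...) or NOISE.

    The number of clusters is not an input; it follows from the density
    parameters eps (neighborhood radius) and min_pts (minimum neighbors).
    """
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster_id += 1        # points[i] is a core point: start a cluster
        labels[i] = cluster_id
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Hypothetical toy data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two squares are recovered as clusters 0 and 1 and the isolated point is marked as noise, without the number of clusters being specified. The neighborhood search here is a linear scan, so the sketch runs in quadratic time; it is meant only to show the logic.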