A New Improved Hybridized K-MEANS Clustering Algorithm with Improved PCA Optimized with PSO for High Dimensional Data Set
H. S. Behera1, Abhishek Ghosh2, Sipak Ku. Mishra3

1H.S. Behera, Faculty in Dept. of Computer Science and Engineering, Veer Surendra Sai University of Technology (VSSUT), Burla, Odisha, India.
2Abhishek Ghosh, B. Tech. student in Dept. of Computer Science and Engineering, Veer Surendra Sai University of Technology (VSSUT), Burla, Odisha, India.
3Sipak Ku. Mishra, B. Tech. student in Dept. of Computer Science and Engineering, Veer Surendra Sai University of Technology (VSSUT), Burla, Odisha, India.

Manuscript received on April 15, 2012. | Revised Manuscript received on April 20, 2012. | Manuscript published on May 05, 2012. | PP: 121-126 | Volume-2 Issue-2, May 2012. | Retrieval Number: B0533042212/2012©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Day-to-day computation produces ever larger data sets and data objects, so clustering the data has become important as a way to reduce complexity. K-means is an efficient clustering algorithm, but its effectiveness is lost as the dimension of the data set grows. Principal Component Analysis (PCA) is used with k-means to counter this dimensionality problem. However, k-means with PCA alone does not provide much optimisation. Experiments show that optimising k-means gives more accurate results. In this paper we therefore propose a k-means algorithm optimised with Particle Swarm Optimisation (PSO) and combined with improved PCA for clustering high-dimensional data sets.

Keywords: Data mining, Clustering, Principal Component Analysis, Centred vector, Squared Sum Error, Lower bound, Bound Error, Particle Swarm Optimisation.
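
The pipeline outlined in the abstract (reduce dimensionality with PCA, then cluster with a PSO-optimised k-means) can be illustrated with a minimal sketch. The abstract does not specify the improved PCA or the exact PSO formulation, so the code below is an assumption throughout: it uses plain PCA via SVD, a generic global-best PSO that searches for initial centroids by minimising the squared sum error, and standard Lloyd iterations for k-means. All function names and parameter values (pca_reduce, pso_centroids, w, c1, c2, and so on) are hypothetical and are not taken from the paper.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Plain PCA via SVD (not the paper's improved PCA): project onto top components."""
    Xc = X - X.mean(axis=0)                      # centre the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def sq_dists(X, centroids):
    """Squared Euclidean distance from every point to every centroid."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

def sse(X, centroids):
    """Squared sum error: each point is charged to its nearest centroid."""
    return sq_dists(X, centroids).min(axis=1).sum()

def kmeans(X, centroids, n_iter=20):
    """Standard Lloyd iterations starting from the given centroids."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        labels = sq_dists(X, centroids).argmin(axis=1)
        for j in range(len(centroids)):
            if np.any(labels == j):              # keep old centroid if cluster is empty
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, sq_dists(X, centroids).argmin(axis=1)

def pso_centroids(X, k, n_particles=10, n_iter=30, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic global-best PSO (assumed variant); a particle encodes k centroids."""
    rng = np.random.default_rng(seed)
    # Initialise each particle with k distinct data points as candidate centroids.
    pos = np.stack([X[rng.choice(len(X), k, replace=False)]
                    for _ in range(n_particles)])
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([sse(X, p) for p in pos])
    g = pbest_cost.argmin()
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([sse(X, p) for p in pos])
        improved = cost < pbest_cost             # update personal bests
        pbest[improved] = pos[improved]
        pbest_cost[improved] = cost[improved]
        g = pbest_cost.argmin()
        if pbest_cost[g] < gbest_cost:           # update global best
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    return gbest

if __name__ == "__main__":
    # Synthetic data (illustrative only): three Gaussian blobs in 20 dimensions.
    rng = np.random.default_rng(1)
    centers = rng.normal(scale=10.0, size=(3, 20))
    X = np.vstack([c + rng.normal(size=(50, 20)) for c in centers])
    Z = pca_reduce(X, 2)                         # step 1: reduce dimensionality
    init = pso_centroids(Z, 3)                   # step 2: PSO picks starting centroids
    cents, labels = kmeans(Z, init)              # step 3: k-means refines them
    print("SSE after PSO:", sse(Z, init), "after k-means:", sse(Z, cents))
```

In this sketch PSO is used only to choose the starting centroids that k-means then refines. Other arrangements, such as running PSO after k-means or interleaving the two, are equally compatible with the abstract.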