International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 2 - Number 3
Year of Publication: 2012
Authors: Madhusmita Mishra, H.s. Behera |
DOI: 10.5120/ijais12-450310
Madhusmita Mishra, H.s. Behera. Kohonen Self Organizing Map with Modified K-means clustering For High Dimensional Data Set. International Journal of Applied Information Systems. 2, 3 (May 2012), 34-39. DOI=10.5120/ijais12-450310
It is remarkable how well the K-Means algorithm has survived over the years since it was first proposed. It remains one of the best-known algorithms for data clustering in the field of data mining. New clustering algorithms appear continually, but none is as fast and accurate as K-Means. Despite its speed, accuracy, and simplicity, K-Means has several problems of its own. First, the exact number of clusters is not known prior to clustering. Second, the algorithm is quite sensitive to the initial centroids. Third, K-Means fails to give optimal results on high-dimensional data sets, because the clustering problem becomes more complex as the number of dimensions grows. In data mining this problem is known as the "curse of dimensionality". In this paper we propose a new Modified K-Means algorithm that overcomes these problems of the standard K-Means algorithm. We use a Kohonen Self Organizing Map (KSOM) to visualize the exact number of clusters before clustering, and we apply a genetic algorithm for initialization. The KSOM with Modified K-Means algorithm is tested on the Iris data set and compared with other clustering algorithms. It is found to be more accurate, with fewer classification and quantization errors, and it can be applied even to high-dimensional data sets.
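The sensitivity of standard K-Means to its initial centroids can be illustrated with a minimal sketch. This is plain Lloyd-style K-Means on a hand-made toy data set, not the Modified K-Means proposed in the paper; neither the KSOM step nor the genetic algorithm is implemented here. The function names `kmeans` and `quantization_error`, the toy points, and the definition of quantization error as the mean distance from each point to its assigned centroid are all assumptions made for illustration.

```python
import numpy as np

def kmeans(X, init_centroids, n_iter=100):
    """Plain K-Means (Lloyd's iteration) starting from the given centroids."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Assign every point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points; empty clusters stay put.
        new_centroids = centroids.copy()
        for k in range(len(centroids)):
            members = X[labels == k]
            if len(members) > 0:
                new_centroids[k] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Recompute labels so they always match the returned centroids.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dists.argmin(axis=1)

def quantization_error(X, centroids, labels):
    """Mean distance from each point to its assigned centroid (assumed definition)."""
    return float(np.linalg.norm(X - centroids[labels], axis=1).mean())

# Toy data (hypothetical): three well-separated groups of four points each.
X = np.array([
    [0, 0], [0, 1], [1, 0], [1, 1],          # group A
    [10, 0], [10, 1], [11, 0], [11, 1],      # group B
    [5, 10], [5, 11], [6, 10], [6, 11],      # group C
], dtype=float)

# One starting centroid per group.
good_init = [[0.5, 0.5], [10.5, 0.5], [5.5, 10.5]]
# Two starting centroids inside group A, one between groups B and C.
bad_init = [[0.0, 0.5], [1.0, 0.5], [8.0, 5.0]]

good_qe = quantization_error(X, *kmeans(X, good_init))
bad_qe = quantization_error(X, *kmeans(X, bad_init))
print(f"quantization error, good start: {good_qe:.3f}")
print(f"quantization error, bad start:  {bad_qe:.3f}")
```

With the second starting configuration the iteration settles in a local optimum that splits group A in two and merges groups B and C, so its quantization error is several times larger than with the first. This is the kind of failure that a more careful initialization step is meant to avoid.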