An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality

Amanpreet Kaur Toor; Amarpreet Singh

Research Article

International Journal of Applied Information Systems

Foundation of Computer Science (FCS), NY, USA

Volume 7 - Number 2

Year of Publication: 2014

Authors: Amanpreet Kaur Toor, Amarpreet Singh

10.5120/ijais14-451136

Amanpreet Kaur Toor, Amarpreet Singh. An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality. International Journal of Applied Information Systems. 7, 2 (April 2014), 5-9. DOI=10.5120/ijais14-451136

@article{ 10.5120/ijais14-451136,

author = { Amanpreet Kaur Toor and Amarpreet Singh },

title = { An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality },

journal = { International Journal of Applied Information Systems },

issue_date = { April 2014 },

volume = { 7 },

number = { 2 },

month = { April },

year = { 2014 },

issn = { 2249-0868 },

pages = { 5-9 },

numpages = {5},

url = { https://www.ijais.org/archives/volume7/number2/618-1136/ },

doi = { 10.5120/ijais14-451136 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2023-07-05T18:54:36.461349+05:30

%A Amanpreet Kaur Toor

%A Amarpreet Singh

%T An Advanced Clustering Algorithm (ACA) for Clustering Large Data Set to Achieve High Dimensionality

%J International Journal of Applied Information Systems

%@ 2249-0868

%V 7

%N 2

%P 5-9

%D 2014

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Cluster analysis is one of the critical methods in data mining, and the choice of clustering algorithm directly affects the clustering results. This paper proposes an Advanced Clustering Algorithm (ACA) to address the concerns of high dimensionality and large data sets [1]. ACA avoids recomputing the distance from each data object to the clusters in every iteration, which saves execution time. It requires only a simple data structure to store information in each iteration for use in the next iteration. Experimental results show that ACA effectively improves clustering speed and accuracy while reducing computational complexity relative to the traditional Kohonen SOM algorithm. This paper presents ACA and its simulated experimental results on different data sets.
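
The abstract does not spell out the steps of ACA, so the following is only a minimal sketch of the caching idea it describes, assuming a k-means-style loop in the spirit of the enhanced k-means work cited in the references. The function name `cached_kmeans`, its parameters, and the shortcut rule are illustrative assumptions, not the authors' implementation. Each point keeps its cluster label and its distance to that cluster's center from the previous iteration; if the point has not moved farther from its own center, the comparison against all other centers is skipped.

```python
import math


def cached_kmeans(points, init_centers, max_iter=100):
    """K-means-style loop that caches each point's label and distance.

    Assumption: this illustrates the "store information for the next
    iteration" idea only; it is not the paper's ACA. The shortcut is a
    heuristic and may differ from plain k-means on some inputs.
    Returns (centers, labels, full_scans), where full_scans counts how
    often a point had to be compared against every center.
    """
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = [-1] * len(points)        # cached cluster index per point
    cached = [math.inf] * len(points)  # cached distance to own center
    full_scans = 0

    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            if labels[i] >= 0:
                d_own = math.dist(p, centers[labels[i]])
                if d_own <= cached[i]:
                    # Not farther from its own center than last time:
                    # keep the label and skip the other k-1 distances.
                    cached[i] = d_own
                    continue
            # Otherwise fall back to comparing against all centers.
            dists = [math.dist(p, c) for c in centers]
            full_scans += 1
            best = min(range(k), key=dists.__getitem__)
            changed = changed or best != labels[i]
            labels[i] = best
            cached[i] = dists[best]
        # Move each center to the mean of its current members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if not changed:
            break
    return centers, labels, full_scans
```

Calling `cached_kmeans(points, init_centers)` with a list of coordinate tuples and `k` starting centers returns the final centers, the per-point labels, and the number of full scans. The saving comes from the two cached lists, which cost only two values per point; a plain k-means loop would instead compute `k` distances for every point in every iteration.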

References

Yuan F, Meng Z. H, Zhang H. X and Dong C. R, "A New Algorithm to Get the Initial Centroids," Proc. of the 3rd International Conference on Machine Learning and Cybernetics, pp. 26–29, August 2004.
Sun Jigui, Liu Jie, Zhao Lianyu, "Clustering algorithms Research," Journal of Software, Vol. 19, No. 1, pp. 48-61, January 2008.
Amanpreet Kaur Toor, Amarpreet Singh, "Analysis of Clustering Algorithm based on Number of Clusters, error rate, Computation Time and Map Topology on large Data Set," International Journal of Emerging Trends & Technology in Computer Science (IJETTCS), Volume 2, Issue 6, November-December 2013.
Amanpreet Kaur Toor, Amarpreet Singh, "A Survey paper on recent clustering approaches in data mining," International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 3, Issue 11, November 2013.
Sun Shibao, Qin Keyun, "Research on Modified K-means Data Cluster Algorithm," Computer Engineering, Vol. 33, No. 13, pp. 200–201, July 2007.
Merz C and Murphy P, UCI Repository of Machine Learning Databases, Available: ftp://ftp.ics.uci.edu/pub/machine-learning-databases
Fahim A M, Salem A M, Torkey F A, "An efficient enhanced k-means clustering algorithm," Journal of Zhejiang University Science A, Vol. 10, pp. 1626-1633, July 2006.
Zhao YC, Song J. GDILC: A grid-based density isoline clustering algorithm. In: Zhong YX, Cui S, Yang Y, eds. Proc. of the Internet Conf. on Info-Net. Beijing: IEEE Press, 2001. 140–145. http://ieeexplore.ieee.org/iel5/7719/21161/00982709.pdf
Huang Z, "Extensions to the k-means algorithm for clustering large data sets with categorical values," Data Mining and Knowledge Discovery, Vol. 2, pp. 283–304, 1998.
K. A. Abdul Nazeer, M. P. Sebastian, "Improving the Accuracy and Efficiency of the k-means Clustering Algorithm," Proceedings of the World Congress on Engineering, Vol. 1, London, July 2009.
Fred ALN, Leitão JMN. Partitional vs hierarchical clustering using a minimum grammar complexity approach. In: Proc. of the SSPR & SPR 2000. LNCS 1876, 2000. 193–202. http://www.sigmod.org/dblp/db/conf/sspr/sspr2000.htm
Gelbard R, Spiegler I. Hempel's raven paradox: A positive approach to cluster analysis. Computers and Operations Research, 2000, 27(4): 305–320.
Huang Z. A fast clustering algorithm to cluster very large categorical data sets in data mining. In: Proc. of the SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery. Tucson, 1997. 146–151.
Ding C, He X. K-Nearest-Neighbor in data clustering: Incorporating local information into global optimization. In: Proc. of the ACM Symp. on Applied Computing. Nicosia: ACM Press, 2004. 584–589. http://www.acm.org/conferences/sac/sac2004/
Hinneburg A, Keim D. An efficient approach to clustering in large multimedia databases with noise. In: Agrawal R, Stolorz PE, Piatetsky-Shapiro G, eds. Proc. of the 4th Int'l Conf. on Knowledge Discovery and Data Mining (KDD'98). New York: AAAI Press, 1998. 58–65.
Zhang T, Ramakrishnan R, Livny M. BIRCH: An efficient data clustering method for very large databases. In: Jagadish HV, Mumick IS, eds. Proc. of the 1996 ACM SIGMOD Int'l Conf. on Management of Data. Montreal: ACM Press, 1996. 103–114.
Birant D, Kut A. ST-DBSCAN: An algorithm for clustering spatial-temporal data. Data & Knowledge Engineering, 2007, 60(1): 208-221.

Index Terms

Computer Science

Information Sciences

Keywords

ACA, SOM, Clustering, Large Data Set, High Dimensionality, Cluster Analysis