Research Article

A Fast Deterministic Kmeans Initialization

by Omar Kettani, Faical Ramdani
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 12 - Number 2
Year of Publication: 2017
Authors: Omar Kettani, Faical Ramdani
10.5120/ijais2017451683

Omar Kettani, Faical Ramdani. A Fast Deterministic Kmeans Initialization. International Journal of Applied Information Systems 12, 2 (May 2017), 6-11. DOI=10.5120/ijais2017451683

@article{ 10.5120/ijais2017451683,
author = { Omar Kettani, Faical Ramdani },
title = { A Fast Deterministic Kmeans Initialization },
journal = { International Journal of Applied Information Systems },
issue_date = { May 2017 },
volume = { 12 },
number = { 2 },
month = { May },
year = { 2017 },
issn = { 2249-0868 },
pages = { 6-11 },
numpages = { 6 },
url = { https://www.ijais.org/archives/volume12/number2/984-2017451683/ },
doi = { 10.5120/ijais2017451683 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%A Omar Kettani
%A Faical Ramdani
%T A Fast Deterministic Kmeans Initialization
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 12
%N 2
%P 6-11
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The k-means algorithm remains one of the most widely used clustering methods, despite its sensitivity to initial settings. This paper explores a simple, computationally inexpensive, deterministic method that provides k-means with initial seeds for clustering a given data set. The method partitions the data set into k equal parts and uses the mean of each part as a seed. We test and compare this method against the related, well-known KKZ initialization algorithm for k-means on both simulated and real data, and find it to be more efficient in many cases.
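The abstract describes the proposed initialization (means of k equal parts of the data set) and compares it with KKZ. A minimal NumPy sketch of both ideas is below; the exact partitioning order used in the paper is not specified in the abstract, so the contiguous split here is an assumption, and `kkz_init` follows the standard description of KKZ from reference [7] rather than the paper's implementation.

```python
import numpy as np

def deterministic_means_init(X, k):
    """Sketch of the abstract's idea: split the data set into k
    near-equal parts and use the mean of each part as an initial
    k-means seed. (Interpretation of the abstract; the paper may
    order or partition the points differently.)"""
    parts = np.array_split(X, k)  # k near-equal contiguous parts along axis 0
    return np.vstack([p.mean(axis=0) for p in parts])

def kkz_init(X, k):
    """KKZ initialization as commonly described (Katsavounidis et al.,
    1994): start from the point with the largest norm, then repeatedly
    add the point farthest from its nearest already-chosen seed."""
    seeds = [X[np.argmax(np.linalg.norm(X, axis=1))]]
    for _ in range(k - 1):
        # distance from every point to its nearest chosen seed
        d = np.min(
            np.linalg.norm(X[:, None, :] - np.asarray(seeds)[None, :, :], axis=2),
            axis=1,
        )
        seeds.append(X[np.argmax(d)])
    return np.asarray(seeds)

# toy usage
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
print(deterministic_means_init(X, 3).shape)  # (3, 2)
print(kkz_init(X, 3).shape)                  # (3, 2)
```

Both routines are deterministic given a fixed data ordering, which is the property the paper emphasizes over randomized seeders such as Forgy or k-means++.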

References
  1. Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75, 245-249 (2009).
  2. Lloyd, S. P.: Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2), 129-137 (1982). doi:10.1109/TIT.1982.1056489.
  3. Peña, J. M., Lozano, J. A., Larrañaga, P.: An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recognition Letters, 20(10), 1027-1040 (1999).
  4. Forgy, E.: Cluster analysis of multivariate data: efficiency vs. interpretability of classifications. Biometrics, 21, 768-769 (1965).
  5. Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027-1035 (2007).
  6. Bahmani, B., Moseley, B., Vattani, A., Kumar, R., Vassilvitskii, S.: Scalable k-means++. In: Proceedings of the VLDB Endowment (2012).
  7. Katsavounidis, I., Kuo, C.-C. J., Zhang, Z.: A new initialization technique for generalized Lloyd iteration. IEEE Signal Processing Letters, 1(10), 144-146 (1994).
  8. Asuncion, A., Newman, D. J.: UCI Machine Learning Repository [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, School of Information and Computer Science (2007).
  9. Kaufman, L., Rousseeuw, P. J.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley (1990).
Index Terms

Computer Science
Information Sciences

Keywords

k-means initialization, KKZ