Research Article

MCAIM: Modified CAIM Discretization Algorithm for Classification

by Shivani V. Vora, R. G. Mehta
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 3 - Number 5
Year of Publication: 2012
DOI: 10.5120/ijais12-450542

Shivani V. Vora, R. G. Mehta. MCAIM: Modified CAIM Discretization Algorithm for Classification. International Journal of Applied Information Systems. 3, 5 (July 2012), 42-50. DOI=10.5120/ijais12-450542

@article{ 10.5120/ijais12-450542,
author = { Shivani V. Vora and R. G. Mehta },
title = { MCAIM: Modified CAIM Discretization Algorithm for Classification },
journal = { International Journal of Applied Information Systems },
issue_date = { July 2012 },
volume = { 3 },
number = { 5 },
month = { July },
year = { 2012 },
issn = { 2249-0868 },
pages = { 42-50 },
numpages = {9},
url = { https://www.ijais.org/archives/volume3/number5/230-0542/ },
doi = { 10.5120/ijais12-450542 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T10:45:48.846491+05:30
%A Shivani V. Vora
%A R. G. Mehta
%T MCAIM: Modified CAIM Discretization Algorithm for Classification
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 3
%N 5
%P 42-50
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Discretization is the process of dividing a continuous attribute into a finite set of intervals, producing an attribute with a small number of distinct values by associating a discrete numerical value with each generated interval. Discretization is usually performed prior to the learning process and plays an important role in data mining and knowledge discovery. The results of the CAIM discretization algorithm are not satisfactory in some cases, which led us to modify the algorithm. The results of the Modified CAIM (MCAIM) algorithm are compared with those of other discretization techniques in terms of classification accuracy, and MCAIM outperforms them. Because MCAIM generates a larger number of intervals, the CAIR criterion is used to merge intervals in MCAIM discretization. The resulting method gives better classification accuracy with a reduced number of intervals.
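
The abstract does not spell out the CAIM criterion or the MCAIM modification. As a rough illustration only, the sketch below discretizes a continuous attribute with a given set of cut points and scores the result with the standard CAIM criterion as defined by Kurgan and Cios (2004): the average, over intervals, of the squared count of the dominant class divided by the number of values in the interval. The function names `discretize` and `caim`, the choice of right-closed intervals, and the toy data are assumptions made for this example; this is not the authors' MCAIM procedure and it does not include the CAIR-based merging step.

```python
from bisect import bisect_left
from collections import Counter


def discretize(values, cut_points):
    """Map each continuous value to an interval index.

    cut_points are the sorted inner boundaries; interval r covers
    (cut_points[r-1], cut_points[r]], with open-ended first and last
    intervals. (Right-closed intervals are an assumption of this sketch.)
    """
    return [bisect_left(cut_points, v) for v in values]


def caim(values, labels, cut_points):
    """Standard CAIM criterion for one discretization scheme.

    For each interval, take the count of its most frequent class,
    square it, and divide by the interval's total count; then average
    over the number of intervals.
    """
    n_intervals = len(cut_points) + 1
    # One class-count column per interval (the "quanta matrix").
    quanta = [Counter() for _ in range(n_intervals)]
    for interval, label in zip(discretize(values, cut_points), labels):
        quanta[interval][label] += 1

    total = 0.0
    for column in quanta:
        interval_size = sum(column.values())
        if interval_size:  # skip empty intervals to avoid division by zero
            total += max(column.values()) ** 2 / interval_size
    return total / n_intervals


# Toy data: one continuous attribute with two classes.
values = [1, 2, 3, 4, 5, 6]
labels = ["a", "a", "a", "b", "b", "b"]

print(caim(values, labels, [3.5]))  # boundary that separates the classes
print(caim(values, labels, [2.5]))  # boundary that mixes the classes
```

On this toy data the boundary at 3.5 scores higher than the boundary at 2.5, which matches the intent of the criterion: schemes whose intervals are dominated by a single class receive larger values.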

Index Terms

Computer Science
Information Sciences

Keywords

Discretization, Class-attribute interdependency maximization, CAIM, MCAIM, CAIR