Research Article

Query Word Image based Retrieval Scheme for Handwritten Tamil Documents

by AN. Sigappi, S. Palanivel
International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 1 - Number 1
Year of Publication: 2012
Authors: AN. Sigappi, S. Palanivel
DOI: 10.5120/ijais12-450717

AN. Sigappi, S. Palanivel. Query Word Image based Retrieval Scheme for Handwritten Tamil Documents. International Journal of Applied Information Systems. 1, 1 (November 2012), 1-5. DOI=10.5120/ijais12-450717

@article{ 10.5120/ijais12-450717,
author = { AN. Sigappi and S. Palanivel },
title = { Query Word Image based Retrieval Scheme for Handwritten Tamil Documents },
journal = { International Journal of Applied Information Systems },
issue_date = { November 2012 },
volume = { 1 },
number = { 1 },
month = { November },
year = { 2012 },
issn = { 2249-0868 },
pages = { 1-5 },
numpages = {9},
url = { https://www.ijais.org/archives/volume1/number1/306-0717/ },
doi = { 10.5120/ijais12-450717 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2023-07-05T10:40:56.401536+05:30
%A AN. Sigappi
%A S. Palanivel
%T Query Word Image based Retrieval Scheme for Handwritten Tamil Documents
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 1
%N 1
%P 1-5
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper presents an autoassociative neural network (AANN) based information retrieval mechanism that locates handwritten documents in a Tamil literary collection corresponding to query word images. The approach builds models for the chosen search word images, identifies the search word, and then retrieves the relevant documents. Each AANN is trained with an appropriate combination of units in the layers of the network to arrive at a suitable model for each word in the vocabulary. The training phase segments the digitized text documents into lines and words, extracts profile and moment based features from the words, and builds an index of words. The features, computed from the intensity values of the pixels, capture the nuances of the strokes in the characters. Experimental results obtained for an index of words demonstrate the effectiveness of the scheme and its retrieval accuracy.
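
The abstract names three preprocessing steps: line segmentation, profile features, and moment based features. The sketch below shows one common way such steps are done, and it is not the authors' implementation. It assumes a binarized image stored as nested lists (1 = ink, 0 = background), whereas the paper computes its features from pixel intensity values. The function names `segment_lines`, `upper_lower_profiles`, and `central_moment` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Assumed convention: binary image as rows of 0/1, where 1 marks an ink pixel.
Image = List[List[int]]


def segment_lines(image: Image) -> List[Tuple[int, int]]:
    """Return (start_row, end_row) pairs, end exclusive, one per run of inked rows."""
    lines = []
    start = None
    for r, row in enumerate(image):
        has_ink = sum(row) > 0  # horizontal projection of this row
        if has_ink and start is None:
            start = r
        elif not has_ink and start is not None:
            lines.append((start, r))
            start = None
    if start is not None:
        lines.append((start, len(image)))
    return lines


def upper_lower_profiles(word: Image) -> Tuple[List[int], List[int]]:
    """Per column: distance from the top edge to the first ink pixel, and
    from the bottom edge to the last ink pixel. Empty columns get the full height."""
    height = len(word)
    width = len(word[0]) if word else 0
    upper, lower = [], []
    for c in range(width):
        ink_rows = [r for r in range(height) if word[r][c]]
        if ink_rows:
            upper.append(ink_rows[0])
            lower.append(height - 1 - ink_rows[-1])
        else:
            upper.append(height)
            lower.append(height)
    return upper, lower


def central_moment(word: Image, p: int, q: int) -> float:
    """Central moment mu_pq of the ink pixels, taken about their centroid."""
    points = [(x, y) for y, row in enumerate(word) for x, v in enumerate(row) if v]
    if not points:
        return 0.0
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum((x - cx) ** p * (y - cy) ** q for x, y in points)
```

Applying the same projection idea to columns within a line would split it into words. The resulting profile and moment values could then be concatenated into one feature vector per word image before any model is trained.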

Index Terms

Computer Science
Information Sciences

Keywords

Segmentation, profile features, moment based features, autoassociative neural networks