International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 7 - Number 2
Year of Publication: 2014
Authors: Tanu Verma, Renu, Deepti Gaur
DOI: 10.5120/ijais14-451139
Tanu Verma, Renu, Deepti Gaur. Tokenization and Filtering Process in RapidMiner. International Journal of Applied Information Systems. 7, 2 (April 2014), 16-18. DOI=10.5120/ijais14-451139
Text mining is a knowledge-intensive process in which a user interacts with a document collection. As in data mining [2,4,9], text mining seeks to extract useful information from data sources through the identification and exploration of interesting patterns. A key element of text mining is its focus on the document collection, which can be any grouping of text-based documents. Most text mining solutions aim to discover patterns across very large collections, ranging from many thousands to millions of documents. This paper shows how text mining is implemented in RapidMiner.