International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 9 - Number 4
Year of Publication: 2015
Authors: Odunayo Esther Oduntan, Ibrahim Adepoju Adeyanju, Stephen Olatunde Olabiyisi, Elijah Olusayo Omidiora
DOI: 10.5120/ijais15-451394
Odunayo Esther Oduntan, Ibrahim Adepoju Adeyanju, Stephen Olatunde Olabiyisi, Elijah Olusayo Omidiora. Evaluation of N-gram Text Representations for Automated Essay-Type Grading Systems. International Journal of Applied Information Systems 9, 4 (July 2015), 25-31. DOI=10.5120/ijais15-451394
Automated grading systems can reduce the stress and time constraints faced by examiners, especially where large numbers of students are enrolled. Essay-type grading involves comparing the textual content of a student's script with the examiner's marking guide. In this paper, we focus on analyzing the n-gram text representations used in an automated essay-type grading system. Each question answered in a student script or in the marking guide is viewed as a document in the document-term matrix. Three n-gram representation schemes were used to denote a term, namely unigram (1-gram), bigram (2-gram) and both ("unigram + bigram"). A binary weighting scheme was used for each document vector, with cosine similarity to compare documents across the student scripts and marking guide. The final student score is computed as a weighted aggregate of the documents' similarity scores, as determined by the marks allocated to each question in the marking guide. Our experiment compared the effectiveness of the three representation schemes using electronically transcribed handwritten students' scripts and a marking guide from a first-year computer science course of a Nigerian Polytechnic. The machine-generated scores were then compared with those provided by the Examiner for the same scripts using the mean absolute error and the Pearson correlation coefficient. Experimental results indicate that the "unigram + bigram" representation outperformed the other two, with a mean absolute error of 7.6 as opposed to 15.8 and 10.6 for the unigram and bigram representations respectively. These results are reinforced by the correlation coefficients, with the "unigram + bigram" representation having 0.3 while the unigram and bigram representations had 0.2 and 0.1 respectively. The weak but positive correlation indicates that the Examiner might have considered other factors not necessarily documented in the marking guide.
We intend to test other datasets and apply techniques for reducing sparseness in our document term matrices to improve performance.
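The pipeline described in the abstract (binary n-gram document vectors, cosine similarity per question, mark-weighted aggregation, and evaluation via mean absolute error and Pearson correlation) can be sketched as follows. This is a minimal illustration of the general technique, not the authors' implementation; the function names, the "unigram+bigram" scheme selector, and the toy tokenization (lowercased whitespace splitting) are assumptions for the sketch.

```python
import math

def ngrams(text, n):
    """Word-level n-grams of a text, as a set (binary weighting)."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def terms(text, scheme):
    """Term set for a document under one of the three schemes."""
    if scheme == "unigram":
        return ngrams(text, 1)
    if scheme == "bigram":
        return ngrams(text, 2)
    return ngrams(text, 1) | ngrams(text, 2)  # "unigram+bigram"

def cosine_similarity(a, b):
    """Cosine similarity between two binary term vectors (sets)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def grade(script, guide, marks, scheme="unigram+bigram"):
    """Weighted aggregate of per-question similarity scores.

    script/guide map question IDs to answer text; marks maps
    question IDs to the marks allocated in the marking guide.
    """
    return sum(
        cosine_similarity(terms(script.get(q, ""), scheme),
                          terms(guide[q], scheme)) * marks[q]
        for q in guide
    )

def mean_absolute_error(machine, examiner):
    """MAE between machine-generated and examiner scores."""
    return sum(abs(m - e) for m, e in zip(machine, examiner)) / len(machine)

def pearson(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0
```

For example, `grade({"Q1": "a stack is last in first out"}, {"Q1": "a stack is a last in first out structure"}, {"Q1": 10})` yields a score between 0 and 10 proportional to the n-gram overlap. A real system would additionally normalize the text (punctuation, stemming) and, as the paper notes, address sparseness in the document-term matrices.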