Plagiarism Index Estimation Algorithm: A Quantitative Approach

Monday O. Eze; Shakeel A. Kamboh

Research Article

International Journal of Applied Information Systems

Foundation of Computer Science (FCS), NY, USA

Volume 8 - Number 4

Year of Publication: 2015

Authors: Monday O. Eze, Shakeel A. Kamboh

10.5120/ijais15-451307

Monday O. Eze, Shakeel A. Kamboh. Plagiarism Index Estimation Algorithm: A Quantitative Approach. International Journal of Applied Information Systems. 8, 4 (February 2015), 36-46. DOI=10.5120/ijais15-451307

@article{ 10.5120/ijais15-451307,
  author = { Monday O. Eze and Shakeel A. Kamboh },
  title = { Plagiarism Index Estimation Algorithm: A Quantitative Approach },
  journal = { International Journal of Applied Information Systems },
  issue_date = { February 2015 },
  volume = { 8 },
  number = { 4 },
  month = { February },
  year = { 2015 },
  issn = { 2249-0868 },
  pages = { 36-46 },
  numpages = {9},
  url = { https://www.ijais.org/archives/volume8/number4/723-1307/ },
  doi = { 10.5120/ijais15-451307 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2023-07-05T18:59:00.778129+05:30
%A Monday O. Eze
%A Shakeel A. Kamboh
%T Plagiarism Index Estimation Algorithm: A Quantitative Approach
%J International Journal of Applied Information Systems
%@ 2249-0868
%V 8
%N 4
%P 36-46
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Plagiarism remains a serious problem, especially in academia. It is a form of intellectual theft, since it gives credit for scientific innovations to those who do not merit it. Researchers have made a number of efforts to tackle plagiarism. However, one perceived research gap is the need for verifiable computational techniques that detect and quantify the degree of plagiarism in digitized documents. This research addresses that gap with a specialized plagiarism detection and quantification algorithm. The algorithm begins with a bi-partitioned search operation known as F-Search. A purge operation then excludes the plagiarized sections discovered during this initial pass, giving rise to a fresh search space. The resulting search space is passed through a more thorough search operation known as T-Search. At this stage, the algorithm deals with plagiarism-hiding tricks termed whitespace flooding. The final output is a statistic known as the Plagiarism Index, a numeric value in the range [0, 1] that estimates the degree of plagiarism. The scope of this research is limited to the text domain. Each experimental dataset consists of two documents, one taken as the original and the other as a plagiarized copy. The system is designed and implemented in MATLAB.
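The abstract names the stages of the algorithm but not their internals, so the following is only a minimal sketch of one possible reading, not the authors' method. It is written in Python for illustration, although the paper reports a MATLAB implementation. Everything in it is an assumption: the function names (`split_sentences`, `squeeze`, `plagiarism_index`), matching at sentence level, treating "bi-partitioned" as searching the two halves of the suspect document separately, treating "whitespace flooding" as padding copied text with extra whitespace, and defining the index as the share of suspect words that were flagged.

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?'; spacing inside a sentence is kept."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def squeeze(text):
    """Collapse each run of whitespace to one space and lowercase the result."""
    return re.sub(r"\s+", " ", text).strip().lower()


def plagiarism_index(original, suspect):
    """Return a value in [0, 1]: the share of suspect words that were flagged."""
    sentences = split_sentences(suspect)
    total_words = sum(len(s.split()) for s in sentences)
    if total_words == 0:
        return 0.0

    # Coarse pass (assumed stand-in for F-Search): search the two halves of
    # the suspect text separately for sentences copied verbatim.
    mid = len(sentences) // 2
    flagged = set()
    for offset, half in ((0, sentences[:mid]), (mid, sentences[mid:])):
        for i, sentence in enumerate(half):
            if sentence in original:
                flagged.add(offset + i)

    # Purge: exclude what the first pass found, leaving a smaller search space.
    remaining = [i for i in range(len(sentences)) if i not in flagged]

    # Thorough pass (assumed stand-in for T-Search): compare squeezed text so
    # that copies padded with extra whitespace are still matched.
    squeezed_original = squeeze(original)
    for i in remaining:
        if squeeze(sentences[i]) in squeezed_original:
            flagged.add(i)

    copied_words = sum(len(sentences[i].split()) for i in flagged)
    return copied_words / total_words


if __name__ == "__main__":
    original = "Plagiarism is theft. It harms science. Detection is hard."
    suspect = "Plagiarism is theft. It    harms   science. I wrote this sentence myself."
    print(plagiarism_index(original, suspect))
```

In the example, the first suspect sentence is caught by the coarse pass and the whitespace-padded second sentence only by the thorough pass, so 6 of the 11 suspect words are flagged and the index is 6/11. Substring matching is deliberately naive here; a real system would need a more robust comparison.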

Index Terms

Computer Science

Information Sciences

Keywords

Plagiarism Index, Cell Array, Plagiarism Quantification, Bi-Partition