International Journal of Applied Information Systems
Foundation of Computer Science (FCS), NY, USA
Volume 5 - Number 2
Year of Publication: 2013
Authors: Shanthini. A, Chandrasekaran. R. M
10.5120/ijais12-450853
Shanthini. A, Chandrasekaran. R. M. Effect of Ensemble Methods for Software Fault Prediction at Various Metrics Level. International Journal of Applied Information Systems. 5, 2 (January 2013), 51-55. DOI=10.5120/ijais12-450853
Defective modules pose a considerable risk to a software project: they reduce software quality, decrease customer satisfaction, and increase development and maintenance costs. It is therefore essential to predict defective modules at an early stage of the software development life cycle, so as to improve developers' ability to focus on software quality. Many researchers have investigated software defect prediction using machine learning algorithms and concluded that an ensemble of classifiers can improve classification performance over a single classifier. This paper addresses software fault prediction using ensemble approaches. We conduct a comparative study, using the WEKA tool, of various ensemble methods from a taxonomy perspective. The ensemble methods include Bagging, Boosting, Stacking, and Voting. We also compare these ensemble methods at three different levels of software metrics (class level, method level, and package level). The experiments were carried out on the NASA KC1 method-level dataset, the NASA KC1 class-level dataset, and the Eclipse dataset for package-level metrics, applying the ensemble classification methods to each dataset to predict defects. The results show that Bagging performs better than the other ensemble methods on the method-level and package-level datasets. On the class-level dataset, Voting performs better in terms of area under the ROC curve (AUC).
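As an illustration of the evaluation measure named above, the following minimal sketch computes AUC from its rank interpretation and combines two base classifiers by averaging their scores, which is one common voting rule. It is not the WEKA setup used in the study: the function names `auc_roc` and `soft_vote`, the labels, and the scores are all hypothetical toy values chosen only to show the mechanics, and they say nothing about the reported results.

```python
from itertools import product


def auc_roc(labels, scores):
    """AUC as the fraction of (defective, clean) pairs in which the
    defective module receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def soft_vote(score_lists):
    """Voting by averaging the per-module scores of the base classifiers
    (an assumed combination rule, picked for simplicity)."""
    return [sum(col) / len(col) for col in zip(*score_lists)]


# Hypothetical toy data: 1 = defective module, 0 = clean module.
labels = [1, 0, 1, 0, 1, 0]
clf_a = [0.9, 0.4, 0.3, 0.2, 0.8, 0.6]  # scores from base classifier A
clf_b = [0.6, 0.7, 0.9, 0.1, 0.5, 0.3]  # scores from base classifier B

ensemble = soft_vote([clf_a, clf_b])

print("AUC of A:       ", round(auc_roc(labels, clf_a), 3))
print("AUC of B:       ", round(auc_roc(labels, clf_b), 3))
print("AUC of ensemble:", round(auc_roc(labels, ensemble), 3))
```

On this toy data each base classifier misranks two of the nine defective/clean pairs, while the averaged scores rank every pair correctly. The example is constructed so that the errors of the two classifiers do not overlap; real datasets give no such guarantee.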