Improvement of Random Forest Ensembling Algorithm Efficiency Through Cardinal Tuning of n_estimators Parameter
Gaurav Sandhu, Amandeep Singh, Parmiderpal Singh Bedi, Puneet Singh Lamba and Gopal Chaudhary
Datasets today fall broadly into two types, classification and regression, and the choice of machine learning model depends on the dataset type. Bagging is an ensemble technique in which multiple models are trained in parallel and each produces predictions on the test dataset; these predictions are then combined by majority vote, and the majority output is taken as the final result. The bagging technique used in this study is the Random Forest (RF) algorithm, in which the individual models are decision trees. The RF algorithm depends on several parameters, such as n_estimators, max_samples, and max_features, to produce optimal output. Although RF produces favourable results with default parameter values, tuning these parameters is preferred to improve its efficiency. In this study, the n_estimators parameter is tuned on the basis of the length of the dataset to produce improved results compared with the default parameter value. The proposed tuning method is applied to the RF algorithm, and the improvement in class-prediction efficiency on different datasets is measured in terms of accuracy, precision, recall, and F1-score. ROC curves and AUC values are also computed to illustrate the improvement of the RF algorithm on the datasets under consideration, for instance an increase in AUC from 0.915 to 0.932 for the Spine (2 Classes) dataset and from 0.705 to 0.720 for Haberman's Survival dataset.
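As a rough illustration of the idea rather than the authors' exact procedure, the sketch below compares a default scikit-learn RandomForestClassifier against one whose n_estimators is derived from the training-set length, and reports the same metrics the study uses (accuracy, precision, recall, F1-score, AUC). The scaling rule n_estimators = len(X_train) and the stand-in dataset are assumptions; the abstract does not state the paper's cardinal-tuning formula or datasets in detail.

```python
# Hedged sketch: tuning n_estimators from the dataset length vs. the default.
# The rule n_estimators = len(X_train) is a placeholder assumption, not the
# paper's exact cardinal-tuning formula.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, roc_auc_score
from sklearn.model_selection import train_test_split

# Stand-in binary classification dataset (the paper uses Spine and Haberman's Survival).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Baseline: default parameters (n_estimators=100 in scikit-learn).
default_rf = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Tuned: n_estimators set from the length of the training set
# (illustrative assumption standing in for the proposed cardinal tuning).
tuned_rf = RandomForestClassifier(
    n_estimators=len(X_train), random_state=42
).fit(X_train, y_train)

for name, model in [("default", default_rf), ("tuned", tuned_rf)]:
    preds = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    print(f"--- {name} ---")
    print(classification_report(y_test, preds, digits=3))  # accuracy, precision, recall, F1
    print("AUC:", round(roc_auc_score(y_test, proba), 3))  # area under the ROC curve
```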
Keywords: Classification, regression, random forest algorithm, bagging, decision trees, n_estimators, max_samples, max_features, accuracy, precision, recall, F1-score, ROC, AUC