[1911.01914v1] A Comparative Analysis of XGBoost
The results of this study show that, in terms of the number of investigated problems on which it achieved the best performance, the most accurate classifier was gradient boosting.
Abstract: XGBoost is a scalable ensemble technique based on gradient boosting that has proven to be a reliable and efficient solver of machine learning challenges. This work presents a practical analysis of how this technique behaves in terms of training speed, generalization performance and parameter setup. In addition, a comprehensive comparison between XGBoost, random forests and gradient boosting has been performed, both with carefully tuned models and with the default settings. The results of this comparison indicate that XGBoost is not necessarily the best choice under all circumstances. Finally, an extensive analysis of the XGBoost parameter tuning process is carried out.
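The abstract describes comparing ensembles both with default settings and with tuned models. A minimal sketch of the default-settings side of such a comparison, using scikit-learn's implementations (the paper itself uses the XGBoost library, whose sklearn-compatible `XGBClassifier` could be slotted into the same loop; the dataset and cross-validation setup below are illustrative choices, not the paper's benchmark):

```python
# Hedged sketch: out-of-the-box comparison of ensemble classifiers via
# cross-validation. GradientBoostingClassifier and RandomForestClassifier are
# scikit-learn's implementations; xgboost.XGBClassifier would plug into the
# same dict. Dataset and cv=5 are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Mean cross-validated accuracy per model, all hyperparameters at defaults.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

The paper's tuned-models comparison would wrap each estimator in a hyperparameter search (e.g. `GridSearchCV`) before scoring.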
Figure 1: Average ranks (a higher rank is better) for the tested methods across 28 datasets (critical difference CD = 1.42). [Results]
Figure 2: Average ranks (a higher rank is better) for different XGBoost configurations (critical difference CD = 1.15). [Analysis of XGBoost parametrization]
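The figure captions report average ranks together with a Nemenyi-style critical difference CD. A short sketch of the underlying statistics: methods are ranked per dataset, ranks are averaged over the N datasets, and two methods differ significantly when their average ranks differ by more than CD = q_alpha * sqrt(k(k+1)/(6N)). The q_alpha constant below is the standard value for alpha = 0.05, and k = 6 compared configurations is an assumption chosen only to illustrate how a CD near 1.42 arises for N = 28; the paper's actual k is not stated in this excerpt.

```python
# Hedged sketch of average ranks and the Nemenyi critical difference.
import math

def average_ranks(scores_per_dataset):
    """scores_per_dataset: one score list per dataset (higher score is better).
    Returns each method's mean rank, with rank 1 = best on a dataset.
    (The figures plot inverted ranks, so there higher is better; ties are
    ignored here for brevity.)"""
    n_methods = len(scores_per_dataset[0])
    totals = [0.0] * n_methods
    for scores in scores_per_dataset:
        order = sorted(range(n_methods), key=lambda i: scores[i], reverse=True)
        for rank, i in enumerate(order, start=1):
            totals[i] += rank
    return [t / len(scores_per_dataset) for t in totals]

def nemenyi_cd(k, n_datasets, q_alpha):
    """Critical difference for k methods compared over n_datasets problems."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_datasets))

# q_0.05 = 2.850 for k = 6 (standard Nemenyi table); with N = 28 datasets
# this yields CD = 1.425, consistent with the CD = 1.42 in Figure 1 under the
# assumed k.
print(nemenyi_cd(k=6, n_datasets=28, q_alpha=2.850))  # → 1.425
```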