Although they were originally developed for prediction purposes, Random Forests (RFs) [1] have become a popular tool for assessing the relevance of predictor variables in predicting an outcome. Rather than applying an RF merely as a black-box prediction algorithm, so-called variable importance measures have been proposed and implemented to obtain an importance ranking of the predictors in fitted RFs, or to identify (or recursively select) a set of important predictors (i.e., variable selection). This article mainly focuses on identifying and ranking the predictors that contribute to the prediction accuracy of a fitted RF, in the spirit of interpretable machine learning. However, the methods discussed below can in principle also be applied in variable selection algorithms.
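As a concrete illustration of such an importance ranking (a minimal sketch, not the article's own code), the permutation importance of predictors in a fitted RF can be computed with scikit-learn; the simulated data and all parameter choices below are assumptions for the example only.

```python
# Sketch: ranking predictors in a fitted random forest by permutation
# importance, i.e., the drop in accuracy when a predictor is shuffled.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Simulated data; with shuffle=False the first 3 columns carry the signal.
X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           shuffle=False, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Permute each predictor 20 times and average the accuracy decrease.
result = permutation_importance(rf, X, y, n_repeats=20, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]
for idx in ranking[:5]:
    print(f"predictor {idx}: importance {result.importances_mean[idx]:.3f}")
```

Note that permutation importance is only one of several measures discussed in this context; impurity-based importances (e.g., `feature_importances_` in scikit-learn) are a common alternative with different statistical properties.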