Published in:The Essentials of Machine Learning in Finance and Accounting, Routledge, Taylor & Francis
Abstract:Non-parametric methods have been proposed to capture complex relationships between variables in high-dimensional datasets with the aim of classifying or predicting variables of interest. Among non-parametric methods, decision trees and random forests became popular because they do not require any assumptions regarding the distribution of the dependent variable, the explanatory variables, and the functional form of the relationships between them. These methods partition the space of the explanatory variables and associate a value of the target variable to each element of the partition. Their output is easy to interpret and allows for making predictions conditionally on the values of the explanatory variables. This chapter reviews some important aspects of Breiman's CART algorithm and discusses some features of classification and regression trees and random forests. Some original applications to nowcasting of financial time series and to default prediction in small and medium-size enterprises are given.