Uncovering Sparsity and Heterogeneity in Firm-Level Return Predictability Using Machine Learning

Theodoros Evgeniou, Ahmed Guecioueur, and Rodolfo Prieto

We develop an approach that combines the estimation of monthly firm-level expected returns with an assignment of firms to (possibly latent) groups, both based on observable characteristics, using machine learning principles with linear models. The best-performing methods are flexible two-stage sparse models that capture predictive relationships that vary with group membership. Portfolios formed to exploit these group-varying predictions, based on a parsimonious set of characteristics, deliver economically meaningful returns with low turnover. We propose statistical tests based on nonparametric bootstrapping for our results and detail how different characteristics may matter for different groups of firms, drawing comparisons with the existing literature.
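
To make the two-stage idea concrete, the following is a minimal illustrative sketch and not the estimator used in the paper. It assumes a hypothetical stage-one rule (a median split on a single observable characteristic, here called size) and a stage-two lasso fitted separately within each group by coordinate descent. The characteristic names, the toy data, the penalty value, and all function names are assumptions introduced purely for illustration.

```python
from statistics import median


def soft_threshold(value, penalty):
    """Shrink value toward zero; return exactly 0.0 inside [-penalty, penalty]."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_fit(X, y, penalty, n_sweeps=50):
    """Coordinate-descent lasso without intercept.

    Minimizes (1 / 2n) * ||y - X b||^2 + penalty * ||b||_1.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # covariance of feature j with the partial residual
            norm = 0.0  # sum of squares of feature j
            for i in range(n):
                pred_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_others)
                norm += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, penalty) / (norm / n) if norm > 0 else 0.0
    return beta


def assign_groups(size):
    """Stage 1 (hypothetical rule): split firms at the median of one characteristic."""
    cut = median(size)
    return ["small" if s <= cut else "large" for s in size]


def fit_by_group(X, y, groups, penalty):
    """Stage 2: fit a separate sparse linear model within each group."""
    models = {}
    for g in sorted(set(groups)):
        rows = [i for i, gi in enumerate(groups) if gi == g]
        models[g] = lasso_fit([X[i] for i in rows], [y[i] for i in rows], penalty)
    return models


# Toy data (invented): columns of X are two hypothetical characteristics,
# (momentum, value). Returns load on momentum for small firms and on value
# for large firms.
size = [1, 2, 3, 4, 5, 6, 7, 8]
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]] * 2
y = [2, -2, 2, -2, 3, 3, -3, -3]

groups = assign_groups(size)
models = fit_by_group(X, y, groups, penalty=0.1)
```

In this toy setting the fitted model for each group retains only the characteristic that actually drives that group's returns, which is the kind of group-varying sparsity the abstract describes. A pooled fit across all firms would instead blend the two relationships.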