Penalized regression models for Major League Baseball metrics
Panda, Mushimie Lona
Major League Baseball offers a multitude of statistics for evaluating a player's performance and achievements. In recent years, traditional statistics have been increasingly supplemented by more sophisticated modern metrics, which raises the question of which metrics have real predictive power. To address this question, we use penalized regression models to determine which offensive and defensive metrics are consistent measures of a player's ability. Penalized linear regression techniques, which have shrinkage and variable selection mechanisms, have been widely used to analyze high-dimensional data. We use three popular regularized regression methods in our data analysis: Lasso, Elastic Net, and SCAD. We apply these regularized regression models to a set of thirty-one offensive metrics and five defensive metrics. The results indicate that two defensive metrics stand out in distinguishing players across time and that the offensive metrics can be reduced to seven, a substantial reduction in dimensionality.
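
For reference, the three penalties named in the abstract are commonly written as below. This is a sketch of the standard textbook forms, not necessarily the parameterization used in the thesis: the symbols $\beta_j$ (regression coefficients), $\lambda$ (tuning parameter), $\alpha$ (Elastic Net mixing weight), and $a$ (SCAD shape parameter) are notation assumed here, as is the $1/(2n)$ scaling of the squared-error loss.

```latex
% Penalized least squares: squared-error loss plus a coefficient penalty (assumed scaling)
\hat{\beta} = \arg\min_{\beta}\; \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2 + \sum_{j=1}^{p} p_{\lambda}(|\beta_j|)

% Lasso: L1 penalty, which shrinks coefficients and can set some exactly to zero
p_{\lambda}(|\beta_j|) = \lambda\,|\beta_j|

% Elastic Net: convex mix of L1 and squared L2 penalties, with 0 \le \alpha \le 1
p_{\lambda}(|\beta_j|) = \lambda\left[\alpha\,|\beta_j| + \frac{1-\alpha}{2}\,\beta_j^{2}\right]

% SCAD: matches the Lasso near zero, then levels off so large coefficients are not over-shrunk (a > 2)
p_{\lambda}(|\beta_j|) =
\begin{cases}
\lambda\,|\beta_j|, & |\beta_j| \le \lambda,\\[4pt]
\dfrac{2a\lambda\,|\beta_j| - \beta_j^{2} - \lambda^{2}}{2(a-1)}, & \lambda < |\beta_j| \le a\lambda,\\[8pt]
\dfrac{\lambda^{2}(a+1)}{2}, & |\beta_j| > a\lambda.
\end{cases}
```

In all three cases, a metric is dropped from the model when its estimated coefficient is shrunk exactly to zero, which is the variable selection mechanism the abstract refers to.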