I want to predict a player's contribution in a season that has not started yet, using data from completed seasons.
Predicted ratings become extremely good or extremely bad when players who played few minutes recorded extreme stats.
So I made a method in which each stat is regressed toward the mean first, and the all-in-one metric is calculated afterwards. I call this method 'linear padding'.
Linear padding is implemented with linear regression models. For example, consider predicting 2022-23 season EFF (efficiency), the simplest all-in-one metric, from older data. In this example, stats are adjusted to per-36-minute values before EFF is calculated.
To calculate linear-padded pts, each player's pts in the 2021-22 season are regressed on that player's pts in the 2020-21 season. The same regression is fitted for every stat used to calculate EFF. These regression models are then applied to the 2021-22 season stats to predict each stat for the 2022-23 season, and the predicted 2022-23 season stats are used to calculate EFF.
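The steps above could be sketched roughly as follows. This is only a minimal sketch under several assumptions: the column names in `STATS`, the helper names, and the synthetic data are all made up for illustration, the EFF formula is the usual box-score one, and only players who appear in both seasons are included. The synthetic numbers are not B-League data and are not realistic (for example, makes are not forced to be below attempts).

```python
import numpy as np

# Assumed names for the per-36-minute stats that enter EFF.
STATS = ["pts", "reb", "ast", "stl", "blk", "fgm", "fga", "ftm", "fta", "tov"]

def eff(s):
    """Usual EFF: positive box-score stats minus missed shots and turnovers."""
    return (s["pts"] + s["reb"] + s["ast"] + s["stl"] + s["blk"]
            - (s["fga"] - s["fgm"]) - (s["fta"] - s["ftm"]) - s["tov"])

def fit_padding(prev, curr):
    """Least-squares line curr = slope * prev + intercept for one stat."""
    slope, intercept = np.polyfit(prev, curr, deg=1)
    return slope, intercept

def linear_padded_eff(season_a, season_b):
    """Fit each stat's line on season_a -> season_b, apply it to season_b,
    and compute EFF from the predicted (padded) stats."""
    padded = {}
    for k in STATS:
        slope, intercept = fit_padding(season_a[k], season_b[k])
        padded[k] = slope * season_b[k] + intercept
    return eff(padded)

# Synthetic stand-in data: a stable "talent" level plus independent noise
# in each season (hypothetical numbers, same players in the same order).
rng = np.random.default_rng(0)
n_players = 200
talent = {k: rng.normal(10.0, 2.0, n_players) for k in STATS}
s2020 = {k: talent[k] + rng.normal(0.0, 2.0, n_players) for k in STATS}
s2021 = {k: talent[k] + rng.normal(0.0, 2.0, n_players) for k in STATS}

raw_eff = eff(s2021)                          # raw 2021-22 EFF
padded_eff = linear_padded_eff(s2020, s2021)  # predicted 2022-23 EFF

# The padded values are less spread out than the raw ones.
print(raw_eff.std(), padded_eff.std())
```

Because each fitted slope is usually below 1 when the stats are noisy, extreme values are pulled toward the league average before EFF is computed.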
Using data from the Japanese professional basketball league (B-League), I compared how strongly linear-padded EFF and raw EFF from the 2021-22 season correlate with actual 2022-23 season EFF. I also calculated RMSE. As a result, linear padding made the correlation coefficient higher (0.869 vs. 0.854, p<0.01) and the RMSE smaller (2.41 vs. 2.53).
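For reference, the two evaluation measures could be computed like this. It is a minimal sketch: the function names and the toy numbers are hypothetical and are not the B-League data, and the significance test behind the p-value is not reproduced here.

```python
import numpy as np

def pearson_r(pred, actual):
    """Pearson correlation coefficient between predictions and actual values."""
    return float(np.corrcoef(pred, actual)[0, 1])

def rmse(pred, actual):
    """Root mean squared error of the predictions."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((pred - actual) ** 2)))

# Toy numbers only: next-season EFF and two competing predictions.
actual_next = np.array([10.0, 14.0, 18.0, 22.0])
raw_pred = np.array([6.0, 13.0, 19.0, 28.0])      # raw previous-season EFF
padded_pred = np.array([9.0, 13.5, 18.5, 24.0])   # linear-padded EFF

for name, pred in [("raw", raw_pred), ("padded", padded_pred)]:
    print(name, pearson_r(pred, actual_next), rmse(pred, actual_next))
```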
With linear padding, the between-season correlation of EFF is higher and the prediction error is smaller than with raw stats. So I guess that not only EFF but also more sophisticated metrics would become more stable with linear padding.
Please tell me about other methods that improve the stability of all-in-one metrics. Thanks.
Stabilize all-in-one metrics
Re: Stabilize all-in-one metrics
Nice work!
Padding statistics to achieve stabilization is an informal form of using a Bayesian prior, i.e., information about the overall talent pool distribution, to regress the data toward a reasonable prior. There are a number of articles on the topic.
Kostya Medvedovsky wrote one of the best summary articles on the topic here: https://kmedved.com/2020/08/06/nba-stab ... -approach/
In essence the whole DARKO projection system is using an advanced form of this approach to achieve Bayesian projections for each statistic.
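As a rough illustration of the padding idea mentioned above, one common form adds a fixed number of league-average attempts to each player's record before computing a rate. This is only a sketch: the function name, the pad size, and the numbers are hypothetical and are not taken from the linked article or from DARKO.

```python
def padded_rate(successes, attempts, prior_rate, pad_attempts):
    """Rate after adding pad_attempts imaginary attempts made at prior_rate.

    A larger pad_attempts pulls the result harder toward prior_rate, and
    players with few real attempts are pulled the most.
    """
    return (successes + prior_rate * pad_attempts) / (attempts + pad_attempts)

# Hypothetical free-throw example: league average 75%, pad of 100 attempts.
low_volume = padded_rate(9, 10, 0.75, 100)       # 90% on 10 attempts
high_volume = padded_rate(900, 1000, 0.75, 100)  # 90% on 1000 attempts
print(low_volume, high_volume)
```

The low-volume shooter ends up close to the league average, while the high-volume shooter stays close to the observed 90%.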