Prediction with BoxScore totals
Posted: Sun Sep 09, 2012 1:29 pm
I wanted to see which BoxScore stats predict 5on5 matchups in the following season the best, so I did this:
Using BoxScore totals, find the weighting scheme that gives the lowest MSE between expected and actual points per possession for every 5on5 matchup in the following season (ignoring matchups that contain players we've never seen before).
So, '07 BoxScore totals were used to create player values which then were used to predict '08.
'08 BoxScore totals were used to predict '09, and '09 was used to predict '10 (I stopped here)
The goal was to find the weighting scheme that minimizes prediction error for all 3 forecasted seasons ('08+'09+'10) combined.
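To make the error measure concrete, here is a rough sketch of the MSE over next-season matchups, skipping any matchup with a player we have no rating for. The names (Matchup, matchup_mse, predict_ppp) are made up for illustration, the way a lineup's expected points per possession is built from player values is left as a placeholder function, and every matchup is weighted equally, which is an assumption.

```python
from typing import Callable, Iterable, NamedTuple, Optional, Set, Tuple


class Matchup(NamedTuple):
    offense: Tuple[str, ...]   # five player ids on offense
    defense: Tuple[str, ...]   # five player ids on defense
    actual_ppp: float          # observed points per possession


def matchup_mse(
    matchups: Iterable[Matchup],
    known_players: Set[str],
    predict_ppp: Callable[[Matchup], float],
) -> Optional[float]:
    """Mean squared error of expected vs actual points per possession.

    Matchups containing a player not in known_players are ignored.
    Each matchup counts equally (an assumption; weighting by
    possessions would be another option).
    """
    squared_errors = []
    for m in matchups:
        if not all(p in known_players for p in m.offense + m.defense):
            continue  # contains a player we've never seen before
        error = predict_ppp(m) - m.actual_ppp
        squared_errors.append(error * error)
    if not squared_errors:
        return None  # nothing left to score
    return sum(squared_errors) / len(squared_errors)
```

Summing this over the three forecast seasons and searching over weights would give the combined objective described above.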
For offense, I came up with
(0.000007*minutes + 0.27*FGM - 0.06*FGA + 0.25*3PM + 0.14*3PA + 0.34*FTM - 0.07*FTA + 0.38*OReb - 0.03*DReb + 0.02*TReb + 0.25*ASS + 0.06*Steal - 0.02*Block - 0.3*Turnover + 0.1*Foul - 0.02*Points - 0.08) * 100
and defense
(0.000009*minutes + 0.1*FGM - 0.02*FGA + 0.3*3PM - 0.06*3PA + 0.04*FTM + 0.02*FTA + 0.0*OReb + 0.22*DReb + 0.0*TReb + 0.02*ASS + 0.8*Steal + 0.37*Block - 0.2*Turnover - 0.04*Foul - 0.03*Points - 0.06) * 100
(everything but minutes is per minute)
Players with <200 minutes in a season had their rating influenced by minutes only.
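As a sketch of how the offensive formula turns season totals into a rating: the weights below are the ones quoted above, but the function name, the dict layout, and the handling of low-minute players are my own reading. In particular, "minutes only" is taken to mean the minutes term plus the constant, which is an assumption.

```python
# Offensive weights from the post; all of these apply to per-minute rates.
OFFENSE_WEIGHTS = {
    "FGM": 0.27, "FGA": -0.06, "3PM": 0.25, "3PA": 0.14,
    "FTM": 0.34, "FTA": -0.07, "OReb": 0.38, "DReb": -0.03,
    "TReb": 0.02, "ASS": 0.25, "Steal": 0.06, "Block": -0.02,
    "Turnover": -0.3, "Foul": 0.1, "Points": -0.02,
}
MINUTES_WEIGHT = 0.000007   # applied to raw season minutes
INTERCEPT = -0.08
MIN_MINUTES = 200           # below this, box score rates are ignored


def offense_rating(totals):
    """Offensive rating from one season of box score totals.

    totals maps "minutes" and the stat names above to season totals.
    Assumption: players under MIN_MINUTES keep only the minutes term
    and the constant.
    """
    minutes = totals["minutes"]
    value = MINUTES_WEIGHT * minutes + INTERCEPT
    if minutes >= MIN_MINUTES:
        for stat, weight in OFFENSE_WEIGHTS.items():
            value += weight * totals.get(stat, 0) / minutes  # per-minute rate
    return value * 100
```

The defensive rating would work the same way with the defensive weights, 0.000009 for minutes and -0.06 for the constant.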
Player ratings are here
http://stats-for-the-nba.appspot.com/PBP/ranking09.html
http://stats-for-the-nba.appspot.com/PBP/ranking08.html
http://stats-for-the-nba.appspot.com/PBP/ranking07.html
In the end, I'm probably going to use this as priors for RAPM, but I'll have to check first whether "BoxScoreTotals informed RAPM" outperforms "RAPM informed RAPM" in forecasting.
The player ratings look mostly OK to me, so I guess it passes the smell test. Obviously it's not without some weird names at the top.
'07 has Arenas #1 (back then he had a 24 PER)
'08 has Camby at #3, Stoudemire #6, Biedrins ~#15
'09 has Troy Murphy #7, Nate Robinson #12