a) more data
b) I've included stats that are not available in the BoxScore and thus have to be figured out through the PBP (see here)
I'm posting results of the first run, but more versions are sure to come. Thus far I've kept it relatively simple from the 'shooting' standpoint, only including PPS and '# of shots', with the latter being FGA + (FTA - AND1s)/2, where AND1s includes 'unsuccessful AND1s'. I'm definitely open to suggestions, whether to include more variables (they have to be either on BBR or something I can figure out through the PBP) or to try certain interaction effects.
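For anyone who wants to reproduce the shooting variables, here is a minimal sketch of the '# of shots' formula above. The PPS definition (points divided by that shot count) is my assumption about how it is computed, and the function names and the season line are made up for illustration.

```python
def shot_count(fga, fta, and1s):
    """'# of shots' as defined above: FGA + (FTA - AND1s) / 2."""
    return fga + (fta - and1s) / 2


def points_per_shot(pts, fga, fta, and1s):
    """PPS, ASSUMED here to be points divided by the shot count."""
    shots = shot_count(fga, fta, and1s)
    return pts / shots if shots else 0.0


# Hypothetical season line, not real data.
shots = shot_count(fga=1000, fta=400, and1s=40)
pps = points_per_shot(pts=1500, fga=1000, fta=400, and1s=40)
print(shots, round(pps, 3))
```

The AND1 free throws are subtracted because those trips to the line do not use up an extra possession beyond the FGA that is already counted.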
Another 'fun' exercise could be to include already existing metrics like PER and ORtg/DRtg in the regression. This is potentially interesting because the regression would be able to tell us which metric is 'better' (since '02) and could point out potential weaknesses of each metric. Say we include PER and various BoxScore stats, and PER gets a positive coefficient for offense. If we then get a large negative coefficient for 'FGA', we 'know' PER overrates the chuckers (again, since '02; I can't make any statements for the years before).
I'm using z-scores for the variables, so a coefficient of 2 has twice the impact on the player rating as a coefficient of 1 does (per standard deviation of the variable).
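As a quick illustration of the z-scoring step, here is a rough sketch. It assumes the population standard deviation across all players; a sample standard deviation would work just as well and would only rescale the coefficients slightly. The function name and the numbers are made up.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # ASSUMPTION: population SD, not sample SD
    if sigma == 0:
        # A constant variable carries no information; map everything to 0.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical per-player assist numbers, not real data.
ast = [2.0, 4.0, 6.0, 8.0]
z = z_scores(ast)
print([round(v, 3) for v in z])
```

After this step every variable is on the same scale, which is what makes the coefficients directly comparable to each other.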
Coefficients for offense
MP 2.92
pps 1.94
shots 0.88
AST 0.83
off_rebound_ft 0.60
off_rebound_fg 0.57
STL 0.30
height 0.19
weight 0.11
blocks_to_off 0.06
dead_to -0.04
blocks_to_def -0.13
live_to -0.16
goaltends -0.16
def_rebound_fg -0.42
G -0.56
def_rebound_ft -0.66
PF -1.22
GS -1.55
Coefficients for defense
def_rebound_fg 1.37
blocks_to_def 1.11
STL 1.05
GS 1.04
height 0.56
AST 0.47
off_rebound_ft 0.36
weight 0.34
blocks_to_off 0.31
pps 0.23
MP 0.02
G -0.07
PF -0.08
def_rebound_ft -0.18
goaltends -0.37
dead_to -0.40
live_to -0.45
off_rebound_fg -0.52
shots -1.10
- 'off_rebound_fg' is an offensive rebound after a FGA, rather than an FTA. Vice versa for 'off_rebound_ft'.
- goaltends are defensive goaltends only
- 'live_to'/'dead_to' are live/dead ball turnovers
- 'blocks_to_def' are blocks that get recovered by the defensive team; 'blocks_to_off' are blocks recovered by the offensive team
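To make the tables concrete, here is a rough sketch of how the offensive coefficients could be turned into a player rating: multiply each z-scored stat by its coefficient and add everything up. The coefficients are copied from the offense table, but the player's z-scores are invented, and I'm ignoring any intercept or rescaling the actual model may apply.

```python
# Offensive coefficients, copied from the table above.
OFF_COEF = {
    "MP": 2.92, "pps": 1.94, "shots": 0.88, "AST": 0.83,
    "off_rebound_ft": 0.60, "off_rebound_fg": 0.57, "STL": 0.30,
    "height": 0.19, "weight": 0.11, "blocks_to_off": 0.06,
    "dead_to": -0.04, "blocks_to_def": -0.13, "live_to": -0.16,
    "goaltends": -0.16, "def_rebound_fg": -0.42, "G": -0.56,
    "def_rebound_ft": -0.66, "PF": -1.22, "GS": -1.55,
}


def rating(z_stats, coefs):
    """Sum of coefficient * z-score; a missing stat counts as average (z = 0)."""
    return sum(coef * z_stats.get(name, 0.0) for name, coef in coefs.items())


# Hypothetical player: z-scores are made up, everything else is league average.
player_z = {"MP": 1.0, "pps": 0.5, "shots": 1.0, "AST": -0.5, "GS": 1.0}
off = rating(player_z, OFF_COEF)
print(round(off, 3))
```

The same function works for defense by swapping in the defensive coefficients.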
The large difference between the (defensive) coefficients for 'blocks_to_def' vs. 'blocks_to_off' and 'def_rebound_fg' vs. 'def_rebound_ft' tells me there's definitely value in extracting those kinds of stats from the PBP
The player values for '13-'14 are here. The model is not a fan of guards that don't assist much and don't shoot very well. It has the Lopez brothers in the top 10, which is definitely a little weird, although they also sport a 125/121 ORtg and 108/107 DRtg this season
R^2 between '14 SPM and RAPM is 0.35 for offense and 0.2 for defense. I'm not gunning for higher R^2 (I'm gunning for lower OOS prediction error for offensive efficiency of 5-on-5 lineups), but higher R^2 is probably desirable, too
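For reference, here is a minimal sketch of an R^2 between two sets of player ratings, computed as the squared Pearson correlation. That is an assumption about the calculation; with a single predictor it matches the R^2 of a simple linear regression of one rating on the other. The ratings below are invented.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical SPM and RAPM values for four players, not real data.
spm = [1.0, 2.0, 3.0, 4.0]
rapm = [1.0, 3.0, 2.0, 4.0]
r2 = r_squared(spm, rapm)
print(round(r2, 2))
```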