RAPM Prior Improvements
Posted: Thu Jan 25, 2024 2:36 pm
I have been working on a revised RAPM prior for use with long-term (i.e. 3+ years) RAPM datasets.
I demonstrated some years ago that minutes per game (MPG) is a very good prior. I use regressed MPG (ReMPG), which adds 4 games of 0 MPG, to adjust players who played very few minutes.
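A minimal sketch of the regression step, assuming "4 games of 0 MPG" means adding four phantom games with zero minutes to the denominator. The function and parameter names are hypothetical; the exact formula is not given in this post.

```python
def regressed_mpg(total_minutes: float, games_played: int, prior_games: int = 4) -> float:
    """Shrink minutes per game toward 0 by adding `prior_games` games of 0 minutes.

    ASSUMPTION: this is one natural reading of "4 games of 0 MPG",
    not a formula confirmed by the post.
    """
    return total_minutes / (games_played + prior_games)


# A player with 60 minutes in 2 games has a raw MPG of 30,
# but the regressed value is pulled down hard.
print(regressed_mpg(60.0, 2))     # 10.0

# Over a full season the adjustment is small (raw MPG here is 30).
print(regressed_mpg(2460.0, 82))  # about 28.6
```

The effect fades as games accumulate, so only small samples are moved much.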
What I found is that MPG has a consistent, positive, linear relationship with offensive production. However, that relationship breaks down for stars, whose minutes are "capped" but whose production can be far above the linear trend.
On defense, I found almost no relationship between MPG and defensive production. This suggests that playing time is allocated primarily on offensive production, which makes sense because the spread in offensive production appears to be significantly wider than the spread in defensive production.
In addition, the "bench" (0 MPG) intercept is well below zero on offense, around -4 pts/100 poss, while the defensive intercept is very close to 0.
Recently I've explored a 2D position spectrum, with results here: https://public.tableau.com/views/ofNBAS ... zHome=no#1
I applied those positions to a revision of this RAPM prior. The offensive creation role had a mild positive effect on offensive production: only +1.5 from the minimum creation role to the maximum. The "size" position had a +2.5 spread on defensive production from minimum size to maximum size.
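As a quick arithmetic check, these spreads can be compared with the per-unit slopes listed in the full values table of this post (0.375 for offensive role, 0.625 for size). The sketch below only divides one by the other. The variable names are hypothetical, and the endpoints of the position scale are not stated in the post.

```python
# Min-to-max spreads quoted in the paragraph above.
OFF_ROLE_SPREAD = 1.5   # offense, minimum to maximum creation role
SIZE_SPREAD = 2.5       # defense, minimum to maximum size

# Per-unit slopes from the full values table in this post.
OFF_ROLE_SLOPE = 0.375
SIZE_SLOPE = 0.625


def implied_span(spread: float, slope: float) -> float:
    """Width of the position axis implied by a spread and a per-unit slope."""
    return spread / slope


print(implied_span(OFF_ROLE_SPREAD, OFF_ROLE_SLOPE))  # 4.0
print(implied_span(SIZE_SPREAD, SIZE_SLOPE))          # 4.0
```

Both axes come out four units wide, which would fit a 1-to-5 or a 0-to-4 scale; which one is used is not something this check can tell.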
Finally, to handle the "star" issues with the MPG linear relationship with production, I pulled in NBA awards data. Yes, I know this could be faulty and could overrate Kobe, but it proved useful nonetheless. Here are the values I found for the awards:
Code:
                   Off      Def    Total
Award_DPOY       -1.500    3.000    1.5
Award_MVP         2.500    0.000    2.5
Award_AllNBA1st   1.500    0.000    1.5
Award_AllNBA2nd   1.000    0.000    1.0
Award_AllNBA3rd   1.000    0.000    1.0
Award_AllDef1st  -0.500    1.500    1.0
Award_AllDef2nd  -0.500    1.500    1.0
Award_AllStar     1.000    0.000    1.0
Note: the DPOY and MVP values are scaled by the actual award shares rather than a boolean True/False, so only a unanimous MVP or DPOY gets the full credit. The awards are also additive: the MVP is usually also All-NBA 1st team and an All-Star.
The end result is a prior that has a 0.87 offensive correlation and a 0.59 defensive correlation with a long-term RAPM dataset.
Here are the full values I'm using:
Code:
                          Off      Def
Award_DPOY              -1.500    3.000
Award_MVP                2.500    0.000
Award_AllNBA1st          1.500    0.000
Award_AllNBA2nd          1.000    0.000
Award_AllNBA3rd          1.000    0.000
Award_AllDef1st         -0.500    1.500
Award_AllDef2nd         -0.500    1.500
Award_AllStar            1.000    0.000
Size Position Slope      0.000    0.625
Offensive Role Slope     0.375    0.000
Intercept               -4.500   -1.250
Team Rtg Intercept Adj   0.070    0.000
The coefficient on ReMPG is calculated separately for each team after applying all of the above values, such that the total team production sums to the team's adjusted offensive or defensive efficiency.
EDIT: See post below for updated values.
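To make the assembly of the prior concrete, here is a rough sketch of how the values above could be combined. Each player gets a fixed part (intercept, position slopes, awards), and one ReMPG coefficient per team is then solved so that the team total matches the team rating. Several things here are assumptions rather than details from the post: the `Player` fields and function names are hypothetical, the team sum is taken as minute-weighted with weights adding to 5 (five players on the floor), the team rating is taken as points per 100 possessions relative to league average, and the "Team Rtg Intercept Adj" row is left out because the post does not say how it is applied.

```python
from dataclasses import dataclass, field

OFF, DEF = 0, 1

# (offense, defense) values copied from the full table in this post.
AWARD_VALUES = {
    "DPOY": (-1.5, 3.0),
    "MVP": (2.5, 0.0),
    "AllNBA1st": (1.5, 0.0),
    "AllNBA2nd": (1.0, 0.0),
    "AllNBA3rd": (1.0, 0.0),
    "AllDef1st": (-0.5, 1.5),
    "AllDef2nd": (-0.5, 1.5),
    "AllStar": (1.0, 0.0),
}
INTERCEPT = (-4.5, -1.25)
SIZE_SLOPE = (0.0, 0.625)
ROLE_SLOPE = (0.375, 0.0)


@dataclass
class Player:
    minutes: float   # total minutes played for the team
    rempg: float     # regressed minutes per game
    size_pos: float  # "size" coordinate (scale is an assumption)
    off_role: float  # offensive creation role coordinate (scale is an assumption)
    # award name -> share in [0, 1]; use 1.0 for yes/no awards
    awards: dict = field(default_factory=dict)


def fixed_part(p: Player, side: int) -> float:
    """Intercept + position terms + additive awards, before the ReMPG term."""
    value = INTERCEPT[side] + SIZE_SLOPE[side] * p.size_pos + ROLE_SLOPE[side] * p.off_role
    for award, share in p.awards.items():
        value += share * AWARD_VALUES[award][side]
    return value


def minute_weights(players: list) -> list:
    """ASSUMPTION: minute-share weights that sum to 5 (five players on court)."""
    total = sum(p.minutes for p in players)
    return [5.0 * p.minutes / total for p in players]


def rempg_coefficient(players: list, team_rating: float, side: int) -> float:
    """Solve for the team's ReMPG slope so the weighted team sum equals team_rating."""
    weights = minute_weights(players)
    fixed_sum = sum(w * fixed_part(p, side) for w, p in zip(weights, players))
    rempg_sum = sum(w * p.rempg for w, p in zip(weights, players))
    return (team_rating - fixed_sum) / rempg_sum


def player_priors(players: list, team_rating: float, side: int) -> list:
    """Per-player prior: fixed part plus the team-specific ReMPG term."""
    c = rempg_coefficient(players, team_rating, side)
    return [fixed_part(p, side) + c * p.rempg for p in players]


# Hypothetical five-man roster; every number below is made up for illustration.
roster = [
    Player(2800, 35.0, 3.0, 5.0, {"MVP": 1.0, "AllNBA1st": 1.0, "AllStar": 1.0}),
    Player(2400, 30.0, 2.0, 2.0),
    Player(2000, 25.0, 4.0, 1.0),
    Player(1500, 18.0, 1.0, 3.0),
    Player(1000, 12.0, 5.0, 1.0),
]
offensive_priors = player_priors(roster, team_rating=3.0, side=OFF)
```

Solving for the coefficient last means the awards and position terms are taken as given, and ReMPG absorbs whatever is left of the team rating.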