Reconstructing Box Plus/Minus

DSMok1
Posts: 899
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Post by DSMok1 » Fri Apr 12, 2019 4:55 pm

Hello:

I have been preparing for some time now to reconstruct Box Plus/Minus (BPM), with a goal of addressing the major existing issues.

Here are some issues that have been identified (I will add more to this post as more are brought forward):
  1. Poor handling of outliers on offense
  2. Mishandling of interaction terms (related to the first)
  3. Poor estimation of defense
  4. Poor handling of blocks (as shown by college BPM being dominated by block%)
Some of these can be readily addressed by changing the statistic entirely. However, I don't want to do that. BPM will remain a box-score-based statistic. The goal is to make a statistic that can easily be applied to other leagues and contexts that do not have such good data coverage. So these are the constraints:
  1. Box score stats only (i.e. anything that can be calculated from the stats we have from the 80s.)
  2. No PbP stats, not even things like "assisted by" ratios.
  3. Nothing super complex that can't be done by someone with Excel and a good knowledge of math.
  4. Focus on Explanation, not Prediction. What happens should be credited to the team. No luck adjustment. (A good explanatory stat can be converted to a predictive stat with appropriate regression to the mean.)
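
As a minimal sketch of the last constraint's parenthetical, assuming a simple minutes-weighted shrinkage: the function name, the league-average prior of 0.0, and the `prior_minutes` constant of 1000 are illustrative assumptions, not part of BPM.

```python
def regress_to_mean(rating, minutes, prior_mean=0.0, prior_minutes=1000.0):
    """Shrink an explanatory rating toward prior_mean based on sample size.

    prior_minutes is a hypothetical tuning constant: the number of minutes
    at which the observed rating and the prior receive equal weight.
    """
    weight = minutes / (minutes + prior_minutes)
    return prior_mean + weight * (rating - prior_mean)


# A +4.0 rating over 1000 minutes is pulled halfway to the prior.
print(regress_to_mean(4.0, 1000.0))  # 2.0
```

The explanatory number itself is left untouched; the shrinkage is only applied when a forecast is wanted.
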
To do this project, I have worked with an NBA team (special thanks to them!) to develop an improved RAPM basis. This basis provides average RAPM over 6-year eras and handles aging/role changes via a Bayesian prior. These shorter eras allow far better coverage of outliers (like LeBron).

Here is a sample of the data:


What I am interested in on this forum thread is to get ideas from the public about possible ways to reformulate the metric to achieve these goals as comprehensively as possible.

This will be an ongoing effort for some time.

For reference, the current BPM formulation is written up at https://www.basketball-reference.com/about/bpm.html
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1

eminence
Posts: 113
Joined: Sun Sep 10, 2017 8:20 pm

Re: Reconstructing Box Plus/Minus

Post by eminence » Fri Apr 12, 2019 5:59 pm

When we say 'box score', does that mean min/pts/oreb/dreb/ast/stl/blk/tov/pf along with shooting %'s?

What sample size are you targeting for maximum 'accuracy'? Anything from a single game to a full season (or more)? Should the other still be weighted?

Also, the 5yr RAPM looks really nice! I love Korver at near outlier level from '12-'16.

bbstats
Posts: 224
Joined: Thu Apr 21, 2011 8:25 pm
Location: Boone, NC

Post by bbstats » Fri Apr 12, 2019 6:05 pm

2 half-baked ideas:

1) get even more interaction terms (i.e. estimate what proportion of baskets are assisted etc)
2) find mathematically sound way of capping / preventing outlier seasons from breaking the prediction

for #2 you could theoretically just generate "possible stat lines" i.e. RWB's MVP season but no idea how you'd incorporate that into the rating


Post by bbstats » Fri Apr 12, 2019 6:07 pm

one good interaction that may or may not improve sample size is "turnovers vs expected" i.e. projected turnovers from stat line vs actual turnovers. that has a very good RSQ in general and to me seems pretty important

also (GamesStarted/GamesPlayed)^2 seems to be a good one
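
The "turnovers vs expected" idea above could be sketched roughly as follows. This is only an illustration: the single predictor (a ball-handling "load" such as FGA + 0.44 * FTA + AST), the one-variable least-squares fit, and all sample numbers are assumptions, not something specified in the thread.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def turnover_residuals(load, turnovers):
    """Actual minus projected turnovers (negative = fewer than expected)."""
    intercept, slope = fit_line(load, turnovers)
    return [t - (intercept + slope * x) for x, t in zip(load, turnovers)]


# Hypothetical per-100-possession values for four players.
load = [10.0, 20.0, 30.0, 40.0]   # assumed proxy: FGA + 0.44 * FTA + AST
turnovers = [1.0, 2.5, 2.5, 4.0]
residuals = turnover_residuals(load, turnovers)
```

The residual, rather than raw turnovers, would then be the candidate term in the rating.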

Crow
Posts: 6051
Joined: Thu Apr 14, 2011 11:10 pm

Post by Crow » Fri Apr 12, 2019 6:25 pm

Look at how much assist rate over role / position (PG, wing, big of primary, secondary, tertiary) contributes to team assist rate differential vs. league average. If you dominate the ball but don't make the team above average, you probably aren't as good as your assist rate alone suggests.

While the assist-rebound interactive term may have helped on average, it seemed to overly penalize the extremes. I'd see if you could dampen the extremes or reduce the size of the factor in general. This is not a great case for saying that having the same person rebound and distribute is more valuable than having two separate players at the same levels.

If you are going to have a versatility reward interactive term it should include scoring and probably shot defense.


Post by DSMok1 » Fri Apr 12, 2019 6:27 pm

eminence wrote:
Fri Apr 12, 2019 5:59 pm
When we say 'box score', does that mean min/pts/oreb/dreb/ast/stl/blk/tov/pf along with shooting %'s?

What sample size are you targeting for maximum 'accuracy'? Anything from a single game to a full season (or more)? Should the other still be weighted?

Also, the 5yr RAPM looks really nice! I love Korver at near outlier level from '12-'16.
Yes, standard box score. Playing time (minutes) should also be OK. I'm not sure about games started. I'd prefer not to include that, since it may not be readily available for all applications.

I would like to see the metric stabilize within a few hundred minutes. A stripped down version with no interaction terms would work better at the game level.

Yes, Korver is interesting, to be sure. Not spectacular numbers in those years...except for 3pt%. His off-ball movement always scrambled defenses, which a box score probably can't capture.

Post by DSMok1 » Fri Apr 12, 2019 6:29 pm

Crow wrote:
Fri Apr 12, 2019 6:25 pm
Look at how much assist rate over role / position (PG, wing, big of primary, secondary, tertiary) contributes to team assist rate differential vs. league average. If you dominate the ball but don't make the team above average, you probably aren't as good as your assist rate alone suggests.
This is an interesting perspective... Probably relates to blocks on the defensive side as well. (Hello, Hassan Whiteside)

Post by Crow » Fri Apr 12, 2019 6:42 pm

No play by play stats is a big choice.

Assuming you stick with that, it appears scoring defense is either the same for everybody via a team-level adjustment (regardless of who you guard, or whether you are even on the court), or not included at all... or estimated in a semi-complicated but doable and maybe helpful way.

What about estimating a starter's defense as 40% of the presumed starter matchup's scoring defense, 15% of the average of the starters at the next two closest positions, 10% of the average of the other two starters, 20% of the average of the sub matchups at the position, and 15% of the average of the subs at other positions? Defense is shared. This is rough, but it may be better than everybody getting the same grade or no grade at all.
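
The weighting proposed above can be written as a simple weighted average. The sketch below is only illustrative: the dictionary keys, the function name, and the sample numbers are hypothetical, and it assumes the five group averages of opponent scoring are already available, which requires matchup-level data.

```python
# Weights as proposed in the post above; they sum to 1.0.
STARTER_WEIGHTS = {
    "own_matchup_starter": 0.40,
    "adjacent_position_starters": 0.15,
    "other_starters": 0.10,
    "own_position_subs": 0.20,
    "other_position_subs": 0.15,
}


def starter_defense_estimate(opp_scoring):
    """Blend opponent scoring figures into one defensive estimate.

    opp_scoring maps each key in STARTER_WEIGHTS to the (already averaged)
    opponent scoring for that group, in any consistent unit.
    """
    return sum(w * opp_scoring[k] for k, w in STARTER_WEIGHTS.items())


# Hypothetical opponent scoring figures for one starter.
estimate = starter_defense_estimate({
    "own_matchup_starter": 20.0,
    "adjacent_position_starters": 16.0,
    "other_starters": 14.0,
    "own_position_subs": 10.0,
    "other_position_subs": 8.0,
})
```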


Post by DSMok1 » Fri Apr 12, 2019 6:45 pm

Crow wrote:
Fri Apr 12, 2019 6:42 pm
No play by play stats is a big choice.

Assuming you stick with that, it appears shot defense is either the same for everybody via a team-level adjustment (regardless of who you guard, or whether you are even on the court), or not included at all... or estimated in a semi-complicated but doable and maybe helpful way.
Something position-based could be workable (i.e. centers get more credit for 2 pointers?).

Post by Crow » Fri Apr 12, 2019 6:54 pm

I added one way of doing it above.

Your quick responses are encouraging.

My weights try to account for general substitution patterns and who guys switch onto the most. Reverse the substitution related proportions for subs.


Post by DSMok1 » Fri Apr 12, 2019 7:17 pm

Crow wrote:
Fri Apr 12, 2019 6:54 pm
I added one way of doing it above.

Your quick responses are encouraging.

My weights try to account for general substitution patterns and who guys switch onto the most. Reverse the substitution related proportions for subs.
I will be working with season-level data, so something based on individual matchups will not be feasible. In other words, we may have season-long opponent 2pt%, but we won't have it broken out by position.

sbs
Posts: 14
Joined: Fri Oct 19, 2012 7:25 am

Post by sbs » Fri Apr 12, 2019 8:21 pm

DSMok1 wrote:
Fri Apr 12, 2019 4:55 pm
To do this project, I have worked with an NBA team (special thanks to them!) to develop an improved RAPM basis. This basis provides average RAPM over 6 year eras, and handles aging/role changes via a Bayesian prior. These shorter eras allow a far better coverage of outliers (like LeBron).
Can you go into any more detail on the approach here?


Post by DSMok1 » Fri Apr 12, 2019 8:33 pm

sbs wrote:
Fri Apr 12, 2019 8:21 pm
DSMok1 wrote:
Fri Apr 12, 2019 4:55 pm
To do this project, I have worked with an NBA team (special thanks to them!) to develop an improved RAPM basis. This basis provides average RAPM over 6 year eras, and handles aging/role changes via a Bayesian prior. These shorter eras allow a far better coverage of outliers (like LeBron).
Can you go into any more detail on the approach here?
Briefly: construct a simple prior based on MPG and team quality. Subtract that out of matchup data, run the RAPM, and then add it back in as a postprocessing step.
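
A toy sketch of that subtract, fit, add-back sequence, using plain ridge regression. Everything here is an assumption made for illustration: the stint encoding (+1 home player, -1 away player), the penalty `lam`, and the function names. The prior is taken as given, so the MPG/team-quality construction, possession weighting, and offense/defense splits are not shown.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rapm_with_prior(stints, margins, prior, lam):
    """Ridge-regress the margin left unexplained by the prior, then add it back."""
    n = len(prior)
    # Step 1: subtract what the prior already explains from each stint margin.
    resid = [y - sum(x * p for x, p in zip(row, prior))
             for row, y in zip(stints, margins)]
    # Step 2: ridge regression on the residuals: (X'X + lam*I) d = X'resid.
    xtx = [[sum(row[i] * row[j] for row in stints) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    xty = [sum(row[i] * r for row, r in zip(stints, resid)) for i in range(n)]
    delta = solve(xtx, xty)
    # Step 3: add the prior back in as a postprocessing step.
    return [p + d for p, d in zip(prior, delta)]


# One stint: player 0 (home) vs player 1 (away), home side +4, flat prior.
ratings = rapm_with_prior([[1.0, -1.0]], [4.0], [0.0, 0.0], lam=1.0)
print(ratings)  # roughly [1.33, -1.33]
```

With this setup the ridge penalty shrinks each player toward the prior rather than toward zero, which is the point of subtracting it out first.
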

Post by eminence » Fri Apr 12, 2019 8:59 pm

I'm not sure how position would be implemented reliably from the box score, height/weight or what?


Post by DSMok1 » Fri Apr 12, 2019 9:07 pm

eminence wrote:
Fri Apr 12, 2019 8:59 pm
I'm not sure how position would be implemented reliably from the box score, height/weight or what?
It is relatively easy to determine position strictly from the box score. No, height/weight would not be included.