Reconstructing Box Plus/Minus

eminence
Posts: 138
Joined: Sun Sep 10, 2017 8:20 pm

Re: Reconstructing Box Plus/Minus

Post by eminence » Wed Apr 17, 2019 5:04 pm

Did you just try to minimize squared error the first time around? If you're looking to rein in outliers, you could look at raising the p in your L^p norm minimization. Not sure how much that'd change your average error or how well it'd work, but it's worth looking into.
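
To make the L^p suggestion concrete, here is a minimal Python sketch, not anyone's actual BPM fitting code. It fits a one-parameter slope through the origin by minimizing the sum of |residual|^p. The data points and the names `lp_loss`, `fit_slope`, and `max_abs_residual` are made up for illustration.

```python
def lp_loss(slope, xs, ys, p):
    """Sum of |y - slope * x|**p over all points."""
    return sum(abs(y - slope * x) ** p for x, y in zip(xs, ys))


def fit_slope(xs, ys, p, iters=200):
    """Minimize the L^p loss over the slope with a ternary search.

    The loss is convex in the slope for p >= 1, and with all x > 0 the
    minimizer lies between the smallest and largest y/x ratio.
    """
    ratios = [y / x for x, y in zip(xs, ys)]
    lo, hi = min(ratios), max(ratios)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if lp_loss(m1, xs, ys, p) < lp_loss(m2, xs, ys, p):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


def max_abs_residual(slope, xs, ys):
    """Largest absolute miss of the fitted line."""
    return max(abs(y - slope * x) for x, y in zip(xs, ys))


# Made-up data: the last point is the outlier.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 2.1, 2.9, 4.0, 8.0]

for p in (2, 4):
    slope = fit_slope(xs, ys, p)
    print(p, round(slope, 4), round(max_abs_residual(slope, xs, ys), 4))
```

Raising p weights large misses more heavily, so the worst residual tends to shrink, usually at the cost of a somewhat larger typical error on the other points.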

DSMok1
Posts: 904
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Reconstructing Box Plus/Minus

Post by DSMok1 » Wed Apr 17, 2019 5:15 pm

RyanRiot wrote:
Wed Apr 17, 2019 4:20 pm
If you're comfortable with including a position variable, you can try including a position and 3PA rate interaction like ATC used in Dredge.
That's a good thought, I'll look into that.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1

DSMok1
Posts: 904
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Reconstructing Box Plus/Minus

Post by DSMok1 » Wed Apr 17, 2019 5:17 pm

eminence wrote:
Wed Apr 17, 2019 5:04 pm
Did you just try to minimize squared error the first time around? If you're looking to rein in outliers, you could look at raising the p in your L^p norm minimization. Not sure how much that'd change your average error or how well it'd work, but it's worth looking into.
I did originally, yes. Because I was using very long-term data, the spread of the box-score values was relatively compressed as well, increasing the issues with outliers.

I will look into that option.

bbstats
Posts: 224
Joined: Thu Apr 21, 2011 8:25 pm
Location: Boone, NC

Re: Reconstructing Box Plus/Minus

Post by bbstats » Fri Apr 19, 2019 1:37 pm

Random aside - are per-100 stats in the scope of BPM 2.0?

DSMok1
Posts: 904
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Reconstructing Box Plus/Minus

Post by DSMok1 » Fri Apr 19, 2019 2:39 pm

bbstats wrote:
Fri Apr 19, 2019 1:37 pm
Random aside - are per-100 stats in the scope of BPM 2.0?
Yes. They already are included in the current version, in essence.

Crow
Posts: 6208
Joined: Thu Apr 14, 2011 11:10 pm

Re: Reconstructing Box Plus/Minus

Post by Crow » Fri Apr 19, 2019 4:09 pm

Crow wrote:
Sat Apr 13, 2019 12:20 am
Instead of treating minutes played as a linear variable, what about a step-wise function? Maybe treat players over 25 minutes the same, or closer to the same than in the linear approach. 15-25 minutes per game and a high % of games when healthy as the second step. Then third and fourth steps. The argument here is that the LEVEL of how much you play is determined by relative quality, but the exact level of minutes is also influenced by team need, which is different. I dunno if this makes much difference but I float it as a possibility.

Why not include FT rate in current or future BPM? If 3pt rate can give bonuses, I'd want FT rate to give bonuses and penalties. It affects "space" and overall team scoring rates.

Personal fouls are not included at all? What was the analysis that led to that? Any re-think? What about deductions for technicals and ejections?

Any consideration of when a player shoots in the shot clock and during the game? Adjustments for carrying more or less of clutch-time and crunch-time shooting?

Any consideration of including on/off data (a la PIPM or metric blends)? Raw or RAPM. If not, what is the rationale or defense?

Should blocks against be treated as worse than regular misses? Or better? What does the data show for recovery rates, opponent points off those that result in a change of possession, and own points on second chances?

Charges taken? You probably aren't going to do that because the data probably isn't available deep in the past. But does BPM 2.0 have to go deep into the past? I mainly care about now and the future.

If versatility is important then it would seem that positional versatility is important too, in general or especially with regard to defense. Would you consider using Knarsu3's data on that?

If minutes are considered, what about age? Both have correlations. Which is stronger?

If minutes are considered, what about salaries? Both have correlations. Which is stronger?

Minutes relative to age and salary?

Instead of straight height, did you ever consider using height relative to league average for main position? Weight is probably shakier and less likely but not including is a choice.

FT% as a proxy of "true" shooting talent?

Any adjustments for "luck" of any kind?

Rewards or penalties for high or low usage beyond the broad mid-range? Linear or step-wise?

Bonuses for major award votes?

Being traded?

Draft pick #? College recruiting rank? I assume some GMs and analytic staffs use them before the draft. How much predictive value do they have for early career? And by that I mean, for this purpose, predictive for things not in the boxscore right now?
If you are strictly reporting performance, then a bonus for minutes played goes beyond that to estimating uncounted impact. Age and salary are similar ways of getting at perceived value.

None of the ideas raised in this post have gotten a response yet. Considering any or ignoring all?

DSMok1
Posts: 904
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Reconstructing Box Plus/Minus

Post by DSMok1 » Fri Apr 19, 2019 4:47 pm

Crow wrote:
Fri Apr 19, 2019 4:09 pm
Instead of treating minutes played as a linear variable, what about a step-wise function? Maybe treat players over 25 minutes the same, or closer to the same than in the linear approach. 15-25 minutes per game and a high % of games when healthy as the second step. Then third and fourth steps. The argument here is that the LEVEL of how much you play is determined by relative quality, but the exact level of minutes is also influenced by team need, which is different. I dunno if this makes much difference but I float it as a possibility.
I will definitely be including minutes in some manner. Very significant variable. The exact formulation needs some exploration. I want this to be transferable to non-NBA situations easily.
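
For reference, the step-wise idea in the quote could look something like the sketch below. Only the "over 25" and "15-25" bands come from the quoted post. The lower cutoff of 8 minutes, the four-tier numbering, and the name `minutes_tier` are assumptions for illustration, and the "high % of games when healthy" condition is left out to keep it short.

```python
def minutes_tier(mpg):
    """Map minutes per game to a coarse tier (1 = heaviest role)."""
    if mpg > 25:
        return 1  # "over 25 minutes" all treated the same
    if mpg >= 15:
        return 2  # the 15-25 band from the quoted post
    if mpg >= 8:
        return 3  # assumed cutoff, not from the thread
    return 4      # assumed lowest tier


for mpg in (36.0, 28.0, 20.0, 10.0, 4.0):
    print(mpg, minutes_tier(mpg))
```

A 36-minute and a 28-minute player land in the same tier here, which is the point of the suggestion: the tier reflects role, while the exact minutes within a tier are left to team need.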
Crow wrote:
Fri Apr 19, 2019 4:09 pm
Why not include FT rate in current or future BPM? If 3pt rate can give bonuses, I'd want FT rate to give bonuses and penalties. It affects "space" and overall team scoring rates.

Personal fouls are not included at all? What was the analysis that led to that? Any re-think? What about deductions for technicals and ejections?
I explored FT in the past and found no added information/significance. I will check them again. Maybe nonlinear will be more significant?
Crow wrote:
Fri Apr 19, 2019 4:09 pm
Any consideration of when a player shoots in the shot clock and during the game? Adjustments for carrying more or less of clutch-time and crunch-time shooting?

Any consideration of including on/off data (a la PIPM or metric blends)? Raw or RAPM. If not, what is the rationale or defense?

Should blocks against be treated as worse than regular misses? Or better? What does the data show for recovery rates, opponent points off those that result in a change of possession, and own points on second chances?

Charges taken? You probably aren't going to do that because the data probably isn't available deep in the past. But does BPM 2.0 have to go deep into the past? I mainly care about now and the future.
This is outside the scope of BPM, which will be based purely on box scores. I welcome others (like Jacob with PIPM) to expand beyond the box scores.
Crow wrote:
Fri Apr 19, 2019 4:09 pm
If versatility is important then it would seem that positional versatility is important too, in general or especially with regard to defense. Would you consider using Knarsu3's data on that?
I may use his data to attempt to construct something using box score data. Positions are currently not in BPM, but I feel they need to be explored further--as long as they can easily be estimated from box score data alone!
Crow wrote:
Fri Apr 19, 2019 4:09 pm
If minutes are considered, what about age? Both have correlations. Which is stronger?

If minutes are considered, what about salaries? Both have correlations. Which is stronger?

Minutes relative to age and salary?

Instead of straight height, did you ever consider using height relative to league average for main position? Weight is probably shakier and less likely but not including is a choice.
None of these are relevant outside of the NBA context, and they all require data beyond the box score, so I will not be exploring this avenue.
Crow wrote:
Fri Apr 19, 2019 4:09 pm
FT% as a proxy of "true" shooting talent?
This is a very good idea. I believe a "shooting" measure will be very significant in improving the regression.
Crow wrote:
Fri Apr 19, 2019 4:09 pm
Any adjustments for "luck" of any kind?
I don't think luck can be easily estimated from straight box score data. My current thought is to make this a purely explanatory metric, not predictive. This could be interesting to explore.
Crow wrote:
Fri Apr 19, 2019 4:09 pm
Rewards or penalties for high or low usage beyond the broad mid-range? Linear or step-wise?
I definitely think that "gravity/attention" should be explored in a nonlinear manner.
Crow wrote:
Fri Apr 19, 2019 4:09 pm
Bonuses for major award votes? Being traded?
Draft pick #? College recruiting rank? I assume some GMs and analytic staffs use them before the draft. How much predictive value do they have for early career? And by that I mean, for this purpose, predictive for things not in the boxscore right now?
None of these are relevant outside of the NBA context, and they all require data beyond the box score, so I will not be exploring this avenue.
Crow wrote:
Fri Apr 19, 2019 4:09 pm
If you are strictly reporting performance, then a bonus for minutes played goes beyond that to estimating uncounted impact. Age and salary are similar ways of getting at perceived value.

None of the ideas raised in this post have gotten a response yet. Considering any or ignoring all?
You've got responses now! Thanks for the input, Crow.

Crow
Posts: 6208
Joined: Thu Apr 14, 2011 11:10 pm

Re: Reconstructing Box Plus/Minus

Post by Crow » Fri Apr 19, 2019 5:40 pm

Just asking for what everyone else got without asking (that post was in response to the request for feedback).

Thanks for the replies on most ideas. Treatment of personal fouls, technicals / ejections not mentioned but I guess you've decided not to do anything there.

BPM is available for college basketball players at Basketball-Reference. Assume it is unaltered for the change of league? Not all the proportions of significance between metric components are likely to be the same. May not be that different but the 3 point factor is of greater importance there. Have you tried to encourage BRef to apply BPM to Euro basketball or WNBA?

College recruiting rank could be applied to college. I didn't expect you'd apply it, but was worth at least mentioning in a full sweep of possibles. Minutes can cover.

DSMok1
Posts: 904
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Reconstructing Box Plus/Minus

Post by DSMok1 » Fri Apr 19, 2019 5:51 pm

Crow wrote:
Fri Apr 19, 2019 5:40 pm
Just asking for what everyone else got without asking (that post was in response to the request for feedback).

Thanks for the replies on most ideas. Treatment of personal fouls, technicals / ejections not mentioned but I guess you've decided not to do anything there.

BPM is available for college basketball players at Basketball-Reference. Assume it is unaltered for the change of league? Not all the proportions of significance between metric components are likely to be the same. May not be that different but the 3 point factor is of greater importance there. Have you tried to encourage BRef to apply BPM to Euro basketball or WNBA?

College recruiting rank could be applied to college. I didn't expect you'd apply it, but was worth at least mentioning in a full sweep of possibles. Minutes can cover.
PF will be investigated again--I didn't find any significance at all last time. Technicals and Ejections are not readily available historically and mean different things for different leagues.

BPM has been implemented for a wide range of leagues, just not necessarily in public forums.

I want BPM to be as generic as possible to allow for this wide usefulness.

Crow
Posts: 6208
Joined: Thu Apr 14, 2011 11:10 pm

Re: Reconstructing Box Plus/Minus

Post by Crow » Fri Apr 19, 2019 6:31 pm

Ok, I understand some of the decisions better if BPM is getting actively but privately applied to other leagues.



Crow

Why not include FT rate in current or future BPM? If 3pt rate can give bonuses, I'd want FT rate to give bonuses and penalties. It affects "space" and overall team scoring rates.

DSMok1

I explored FT in the past and found no added information/significance. I will check them again. Maybe nonlinear will be more significant?

Maybe it comes down to how the credit is awarded. If 3pt spacers are getting a lot of credit, it may be because the inside scorers that helped create the space (on that play and in general) are not getting credit. Maybe look at the value of space by spacers on teams with elevated inside scoring (including but not limited to FT) vs. without that assistance? RAPM player factor, pairs and lineup analysis might help.

nbacouchside
Posts: 113
Joined: Sun Jul 14, 2013 4:58 am

Re: Reconstructing Box Plus/Minus

Post by nbacouchside » Sat Apr 20, 2019 3:57 am

DSMok1 wrote:
Fri Apr 19, 2019 4:47 pm

This is a very good idea. I believe a "shooting" measure will be very significant in improving the regression.
As a measure of shooting, I might suggest Ben Taylor's 3 point proficiency:
(2/(1+EXP(-3PA))-1)*3P%
It's part of his larger Box Score Creation metric (which might also be worth potentially using, although you may want to use the one without the 3pt component since the 3 pt version doesn't hold up as well as you go back in history). https://fansided.com/2017/08/11/nylon-c ... box-score/
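
The formula above translates directly into Python. This is just a transcription of the spreadsheet-style expression, with a hypothetical function name. The post does not state the units for 3PA (per game, per 100 possessions, etc.), so check the linked article before plugging in real numbers. Here 3P% is assumed to be a fraction such as 0.40.

```python
import math


def three_point_proficiency(three_pa, three_pct):
    """(2 / (1 + exp(-3PA)) - 1) * 3P%, as written in the post.

    The first factor is a squashed attempt rate: 0 at zero attempts and
    approaching 1 as attempts grow. It is the same curve as tanh(3PA / 2).
    """
    volume_factor = 2.0 / (1.0 + math.exp(-three_pa)) - 1.0
    return volume_factor * three_pct


# Illustrative inputs only, not real player data.
for attempts in (0.0, 1.0, 3.0, 8.0):
    print(attempts, round(three_point_proficiency(attempts, 0.40), 4))
```

The effect is that a good percentage on almost no attempts earns little credit, while at high volume the value levels off near the raw 3P%.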

DSMok1
Posts: 904
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Reconstructing Box Plus/Minus

Post by DSMok1 » Sat Apr 20, 2019 11:12 am

Yes, Ben's box creation formulas are intriguing. I will be evaluating what can be borrowed reasonably.

Thinking about shooting, we know that 2-point percentage has nothing to do with shooting skill. However, free throw shooting has everything to do with shooting skill. It would be interesting to contrast free throw shooting with 2-point field goal percentage as a means to estimate the spatial gravity a player has. Obviously three point shooting is the best tool there, but I have often wondered how to differentiate a pure post player from a player with an exceptional jumper like Tim Duncan or Kevin Garnett or Karl Malone.

That goes along with an approach I have been pondering, where the box score statistics are used to impute underlying skill sets, rather than directly imputing overall player quality. In other words, there are underlying skill sets reflected in the box score data that are more directly relevant to a player's impact on the game.

For instance, shooting, shot creation, rolling/lob threat/finishing, and rebounding are four skills on offense. We can more directly measure some of those skill sets with modern data, which we could then use as a basis to develop box score estimates of those items. Those skills likely predict more accurately a player's impact than just looking at their box score data.

Crow
Posts: 6208
Joined: Thu Apr 14, 2011 11:10 pm

Re: Reconstructing Box Plus/Minus

Post by Crow » Sat Apr 20, 2019 2:35 pm

If you are trying to identify skill levels, you might want to look closely at what is showing at bball-index.com and perhaps talk with them about methods, data. See how what they say compares to whatever else you look at & do.

colts18
Posts: 304
Joined: Fri Aug 31, 2012 1:52 am

Re: Reconstructing Box Plus/Minus

Post by colts18 » Sun Apr 21, 2019 11:20 pm

Dsmok,

One way you can improve the accuracy is to take missed games into account when you adjust a player's stats at the team level. That way a player whose team falls off when he is out gets more credit and does not get unfairly penalized for his team's bad results in games he missed.

Example:

Current BPM:

The Lakers are a 37 win, -1.33 SRS team

LeBron puts up a 27/9/8 statline that gives him a high unadjusted BPM

Then LeBron's and his teammates' BPMs get adjusted so that the Lakers' results line up with a 37-win, -1.33 SRS team. As a result, LeBron's BPM is 8.1.


New system:

Instead of adjusting LeBron's numbers to the Lakers' 82-game total, adjust his numbers to ONLY the 55 games he played.

The Lakers were 28-27 with LeBron, ~0 SRS

LeBron unadjusted BPM + Teammate 1 Unadjusted BPM (in those 55 games) + teammate 2 unadjusted BPM (in those 55 games) + etc. = ~0 SRS

Teammate 1 unadjusted BPM (27 games without LeBron) + teammate 2 unadjusted BPM (27 games without LeBron) + etc.= ~ -3 SRS in games LeBron missed.
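
A rough sketch of the split adjustment described above, in Python. This assumes a simplified team adjustment: every player gets the same additive shift so that the minutes-weighted team total matches the target rating. BPM's real team adjustment is more involved than this. The player names, raw values, and minutes are hypothetical, and only the ~0 and ~-3 targets come from the example.

```python
def weighted_sum(bpm, minutes):
    """Minutes-weighted team total, scaled so the weights sum to 5 players."""
    total = sum(minutes.values())
    return sum(5.0 * minutes[p] / total * bpm[p] for p in bpm)


def team_adjust(raw_bpm, minutes, target_rating):
    """Shift every player by the same constant so the team total hits the target."""
    shift = (target_rating - weighted_sum(raw_bpm, minutes)) / 5.0
    return {p: raw_bpm[p] + shift for p in raw_bpm}


# Hypothetical raw values and minutes, split by whether the star played.
with_star_raw = {"Star": 9.0, "A": 1.0, "B": -1.0, "C": -2.0, "D": -2.5}
with_star_min = {"Star": 1900, "A": 1700, "B": 1600, "C": 1500, "D": 1300}

without_star_raw = {"A": 0.5, "B": -1.5, "C": -2.0, "D": -3.0, "E": -3.5}
without_star_min = {"A": 900, "B": 850, "C": 800, "D": 750, "E": 700}

# Each sample is adjusted to the team's rating in those games only.
adj_with = team_adjust(with_star_raw, with_star_min, target_rating=0.0)
adj_without = team_adjust(without_star_raw, without_star_min, target_rating=-3.0)

print(adj_with)
print(adj_without)
```

The star's number comes only from the first sample, so the team's results in the games he missed never pull it down. How to blend each teammate's two adjusted values into one season number is left open here, since the post does not specify it.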

DSMok1
Posts: 904
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Reconstructing Box Plus/Minus

Post by DSMok1 » Mon Apr 22, 2019 11:11 am

That makes a lot of sense, Colts18. I would like to somehow do things that way, but it certainly adds a lot of computational complexity. For many leagues, that data will not be readily available.

I also don't think it can actually work if the metric is nonlinear. For a linear metric, it would be a very good choice. Effectively, you would have to do the analysis for each game individually, since the players available vary almost every game. This means there would be a ton of outlier box score statistics, so that could only be handled by a linear metric.

Note: developing a truly linear version of Box Plus/Minus would probably be a good idea. It could then be used on individual games like SPR or DRE (https://fansided.com/2015/02/23/introdu ... le-metric/ ). Using individual games and constructing an overall number as you have mentioned above would give a nice counterpoint to a nonlinear, season-long BPM construction.
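
One way to see why linearity matters for game-by-game use: a linear metric gives the same answer whether it is applied to each game and summed or applied once to the totals, and a nonlinear one does not. The weights and the square-root interaction term below are hypothetical, chosen only to show the effect, and are not BPM coefficients.

```python
import math


def linear_metric(box):
    """Hypothetical linear box-score value: a fixed weight per stat."""
    return 1.0 * box["pts"] + 0.5 * box["reb"] + 0.7 * box["ast"]


def nonlinear_metric(box):
    """Same weights plus a hypothetical interaction term."""
    return linear_metric(box) + 0.05 * math.sqrt(box["ast"] * box["reb"])


# Two made-up single-game lines with very different shapes.
games = [
    {"pts": 30, "reb": 4, "ast": 9},
    {"pts": 10, "reb": 16, "ast": 1},
]
totals = {k: sum(g[k] for g in games) for k in games[0]}

lin_by_game = sum(linear_metric(g) for g in games)
nonlin_by_game = sum(nonlinear_metric(g) for g in games)

print(lin_by_game, linear_metric(totals))
print(nonlin_by_game, nonlinear_metric(totals))
```

With the linear version the per-game values add up to the value computed from the totals, so odd single-game lines cause no distortion. With the interaction term the two answers diverge, and the gap grows as individual games look less like the season average.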
