permaximum wrote: Thanks for the answers and explanation. I knew most of the things you mention, but I guessed it's good to have them here for visitors. Since we're talking about the mainstream, what do you think about the use of BPM for the league awards? Will BPM give us more accurate ratings for MVP, MIP, ROY, All-NBA Team, etc. than EFFICIENCY, PER, WS, WP, WAR, RPM, or RAPM? Today, is it the best statistical argument for the MVP race at the end of the season?

I would say BPM and VORP are probably the best for that question, yes, but they are not conclusive alone. RPM/RAPM should also be considered, since they better capture defense.
The debut and popularization of BPM
Re: The popularization of BPM
DSMok1 wrote: 1. Best box score metric, has the advantages and disadvantages of box score metrics. BPM is far, far better than RAPM.

You know all this.. how? Did you ever do any out-of-sample or retrodiction testing with this metric? The only "test of validity" I've ever seen you report was a high in-sample R^2. In the last 10 years, statisticians have been literally getting stoned for using in-sample R^2 to describe the quality of their model.
Re: The popularization of BPM
J.E. wrote: You know all this.. how? Did you ever do any out-of-sample or retrodiction testing with this metric? The only "test of validity" I've ever seen you report was a high in-sample R^2.

I'm sorry if I wasn't clear. That's not the full quote. I was saying that if you have a few games' worth of data, BPM is better than RAPM. I don't think that's in question... the standard errors on even a 10-game sample of BPM are relatively small, whereas the standard errors on RAPM from 10 games' worth of data would be massive.
A rough standard error calculation for BPM (I'll refine it soon) is sqrt((28/sqrt(n))^2 + 1.4^2), where n is minutes played. If a player has played 200 minutes, the standard error would be about 2.4... it stabilizes quite quickly.
The 1.4 is the inherent standard error from the limitations of the box score.
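That formula, as a quick sketch in code (n in minutes; the 28 and 1.4 constants are taken straight from the post above):

```python
import math

def bpm_stderr(minutes):
    """Rough standard error of BPM after a given number of minutes:
    sqrt((28/sqrt(n))^2 + 1.4^2). The 1.4 term is the irreducible
    error from the limitations of the box score."""
    sampling_error = 28 / math.sqrt(minutes)
    box_score_floor = 1.4
    return math.sqrt(sampling_error ** 2 + box_score_floor ** 2)

print(round(bpm_stderr(200), 1))   # 200 minutes -> 2.4, as in the example
print(round(bpm_stderr(2500), 2))  # a heavy-minutes season approaches the 1.4 floor
```

Note that even at very large minute totals the error never drops below 1.4; that floor is the whole point of the second term.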
---
xRAPM/RPM is significantly better than BPM for estimating a current true talent level, and thus for future predictions. RAPM, due to its issues with stabilization, may or may not be better than BPM in predictive analysis; it's a question of whether the noise in RAPM or the missing pieces in the box score is worse.
---
As for the out-of-sample predictions: a couple of studies including ASPM were linked in the writeup, showing ASPM besting other box score metrics and sometimes RAPM. BPM is better than ASPM; hopefully Neil Paine can update his out-of-sample checks sometime soon. Preliminarily, it was a clear winner over the other box score stats, at least.
I'm aware of the limitations of in sample R^2.
Re: The popularization of BPM
Correlations with 2013-14 RPM and its derivative WAR (designated rWAR, while bWAR is the WAR derived from BPM). Columns are for all 437 NBA players appearing in both the RPM list and the BPM list, and for those with at least 600 minutes.
http://espn.go.com/nba/statistics/rpm/_/year/2014
http://www.basketball-reference.com/lea ... anced.html
Per minute (or possession) rates are on the left, season totals on the right.
bWAR2 is also derived from BPM, but with -3.035 as the 'replacement level'.
Some of the higher correlations likely are due to similar weights and/or priors among these summary stats.
Total and per-game minutes have weak correlations with RPM.
Since player minutes represent what coaches think about players -- what might they think about these stats?
(On the correlations with minutes shown below: it's a puzzlement why they are worse when the low-minute guys are removed.)
eWins doesn't reward mpg directly; it just boosts starters vs subs, and of course starters usually play more.
Note that dropping the replacement level seems to give more credibility to bWAR.
Code: Select all
corr all 600+ corr all 600+
RPM 1.00 1.00 rWAR 1.00 1.00
BPM .59 .71 bWAR2 .83 .81
WS/48 .54 .62 bWAR .82 .80
e484 .55 .54 WS .80 .78
PER .52 .52 eWins .76 .73
mpg .53 .46 MP .61 .57
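For anyone who wants to replicate this kind of table: each cell is just a Pearson correlation over the matched player lists, with the 600+ column repeating the calculation on a filtered sample. A self-contained sketch with invented numbers (not the real 2013-14 data):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical (minutes, RPM, BPM) rows -- invented, not the real 437-player data.
players = [(2800, 4.1, 3.5), (2400, 2.0, 2.6), (1900, 0.5, 1.1),
           (1200, -1.0, 0.2), (700, -2.2, -1.5), (300, -3.0, -0.4)]

rpm = [p[1] for p in players]
bpm = [p[2] for p in players]
print("all players:  r =", round(pearson(rpm, bpm), 2))

# The 600+ column simply repeats the calculation on the filtered sample.
kept = [p for p in players if p[0] >= 600]
print("600+ minutes: r =", round(pearson([p[1] for p in kept], [p[2] for p in kept]), 2))
```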
Correlations with minutes:
Code: Select all
corr all 600+ corr all 600+
MP 1.00 1.00 mpg 1.00 1.00
eWins .84 .79 e484 .67 .61
WS .82 .75 PER .62 .56
bWAR2 .76 .72 BPM .61 .53
bWAR .68 .66 RPM .53 .46
rWAR .61 .57 WS/48 .45 .35
Re: The popularization of BPM
schtevie wrote: ... the very-well-understood, good ol' zero? ... the majority of NBA players... whose "value" is necessarily going to be "negative"... the myriad benefits that attend the clear intuition and presentational clarity of 0?

What is this zero of which you speak?
Is it NBA average? Is that like saying a person of average intelligence could be said to have zero intellectual value?
What if you're at a MENSA gathering? Do you have negative intelligence?
In the NBA Finals, only players near all-star level are above avg.
So there isn't actually anything intuitive about an arbitrary assignment of zero. If you're at 100% of the average, your value is 1.00 of average. Average IQ is set at 100, and not at 0, for very sensible reasons.
"WAR, what is it good for?" Excellent! Bravo!

It's not good for much if it contradicts its own definitions (player salary, availability, etc.).
Re: The popularization of BPM
Look at the tremendous difference between WS and WS/48 in the second chart. Are coaches overplaying their best players? Is that the explanation for the somewhat worse correlation? Or am I missing something?
Re: The popularization of BPM
I've tried to learn to accept non-responses better, but I'll ask: is there no one else with a desire for more explanation or defense of Daniel's particular BPM team adjustment? The adjustment may be different in some motivation and details from the Wins Produced team adjustment, but on the surface it feels somewhat similar in the way it "fixes" the final results. Daniel, is there really nothing in my comments that you are inclined to respond to, after responding to several other issues from others? I'll accept a non-response this time if that is where it is left, but I ask because I am trying to understand more.
Re: The popularization of BPM
WS correlates well with MP, presumably because better players generally get more minutes.
It doesn't correlate as well on a per-minute basis (WS/48), I guess because it does poorly with low-minute players: guys with high TS% and low TO% rate highly even if they don't do much, and they don't get many minutes.
Re: The popularization of BPM
The WS vs. WS/48 gap in correlation with minutes is huge both for all players and for those over 600 minutes. Since the latter group still accounts for the large majority of minutes in the first group, it appears the gap must be dramatically smaller for the small-minute guys. That doesn't make surface sense to me and warrants further study. The intelligence of coaches' allocation of minutes should be an open field of research, not an assumed-appropriate choice in all its detail. Really the debate should be about some marginal minute, not the first or fifteenth; more about the 25th and beyond, probably.
With your starter-sub adjustments the gap between eWins and e484 is much smaller, but still there, and likewise in other similar cumulative vs. per-minute metric comparisons.
Re: The popularization of BPM
Sorry, Crow! That was accidental. Somehow when I went through all the posts and responded, I missed yours. I'll try to address your comments.
Crow wrote: BPM_Team_Adjustment is a bit challenging to accept and not misunderstand for me, so I had a few questions. They too are perhaps a bit challenging to understand, but I want to try. The BPM_Team_Adjustment makes the results of this metric a ranking rather than a precise individual impact estimate, because of the 120% inflation, if I am following?

Incorrect. The 120% is translating the team context to neutral, an average team; it increases the accuracy of BPM. The team adjustment was applied as part of the regression developing the coefficients, not after the fact.
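For intuition, here is a sketch of how a team adjustment of this general kind can work. This is only my reading of the idea (one constant shift per team so that minutes-weighted player values sum to 120% of the team's rating), not Daniel's actual implementation, and all numbers are invented:

```python
def team_adjust(raw_bpm, pct_min, team_rating):
    """Shift each player's raw estimate by one team-wide constant so the
    minutes-weighted sum equals 120% of the team's adjusted net rating.
    pct_min[i] is player i's fraction of team minutes; the fractions sum
    to 5.0 because five players are always on the floor."""
    weighted_sum = sum(b * m for b, m in zip(raw_bpm, pct_min))
    shift = (1.2 * team_rating - weighted_sum) / sum(pct_min)
    return [b + shift for b in raw_bpm]

# Invented six-man rotation on a +4.0 team.
raw = [3.0, 1.5, 0.0, -1.0, -2.0, -2.5]
mins = [0.9, 0.9, 0.85, 0.85, 0.8, 0.7]   # fractions of team minutes, sum = 5.0
adjusted = team_adjust(raw, mins, team_rating=4.0)
```

Because every player on the team gets the same shift, the ordering within the team is untouched; only the level moves, which is why it reads as a translation to the team's context rather than a re-ranking.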
Crow wrote: Would there be an acceptable way to adjust this adjustment so that it reflected the actual (or adjusted) team performance data for individual players when on the court vs. their teammates when on the court, instead of giving the same adjustment to all based on all minutes, including when not on the court? Would there be an acceptable and separate way to account for just the performance change seen when teams are leading or trailing (by some unspecified amount), so that it reflects the actual performance impact for that player instead of a team or league average change? The play-by-play data exists for recent years, and I assume J.E. essentially has a player-specific adjustment because it is based on the number of actual player minutes meeting his criteria (correct?)

Yes, that would be preferable, but BPM intentionally uses only the box score data. The MPG term will account for this effect somewhat, since the RAPM used as the regression basis did include the "effect of being ahead" adjustment.
Crow wrote: Even if one didn't redo the team adjustment, is there an understanding of how much of it is related to the blowout performance time issue versus other things about the team? If one looked at BPM without the team adjustment and without blocks, would that be essentially equivalent, in terms of what is covered/included, to RPM (or RAPM) minus defensive adjusted points per shot (more precise if it were for one year)? What is the R^2 for defensive BPM and RPM (or RAPM) minus defensive adjusted points per shot? Is it more impressive and, if so, shouldn't that be trumpeted to counter those who complain the R^2 is too low to give it much weight?

I'm not sure how similar that would be. I don't have data for the 14-year interval except the overall, offensive, and defensive estimates.
Crow wrote: I intend to compare defensive BPM to defensive RPM (or RAPM) minus defensive adjusted points per shot. I wonder how close they are. If they are close on average, I might start looking at defensive BPM + defensive adjusted points per shot. Is it correct to think that the "error" in RPM is present in every component of BPM, including the team adjustment? Is there any basis to suspect there is more error in the team adjustment quantity? Is there any basis to suggest that, rather than removing collinearity issues, they have just been shifted into the team adjustment? I am just asking, not actively presuming.

BPM should actually have less "error from RAPM" than defensive RAPM: since I'm regressing onto many, many players, the error in RAPM is evened out. That is, however, offset by the limitations of the box score. The team adjustment is an integral part of the regression, not added post hoc; it can't really be separated out.
Crow wrote: If someone (not necessarily you, Daniel, given your stated positions) wanted to separately try to model the missing defensive component not in current BPM (I'll call it, broadly, shot defense), what should they try? Minutes again, team opponent points per shot, counterpart points per shot, height, years of experience; what else? Could some of these terms be significant for this portion of the project when they weren't for the original BPM effort? What significance does sqrt(AST%*TRB%) have for this portion of the project? Is there any basis for assuming that BPM, or BPM enhanced with shot defense (via defensive adjusted points per shot, a regression-based model, or a combination), has less "error" than RPM/RAPM?

Sounds tough to do, particularly given the lack of data and the questionable utility of counterpart data. I think you'd end up just arriving back at the opposing team's overall offensive rating... Anyone interested in running a DBPM that is an exact mirror of OBPM?
Crow wrote: Overall for BPM, what is most different at the stat coefficient level compared to the last version of ASPM, or to the last public versions of Neil Paine's SPM (or to offensive/defensive win shares)? For comparison with Neil's http://www.basketball-reference.com/blog/?page_id=4122 (is this the most recent public version?): he has TSA/36 separate from assists, whereas you have usage, and PFs are included in the model here. (Age and height are two differences from a previous version, http://www.basketball-reference.com/blog/?p=2191; his Versatility Index was the cube root of Pts/40*Ast/40*Reb/40, where you use an ast/reb interaction term.) If one laid out BPM, shot-defense-enhanced BPM, and xRAPM, would there be any appropriate use for machine learning to find an optimal blend of these metrics for retrodiction, prediction, or both? Or to find a new metric that is essentially based on this optimal blend? Is there anything in this article http://www.basketball-reference.com/blog/?p=8339 that contributes to the discussion? In the regression, what exactly is Shot%? It didn't make it into the final BPM in any fashion; not significant, or...?

My old ASPM equations are still available at http://godismyjudgeok.com/DStats/aspm-and-vorp/. APMVAL is a crude and simplified form of Neil's SPM. Shot% is my shorthand term for USG%*(1-TO%), usage with turnovers taken out.
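In code (assuming, on my part, that usage and turnover rate are expressed as fractions):

```python
def shot_pct(usg, to_pct):
    """Shot% = USG% * (1 - TO%): the share of possessions a player uses
    that do not end in a turnover."""
    return usg * (1 - to_pct)

# A 25% usage player who turns it over on 12% of his possessions:
print(shot_pct(0.25, 0.12))  # 0.22
```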
Crow wrote: Is there anything from old or new IPV that is different, discussable, and potentially helpful for BPM or anything beyond it? Is there a different meaning or reason to compare BPM to play-by-play, game-level, or season-level actual scoreboard data instead of to RPM? (Isn't this one of Berri's main/old complaints? Is there any reason to address that further here and now?) I know this jumps around and is probably behind the knowledge curve in some areas. Any clarifications or additional thinking about the topics will be appreciated.

14-year RAPM is the gold standard for measuring individual player quality, and it is based on the play-by-play and game-level data. That said, looking at the out-of-sample predictive ability of stats would be informative.
Thanks for your input, Crow!
Re: The popularization of BPM
Thank you. I should always leave room for the accidental or non-intentional.
Re: The popularization of BPM
I plan to do an even more thorough examination of this when I have time (which seems like never), but here is the evidence that Statistical Plus/Minus metrics (whether ASPM/BPM or SPM) are the best of the boxscore metrics...
Let me preface by saying the real test for any boxscore stat should be how it predicts team performance out of sample. It's long been known that any boxscore stat can boast a high correlation with team W% in-sample just by employing a team adjustment (like BPM does) or otherwise setting things up so that points scored/allowed and possessions employed/acquired add up at the individual level to team totals. What matters is how well a metric predicts the performance of a future team, given who its players are and how well those players have performed in the metric in the past.
I looked at metrics from that perspective here, and found that over the 2001-2012 period, ASPM did better than any other boxscore metric at predicting out-of-sample team performance, especially the further out of sample you go (using data from 2 and 3 years prior to predict Year Y). Over the summer, I also ran the same test using data from 1978-2014 for my SPM metric, Daniel's old ASPM (a version behind the current BPM), PER, WS/48 and a plus/minus estimate constructed from Basketball on Paper's ORtg/%Poss/DRtg Skill Curves.
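The out-of-sample test described above boils down to something like this sketch: project each team's point differential from its players' prior-year metric values weighted by current-year minutes, then correlate the projections with actual results. All numbers here are invented and the helper names are mine:

```python
from math import sqrt

def project_team(prior_metric, minute_share):
    """Project a team's net rating from its players' prior-year metric
    values, weighted by this year's share of minutes (shares sum to 5)."""
    return sum(m * s for m, s in zip(prior_metric, minute_share))

def pearson(xs, ys):
    """Pearson correlation between the projections and actual results."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

# Invented league: per team, players' Year-1 metric values, current minute
# shares, and the team's actual current-year net rating.
teams = [
    ([4.0, 1.0, 0.0, -1.0], [2.0, 1.5, 1.0, 0.5], 5.5),
    ([2.0, 0.5, -0.5, -2.0], [1.8, 1.4, 1.0, 0.8], 0.8),
    ([-1.0, -1.5, 0.5, 1.0], [1.5, 1.5, 1.0, 1.0], -2.0),
]
projected = [project_team(m, s) for m, s, _ in teams]
actual = [a for _, _, a in teams]
print("out-of-sample r = %.2f" % pearson(projected, actual))
```

The real studies do exactly this at league scale, season by season, and the reported numbers are these correlations.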
(Just to expound on the BoP metric, I set up a fake 5-man unit w/ the player and 4 avg teammates. The teammates' ORtg changed based on the player's %Poss, like this. So If I use 25% of poss, my avg teammate uses 18.8% on avg. If tradeoff is 1.2, then he gains 1.2*(20-18.8) of ORtg.)
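Restating that teammate arithmetic as code (the function name is mine; 20% is an average player's share of possessions):

```python
def teammate_ortg_change(player_poss_pct, tradeoff=1.2):
    """Four average teammates split the possessions the player doesn't use.
    Each teammate's ORtg shifts by tradeoff * (20 - his new %Poss), so a
    high-usage player boosts his teammates' efficiency under this model."""
    teammate_poss = (100 - player_poss_pct) / 4   # 25% usage -> 18.75 each
    return tradeoff * (20 - teammate_poss)

print(teammate_ortg_change(25))  # 1.2 * (20 - 18.75) = 1.5 points of ORtg
```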
Btw, I came to the 1.2 tradeoff from running the same test over 1977-2014 (I didn't include ASPM in this test because it didn't extend back to 1977). Each BoP number represents the usage-efficiency tradeoff for that version of the metric:
In any event, no matter how many times I look at which metric does the best job of predicting future team performance, the Statistical Plus/Minus metrics are always in a class by themselves, particularly as you use data from further out of the sample being predicted. I still need to plug Daniel's new BPM into this framework, but I would be surprised if it didn't perform much better than PER and Win Shares/48 in a similar test.
Code: Select all
+-----------+--------+--------+--------+--------+
| Metric | Year-1 | Year-2 | Year-3 | Year-4 |
+-----------+--------+--------+--------+--------+
| SPM-1 | .776 | .662 | .593 | .532 |
| ASPM-1 | .763 | .647 | .577 | .511 |
| PER-1 | .663 | .598 | .538 | .485 |
| bop_1.2-1 | .743 | .614 | .528 | .465 |
| WS48-1 | .734 | .598 | .515 | .462 |
+-----------+--------+--------+--------+--------+
Code: Select all
+-----------+--------+--------+--------+--------+
| metric | Year-1 | Year-2 | Year-3 | Year-4 |
+-----------+--------+--------+--------+--------+
| SPM-1 | .775 | .658 | .590 | .534 |
| PER-1 | .660 | .594 | .532 | .486 |
| bop_1.2-1 | .741 | .609 | .521 | .465 |
| bop_1.3-1 | .741 | .609 | .521 | .464 |
| bop_1.1-1 | .741 | .608 | .520 | .465 |
| bop_1.4-1 | .740 | .609 | .521 | .463 |
| bop_1.0-1 | .741 | .607 | .519 | .465 |
| bop_0.9-1 | .741 | .606 | .518 | .465 |
| bop_0.8-1 | .740 | .604 | .516 | .464 |
| WS48-1 | .733 | .593 | .507 | .463 |
+-----------+--------+--------+--------+--------+
Re: The popularization of BPM
So Neil, for the BoP-based metric, the specific usage-efficiency tradeoff is essentially non-impactful?
Re: The popularization of BPM
In Mike's second chart the correlation between minutes and WS/48 is .35. I can't imagine it going much lower even if random fans in the stands were making the minute choices.
I guess the way defensive rating works probably hurts the correlation and obscures player quality differences, since everyone on a team gets the same shot-defense rating. (I guess BPM overcomes a good share of this indirectly, on average, rather than through individual play-by-play accounting.)
Re: The popularization of BPM
Crow wrote: So Neil, for the bop based metric, the specific usage-efficiency tradeoff is essentially non-impactful?

It appears so. I was a little surprised by this; I was also surprised it didn't do better than PER in the far-out-of-sample tests. PER holds up strangely well, which I've always attributed to an effect related to this:
http://www.basketball-reference.com/blog/?p=9776
Usage seems to be so much more "portable" than other traits -- and PER is so tied up in Usage -- that it's going to be more applicable out of sample. That's just my guess, though.