Re: The debut and popularization of BPM
Posted: Mon Feb 09, 2015 8:47 pm
Box Plus/Minus for the playoffs and the NCAAs has been launched.
Here are the updates to the About BPM page:
Updates
Box Plus/Minus Version 1.1: January 7, 2015:
This was a bug-fix update addressing a couple of issues. The most significant change corrects the weighting scheme used in the regression. The original BPM regression weighted each observation by sqrt(Poss), which was incorrect. This version weights by the number of possessions, with an additional correction to account for the effect of the prior on the RAPM values.
This fix slightly increases the spread in BPM values: the best players get a bump of roughly +0.5, while below-average players see a marginal reduction in their BPM values.
This fix also adjusted the coefficient values slightly, so that efficiency and blocks are valued somewhat more highly and turnovers are penalized somewhat more. Centers and other players with high shooting efficiency and blocks (such as Anthony Davis) were helped the most by those tweaks, and players with high usage and relatively low efficiency (like Russell Westbrook) were hurt. The overall BPM gap between those helped the most and those hurt the most was about 1.0. For example, 2014 Russell Westbrook saw his BPM drop by 0.2, while 2014 Anthony Davis saw his BPM increase by 0.8.
The other bug fixes addressed minor issues in the regression logic that derives the Offense/Defense split coefficients. Two small bugs in the code did not affect the overall BPM values, but fixing them tweaks the OBPM/DBPM split slightly. The effects of these corrections are small.
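To illustrate why the choice of regression weights matters, here is a minimal sketch with made-up data. It is not the actual BPM regression: the function name, the single predictor, and the sample values are all assumptions for illustration, and the additional correction for the RAPM prior is not modeled. It only shows that weighting by possessions lets high-possession players dominate the fit more than weighting by sqrt(possessions) does.

```python
import math

def weighted_line_fit(xs, ys, weights):
    """Fit y = intercept + slope * x by weighted least squares (closed form)."""
    total = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

# Hypothetical data: (possessions, box-score stat, RAPM-like target).
# The three high-possession rows lie exactly on y = x; the two
# low-possession rows are noisy.
samples = [
    (400, 1.0, 3.0),
    (900, 2.0, 1.0),
    (10000, 1.0, 1.0),
    (12100, 2.0, 2.0),
    (14400, 3.0, 3.0),
]
poss = [p for p, _, _ in samples]
xs = [x for _, x, _ in samples]
ys = [y for _, _, y in samples]

# Original scheme: weight each row by sqrt(possessions).
_, slope_sqrt = weighted_line_fit(xs, ys, [math.sqrt(p) for p in poss])
# Version 1.1 scheme: weight each row by possessions.
_, slope_poss = weighted_line_fit(xs, ys, poss)

print(f"sqrt(Poss) weights: slope = {slope_sqrt:.3f}")
print(f"Poss weights:       slope = {slope_poss:.3f}")
```

In this toy data the possession-weighted slope lands closer to the 1.0 implied by the high-possession rows, because the noisy low-possession rows carry less relative weight.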
Comparison of the coefficients:
BPM:
╔═════════╦══════════════════════╦════════════════╦═════════════════╦═════════════╦════════════╗
║ Coeff. ║ Term ║ BPM 1.1 Value ║ Original Value ║ Difference ║ Percentage ║
╠═════════╬══════════════════════╬════════════════╬═════════════════╬═════════════╬════════════╣
║ a ║ Regr. MPG ║ 0.123391 ║ 0.120051 ║ 0.0033 ║ 2.8% ║
║ b ║ ORB% ║ 0.119597 ║ 0.137600 ║ -0.0180 ║ -13.1% ║
║ c ║ DRB% ║ -0.151287 ║ -0.151938 ║ 0.0007 ║ -0.4% ║
║ d ║ STL% ║ 1.255644 ║ 1.144182 ║ 0.1115 ║ 9.7% ║
║ e ║ BLK% ║ 0.531838 ║ 0.449468 ║ 0.0824 ║ 18.3% ║
║ f ║ AST% ║ -0.305868 ║ -0.310548 ║ 0.0047 ║ -1.5% ║
║ g ║ TO%*USG% ║ 0.921292 ║ 0.723784 ║ 0.1975 ║ 27.3% ║
║ h ║ Scoring ║ 0.711217 ║ 0.610605 ║ 0.1006 ║ 16.5% ║
║ i ║ AST Interaction ║ 0.017022 ║ 0.019936 ║ -0.0029 ║ -14.6% ║
║ j ║ 3PAr Interaction ║ 0.297639 ║ 0.380536 ║ -0.0829 ║ -21.8% ║
║ k ║ Threshold Scoring ║ 0.213485 ║ 0.269667 ║ -0.0562 ║ ║
║ l ║ sqrt(AST%*TRB%) ║ 0.725930 ║ 0.691501 ║ 0.0344 ║ 5.0% ║
╚═════════╩══════════════════════╩════════════════╩═════════════════╩═════════════╩════════════╝
OBPM/DBPM Split:
╔═════════╦══════════════════════╦════════════════════╦═════════════════╦═════════════╦════════════╗
║ Coeff. ║ Term ║ O/D BPM 1.1 Value ║ Original Value ║ Difference ║ Percentage ║
╠═════════╬══════════════════════╬════════════════════╬═════════════════╬═════════════╬════════════╣
║ a ║ Regr. MPG ║ 0.064448 ║ 0.059270 ║ 0.005178 ║ 8.7% ║
║ b ║ ORB% ║ 0.211125 ║ 0.197487 ║ 0.013638 ║ 6.9% ║
║ c ║ DRB% ║ -0.107545 ║ -0.102144 ║ -0.005401 ║ 5.3% ║
║ d ║ STL% ║ 0.346513 ║ 0.322082 ║ 0.024431 ║ 7.6% ║
║ e ║ BLK% ║ -0.052476 ║ -0.062684 ║ 0.010208 ║ -16.3% ║
║ f ║ AST% ║ -0.041787 ║ -0.088460 ║ 0.046673 ║ -52.8% ║
║ g ║ TO%*USG% ║ 0.932965 ║ 0.798831 ║ 0.134134 ║ 16.8% ║
║ h ║ Scoring ║ 0.687359 ║ 0.606303 ║ 0.081056 ║ 13.4% ║
║ i ║ AST Interaction ║ 0.007952 ║ 0.011822 ║ -0.003870 ║ -32.7% ║
║ j ║ 3PAr Interaction ║ 0.374706 ║ 0.430225 ║ -0.055519 ║ -12.9% ║
║ k ║ Threshold Scoring ║ -0.181891 ║ -0.126574 ║ -0.055317 ║ ║
║ l ║ sqrt(AST%*TRB%) ║ 0.239862 ║ 0.262148 ║ -0.022286 ║ -8.5% ║
╚═════════╩══════════════════════╩════════════════════╩═════════════════╩═════════════╩════════════╝
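The Difference and Percentage columns appear to be computed as (new - original) and (new - original) / original. That formula is inferred from the table values rather than stated in the post, so treat this as a sketch; the function name is hypothetical. It also explains why a negative coefficient can show a positive difference next to a negative percentage.

```python
def coefficient_change(new_value, original_value):
    """Return (difference, percentage change) relative to the original value."""
    difference = new_value - original_value
    percentage = 100.0 * difference / original_value
    return difference, percentage

# DRB% row of the BPM table: the coefficient moved toward zero, so the
# difference is positive while the percentage (relative to a negative
# original) is negative.
drb_diff, drb_pct = coefficient_change(-0.151287, -0.151938)

# BLK% row of the BPM table.
blk_diff, blk_pct = coefficient_change(0.531838, 0.449468)

print(f"DRB%: {drb_diff:+.4f} ({drb_pct:+.1f}%)")
print(f"BLK%: {blk_diff:+.4f} ({blk_pct:+.1f}%)")
```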
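This update also includes a revision to how VORP is handled for partial seasons. Previously, partial seasons showed the player's production extrapolated to a full 82-game season, so VORP behaved more like a rate stat. VORP now acts as a counting stat over the season, with each 1 point equal to 1 point of season-end team point differential (per 82 games).
This was also critical for handling playoff VORP, which has been put on the same scale: [BPM - (-2.0)] * (% of minutes played) * (team games / 82).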
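As a quick illustration of the VORP formula quoted above, here is a small sketch. The function and parameter names are hypothetical, and it assumes "% of minutes played" is passed as a fraction between 0 and 1.

```python
REPLACEMENT_LEVEL_BPM = -2.0   # replacement level from the formula above
FULL_SEASON_GAMES = 82

def vorp(bpm, minutes_share, team_games):
    """VORP = [BPM - (-2.0)] * (% of minutes played) * (team games / 82)."""
    return (bpm - REPLACEMENT_LEVEL_BPM) * minutes_share * team_games / FULL_SEASON_GAMES

# Same made-up player (+6.0 BPM, 80% of minutes) over different spans.
full_season = vorp(6.0, 0.8, 82)
half_season = vorp(6.0, 0.8, 41)
playoff_run = vorp(6.0, 0.8, 20)

print(full_season, half_season, playoff_run)
```

Because of the (team games / 82) factor, the half season is worth exactly half the full-season value rather than being extrapolated, which is the counting-stat behavior described above.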
Playoff Box Plus/Minus and VORP
Box Plus/Minus for the playoffs is calculated the same way as BPM for the regular season, with a few additions to derive an appropriate team efficiency differential.
The playoff team efficiency, which is used in the team adjustment portion of the BPM calculation, is derived as follows:
1. Use the playoff minutes distribution for each team along with each player's regular season BPM value to generate a "playoff team strength." In general, playoff rotations are shortened, so teams are stronger than in the regular season. If a player had fewer than 200 minutes of regular season play on the given team, they were assumed to be replacement level for this calculation.
2. Count games played against each opponent in the playoffs.
3. Strength of Schedule is the average opponent team rating for the duration of the playoffs.
4. Add actual playoff efficiency differential to the calculated strength of schedule to get the adjusted team efficiency differential used in the BPM calculation.
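The first step above could be sketched roughly as follows. The post does not give the exact formula, so this is one plausible reading and not necessarily the method used: it assumes team strength is the playoff-minutes-weighted sum of regular season BPM scaled by five players on the floor, and it uses -2.0 as replacement level (the value in the VORP formula). All names and the sample roster are hypothetical.

```python
REPLACEMENT_LEVEL_BPM = -2.0        # assumed: replacement level from the VORP formula
MIN_REGULAR_SEASON_MINUTES = 200    # threshold stated in the text
PLAYERS_ON_COURT = 5                # assumption: scale minute shares to five players

def playoff_team_strength(players):
    """players: list of (playoff_minutes, regular_season_minutes, regular_season_bpm)."""
    total_playoff_minutes = sum(playoff_min for playoff_min, _, _ in players)
    strength = 0.0
    for playoff_min, regular_min, bpm in players:
        # Players with too small a regular season sample count as replacement level.
        if regular_min < MIN_REGULAR_SEASON_MINUTES:
            bpm = REPLACEMENT_LEVEL_BPM
        share = playoff_min / total_playoff_minutes
        strength += PLAYERS_ON_COURT * share * bpm
    return strength

# Made-up five-man rotation with equal playoff minutes.
rotation = [
    (400, 2500, 5.0),
    (400, 2400, 2.0),
    (400, 2000, 1.0),
    (400, 1800, 0.0),
    (400, 150, 10.0),   # under 200 regular season minutes, so treated as -2.0
]
strength = playoff_team_strength(rotation)
print(strength)
```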
As an example, San Antonio won the title in 2013-14. They played 7 games against Dallas (derived strength +4.1), 5 games against Portland (+6.5), 6 games against Oklahoma City (+10.5), and 5 games against Miami (+6.9), so their games-weighted average strength of schedule was +6.9. Their raw efficiency differential was +10.0 in the playoffs, so their overall adjusted efficiency differential was a spectacular +16.9. That is on the same scale as regular season efficiency differential; the Spurs were really dominant.
(This is reminiscent of Hollinger's playoff ratings. The 2001 Lakers had a playoff adjusted efficiency differential of +20.4 by this method.)
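The Spurs example can be reproduced with a few lines. The function name is hypothetical; the games-weighted average is an interpretation of "count games played against each opponent", and it matches the quoted +6.9 (a plain average of the four opponents would give +7.0).

```python
def adjusted_playoff_differential(series, raw_differential):
    """series: list of (games_played, opponent_strength) pairs."""
    total_games = sum(games for games, _ in series)
    # Strength of schedule: opponent ratings averaged over games played.
    sos = sum(games * strength for games, strength in series) / total_games
    return sos, raw_differential + sos

# 2014 Spurs: Dallas, Portland, Oklahoma City, Miami (values from the text).
spurs_2014 = [(7, 4.1), (5, 6.5), (6, 10.5), (5, 6.9)]
sos, adjusted = adjusted_playoff_differential(spurs_2014, 10.0)

print(round(sos, 1), round(adjusted, 1))  # 6.9 16.9
```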
The top 10 playoff BPMs, minimum 500 minutes played in the playoffs:
╔═════╦═══════╦══════╦══════════════════════╦══════╗
║ Rk ║ Year ║ Tm ║ Player ║ BPM ║
╠═════╬═══════╬══════╬══════════════════════╬══════╣
║ 1 ║ 2009 ║ CLE ║ LeBron James ║ 18.2 ║
║ 2 ║ 1977 ║ LAL ║ Kareem Abdul-Jabbar ║ 14.8 ║
║ 3 ║ 1990 ║ CHI ║ Michael Jordan ║ 14.3 ║
║ 4 ║ 1991 ║ CHI ║ Michael Jordan ║ 13.8 ║
║ 5 ║ 1989 ║ CHI ║ Michael Jordan ║ 12.8 ║
║ 6 ║ 1976 ║ NYA ║ Julius Erving ║ 12.5 ║
║ 7 ║ 2008 ║ NOH ║ Chris Paul ║ 12.2 ║
║ 8 ║ 1991 ║ PHI ║ Charles Barkley ║ 11.8 ║
║ 9 ║ 1975 ║ INA ║ George McGinnis ║ 11.6 ║
║ 10 ║ 2003 ║ SAS ║ Tim Duncan ║ 11.6 ║
╚═════╩═══════╩══════╩══════════════════════╩══════╝
LeBron was transcendent in 2009.
Playoff VORP is calculated the same way as regular season VORP, but based on the playoff BPM values calculated above. Be aware: players on teams that played more games will have higher VORP values. A team that swept several rounds may play several fewer games than a team that was taken to seven games a couple of times.
College Basketball Box Plus/Minus
Box Plus/Minus for college basketball is calculated using the same coefficients derived for the NBA. While it may be argued that college basketball is somewhat different, there is no easy way to derive BPM coefficients specifically for college basketball, so the NBA coefficients will have to suffice. A rebound is still a rebound... The most questionable coefficient is probably the MPG coefficient, because it is unclear whether minutes distribution at the college level is based on the same criteria as at the pro level, and the length and pace of games are different. Until further information becomes available, all coefficients have been used as-is.
VORP, on the other hand, does not make sense for college basketball. VORP is derived based on salaries in a consistent market, and is primarily useful for evaluating salaries. In college, every school and conference is in a widely different situation, and since there are no salaries, there is neither a rational method nor a strong need for deriving or using VORP.
Therefore, BPM will be shown alone for college basketball.
Data for college basketball is currently available only back to the 2011 season. Here are the top 10 seasons in the database, minimum 500 minutes played.
╔═════╦═══════╦═════════════════╦══════════════════════╦═══════╦═══════╦══════╗
║ Rk ║ Year ║ Team ║ Player ║ BPM ║ OBPM ║ DBPM ║
╠═════╬═══════╬═════════════════╬══════════════════════╬═══════╬═══════╬══════╣
║ 1 ║ 2012 ║ Kentucky ║ Anthony Davis ║ 18.7 ║ 7.8 ║ 10.9 ║
║ 2 ║ 2013 ║ Indiana ║ Victor Oladipo ║ 17.0 ║ 9.7 ║ 7.3 ║
║ 3 ║ 2013 ║ Louisville ║ Gorgui Dieng ║ 15.0 ║ 4.0 ║ 11.0 ║
║ 4 ║ 2014 ║ Kansas ║ Joel Embiid ║ 14.9 ║ 4.7 ║ 10.2 ║
║ 5 ║ 2014 ║ Kentucky ║ Willie Cauley-Stein ║ 14.9 ║ 4.1 ║ 10.8 ║
║ 6 ║ 2012 ║ Marquette ║ Jae Crowder ║ 14.7 ║ 8.8 ║ 5.9 ║
║ 7 ║ 2013 ║ Kentucky ║ Nerlens Noel ║ 14.6 ║ 2.9 ║ 11.7 ║
║ 8 ║ 2012 ║ Kansas ║ Jeff Withey ║ 13.8 ║ 2.7 ║ 11.1 ║
║ 9 ║ 2011 ║ Michigan State ║ Draymond Green ║ 13.6 ║ 6.3 ║ 7.3 ║
║ 10 ║ 2013 ║ Gonzaga ║ Mike Hart ║ 13.5 ║ 7.9 ║ 5.6 ║
╚═════╩═══════╩═════════════════╩══════════════════════╩═══════╩═══════╩══════╝
A few notes on BPM for college basketball:
Big men tend to rank more highly than guards; it appears that a big man can dominate on defense more in college than in the NBA. In college, there are some ridiculous block rates for elite centers. Are they overrated, or does this reflect reality? There is no easy way to know for sure.
Beware of partial-season results. Schedules are imbalanced, with many easy games early in the season for top teams. A player who compiles great stats early in the season (along with his whole team) but then gets hurt and misses conference play, where his teammates' stats drop, will often show an inflated BPM because his numbers look so much better than his teammates'.
Beware of crazy outliers. Mike Hart, above, is one. A "quintessential glue guy", he hardly ever shot the ball, but made the few shots he did take. He never, ever turned the ball over, either, but rebounded, passed, and got a lot of steals. His numbers stretch the interaction terms in BPM past their breaking point, particularly on offense. He had just enough minutes to qualify for the list above.
---
This bugfix addresses the issue found earlier in this thread, where I was weighting the regression by sqrt(possessions) rather than simply by possessions.