The purpose of these rankings is to measure efficiency per possession, but that's not all that matters in the NBA.
What else do you think matters and where does efficiency rank among those factors that you consider important? Not trying to start a flame war, just curious.
Here's an example:
Greg Oden's PER, RAPM, and SPM don't matter to Portland Trail Blazers fans. They want to see the guy on the court. Some APBRmetricians completely forget about volume stats, and the common fan forgets about efficiency. Both are important.
I've watched a lot of Lakers games, and Andrew Bynum is a great example. He's been injured every season since 2007: he gets to the 1500-2000 minute mark and that's it for him.
I don't see how you can reasonably have him #19 on this list already.
bbstats wrote:I'm pretty sure that nobody who reads rAPM ratings thinks to themselves, "these are dumb because Greg Oden is injured."
Kind of a non sequitur.
What non sequitur? I'm pretty sure the entire NBA can be ranked inappropriately given the wrong statistical foundation. Greg is merely an example, not the alpha and omega.
Everyone should always consider the flaws and shortcomings of their data.
Thanks, but there's nothing brilliant about the charts. It's just 15 lines of very straightforward Python code; matplotlib is great for those things.
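For anyone curious, here's roughly what such a chart script looks like. This is only a sketch: the file name, column layout, and axis labels are made up for illustration, not what I actually use.

```python
# Minimal sketch: plot each player's rating over the course of a season.
# "daily_rapm.csv" and its columns (date, player, rating) are placeholders.
import csv
from collections import defaultdict
import matplotlib.pyplot as plt

series = defaultdict(list)               # player -> list of ratings over time
with open("daily_rapm.csv") as f:
    for row in csv.DictReader(f):
        series[row["player"]].append(float(row["rating"]))

for player, ratings in series.items():
    plt.plot(range(len(ratings)), ratings, label=player)

plt.axhline(0, color="gray", linewidth=0.5)   # league-average reference line
plt.xlabel("games into season")
plt.ylabel("RAPM")
plt.legend(fontsize="small")
plt.show()
```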
In an act of shortsightedness I only used '11 in-season data to check whether we need to multiply the prior by a factor. In this case it wasn't helpful. For "forecasting" entire older seasons, it usually is helpful, though. In general, for forecasting entire seasons you want to either use higher lambdas or multiply the ratings by a factor (<1) before forecasting.
0.8 is a good factor to use, but there is one year where a factor as low as 0.5 did best: you would have had better forecasts if you cut everyone's rating in half!
The ratings will be updated to reflect those changes sometime this week. Because I now a) use lower (absolute) priors and b) multiply the ratings by 0.8 before displaying them (because that gives better forecasts), player ratings will be a lot closer to zero.
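For what it's worth, the factor check is conceptually just a small grid search. A rough sketch (the "target" array here is a stand-in for whatever out-of-sample data you forecast against; it is not the exact procedure I ran):

```python
# Scale last season's ratings by each candidate factor and keep whichever
# best matches the out-of-sample target (lowest RMSE).
import numpy as np

def best_forecast_factor(ratings, target, candidates=np.arange(0.4, 1.05, 0.05)):
    ratings, target = np.asarray(ratings), np.asarray(target)
    rmses = [np.sqrt(np.mean((f * ratings - target) ** 2)) for f in candidates]
    return float(candidates[int(np.argmin(rmses))])
```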
I will probably try a new way to penalize one of these days: four different lambdas, one for moving away from zero and one for moving away from the prior, each for offense and defense. The idea behind it is that we're OK with someone's rating rising from -3 to 0, but not so much with it rising from +3 to +6.
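In rough code terms, the objective I have in mind would look something like the sketch below. The function and lambda names are placeholders, and only the objective is shown; an actual fit would minimize it over the ratings.

```python
# Sketch of the planned penalty: one lambda for distance from zero and one
# for distance from the prior, separately for offense and defense.
import numpy as np

def penalized_loss(x_off, x_def, prior_off, prior_def, residuals,
                   lam_zero_off, lam_prior_off, lam_zero_def, lam_prior_def):
    # residuals = observed minus predicted efficiency, computed from the
    # matchup data elsewhere
    fit = np.sum(residuals ** 2)
    penalty = (lam_zero_off  * np.sum(x_off ** 2)
             + lam_prior_off * np.sum((x_off - prior_off) ** 2)
             + lam_zero_def  * np.sum(x_def ** 2)
             + lam_prior_def * np.sum((x_def - prior_def) ** 2))
    return fit + penalty
```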
As always (haha, I wish), I'll make sure the new version is better at predicting out-of-sample data before replacing the old one with it.
One of these days I'll also run through the PBPs to get some priors out of basic in-game individual statistics. I plan to handle a couple of things differently from the box score, though: splitting rebounds into regular and 'free throw' rebounds, turnovers into 'live ball' and 'dead ball', etc., because I figure it will prove useful.
If someone has any ideas/requests on the new penalizing method, let me know
-Bonner pretty high in both versions
-Nowitzki at #3 even in uninformed, #1 in the other with a relatively big lead
-Garnett still looking good
-Conley top 20 in both
-Taj Gibson at ~#15 and #1
-Steph Curry top 15 in both
-Blake Griffin top 15 in both
-Amare with -2.7/-2.9 overall. He might be the worst starter for any playoff team
Other than that no real surprises in multiyear RAPM
Uninformed likes Harden (very much on offense, not so much on defense), Gallinari, Beno Udrih, John Lucas(!)
Rubio looks like the best rookie, Irving looks good on offense, bad on defense
A couple of days ago I was wondering if there was a way to find out how good we are, with available methods like RAPM and APM, at measuring true player skill.
What keeps us from accurately determining player skill is
a) chance. Obviously there is chance involved in a basketball game, or the better team would always win. Sometimes a good player has a bad day, etc.
Due to chance, a player's true skill will most likely never be known (unless you clone the players and let them play tens of thousands of games)
b) imperfect methods to estimate player skill
I'm not sure if my thought process was correct or if some of my assumptions were wrong, but I thought I would throw it out there for discussion.
To check the absolute/squared difference between true player skill and what we estimate, I created hundreds of fake matchupfiles.
For every matchupfile I would create fake player ratings (for offense and defense) with a random number generator, using a Gaussian distribution with mean 0 and stddev of 2, which means
about 68% of the ratings are between -2 and +2, and about 95% of the ratings are between -4 and +4.
I've seen arguments that NBA players represent only the tail end of a normal distribution, so this is probably a point of
discussion. I think standard APM results also show a normal distribution, even though APM doesn't assume it (RAPM assumes it through the penalty values and quadratic shrinkage). APM would
probably report a higher standard deviation, though. Assuming a Gaussian distribution might also give an unfair advantage to RAPM (over LASSO or whatever).
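The rating generation itself is just a couple of lines, something like this (the player count is a rough placeholder):

```python
# Fake "true" ratings: offense and defense drawn from a Gaussian with
# mean 0 and standard deviation 2.
import numpy as np

rng = np.random.default_rng(0)
n_players = 450                    # roughly the number of players in a season
true_off = rng.normal(loc=0.0, scale=2.0, size=n_players)
true_def = rng.normal(loc=0.0, scale=2.0, size=n_players)
# ~68% of values land between -2 and +2, ~95% between -4 and +4
```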
Using the possession numbers from 2012 and the fake player ratings, I created fake matchupfiles with a random element to them. Five-man units with a high sum of offensive ratings had a higher
chance of scoring, but higher scoring was not guaranteed.
An average offense against an average defense had a 49.5% chance of scoring, which is close to league average.
A +5 offensive unit had a 51.9% chance of scoring. The type of score (1s/2s/3s) was also determined by a random number generator.
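In code, one fake possession looks roughly like this. The scoring probabilities follow the numbers above; the 1/2/3-point split below is a made-up placeholder, not the split I actually used.

```python
# One simulated possession: scoring probability rises with the unit's net
# rating, matching the 49.5% / 51.9% figures above.
import numpy as np

rng = np.random.default_rng(1)

def simulate_possession(off_sum, def_sum):
    """Points scored by the offensive unit on one fake possession."""
    p_score = 0.495 + 0.0048 * (off_sum - def_sum)   # +5 net unit -> ~51.9%
    if rng.random() >= p_score:
        return 0
    return rng.choice([1, 2, 3], p=[0.15, 0.65, 0.20])
```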
I then let ridge regression estimate player values for the fake matchupfiles. For every matchupfile we *know* what the player ratings actually are, so our goal is to find the
regression technique that minimizes the (squared) difference between actual player rating and estimated player rating.
Our error term describes the difference between what we estimate, and the player's true skill.
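In case it helps, here's roughly what that step looks like in code. scikit-learn is used only for illustration (not my actual implementation); X and y are built from the simulated matchupfiles.

```python
# Fit ridge regression to the fake matchup data and score it against the
# known true ratings. X is the usual RAPM design matrix (+1/-1 indicators
# for offensive and defensive players per stint), y the observed efficiency.
import numpy as np
from sklearn.linear_model import Ridge

def oracle_rmse(X, y, true_ratings, lam=3000):
    model = Ridge(alpha=lam, fit_intercept=True)
    model.fit(X, y)
    return np.sqrt(np.mean((model.coef_ - true_ratings) ** 2))
```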
Now for some numbers, using RAPM with lambda = 3000:
Average absolute difference between estimated and true skill was 1.8. For offense or defense separately it was 1.3.
Root mean square error (RMSE) between estimated and true skill was 2.3. For offense or defense separately it was 1.63.
Assuming all players are +/- 0 gives you an RMSE of 2.0
For players with >3000 possessions (off+def), RAPM had the biggest problems with Markieff Morris (RMSE 2.7), Luc Mbah a Moute, Damien Wilkins, MarShon Brooks, Jodie Meeks, Paul George,
Nikola Pekovic, Brendan Haywood and DeAndre Jordan (2.6), while getting the best results for Ekpe Udoh (2.0), Tony Parker, Isaiah Thomas, Jamal Crawford, Iman Shumpert, Kevin Garnett,
Dirk Nowitzki, Andre Miller, Derek Fisher, Blake Griffin and Jason Terry (1.8).
I'm not sure if this is anything more than another tool to find shrinkage parameters or compare regression techniques, should one be too lazy to do cross-validation.
With this method, the lambda that gave the lowest RMSE was the same lambda we usually find using CV: 3000.
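The lambda search itself is then just a loop over that error calculation (using the oracle_rmse helper from the sketch above; the candidate grid is arbitrary):

```python
# Sweep candidate lambdas and keep the one with the lowest error against
# the known true ratings; in these simulations it matched the CV choice.
def best_lambda(X, y, true_ratings, candidates=(100, 300, 1000, 3000, 10000)):
    errors = {lam: oracle_rmse(X, y, true_ratings, lam) for lam in candidates}
    return min(errors, key=errors.get), errors
```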
Oh sorry. The fake players play the exact same amount of minutes and with the same teammates as the real ones did in '11-'12; just their ratings have been messed with. So a fake Nowitzki still plays only with Mavs players (fake Mavs, but still).