Appr. 5.x year reg. adj. +/- (J.E., 2010)
Posted: Fri Apr 15, 2011 12:55 am
recovered page 1 of 5
DSMok1
PostPosted: Wed Dec 15, 2010 3:50 pm Post subject: Reply with quote
back2newbelf wrote:
This season only:
http://www.docdroid.net/5ka/2011.xls.html
Format: Offense(100 possessions)|Defense(100 possessions)|Sum
That early in the season it is obviously of limited use and will produce funny results
So far we have:
offensive player of the year: Hedo Turkoglu (he is also the worst defender though)
defensive player of the year: Darrell Arthur
While it kind of agrees with the media on the MVP race (Nowitzki/Garnett/Ginobili/super-friends all look good), it couldn't disagree more on the Rookie of the Year race, putting Jeff Adrien, Landry Fields and Evan Turner at the top. John Wall is supposedly the 9th worst of all players, Griffin the 7th worst.
From looking at the top-rated players, one would think the best basketball age is 35.
Also, Shane Battier is suddenly listed as a horrible defender, and Chuck Hayes, who used to rock this rating, is the 10th worst player in the league.
I haven't figured out how to do this, but would it be possible to use ASPM ratings as a Bayesian prior?
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
mathayus
PostPosted: Sun Dec 26, 2010 4:16 pm Post subject: Reply with quote
back2newbelf wrote:
While it kind of agrees with the media on the MVP race (Nowitzki/Garnett/Ginobili/super-friends all look good), it couldn't disagree more on the Rookie of the Year race, putting Jeff Adrien, Landry Fields and Evan Turner at the top. John Wall is supposedly the 9th worst of all players, Griffin the 7th worst.
In my experience, star rookies very rarely look impressive by +/- metrics. This has led me to conclude that if we truly gave the ROY to the MVP of rookies, in most years it would go to a player who happened to fill a niche on a successful team instead of the big name rookies.
While there would be nothing inherently wrong with that, the big name rookies are actually the ones who tend to go on and become stars by +/- metrics, so focusing on the volume statistics instead of +/- statistics for rookies does serve a useful purpose.
back2newbelf
PostPosted: Sun Dec 26, 2010 6:08 pm Post subject: Reply with quote
mathayus wrote:
In my experience, star rookies very rarely look impressive by +/- metrics. This has led me to conclude that if we truly gave the ROY to the MVP of rookies, in most years it would go to a player who happened to fill a niche on a successful team instead of the big name rookies.
While there would be nothing inherently wrong with that, the big name rookies are actually the ones who tend to go on and become stars by +/- metrics, so focusing on the volume statistics instead of +/- statistics for rookies does serve a useful purpose.
Good point.
I think what also needs to be done is to never use single-season (R)APM to judge rookies. The top rookies will usually play heavy minutes on very bad teams. When RAPM "doesn't know" that the players the rookie is currently playing with already sucked the year before, it puts part of the blame on him. This can be avoided with multi-season (R)APM.
back2newbelf
PostPosted: Tue Jan 18, 2011 12:01 pm Post subject: Reply with quote
I think we have fixed the error that made lambda so huge. At least it looks more sane now, at around 2500 for a single season.
Single-season approximated RAPM is now published on http://stats-for-the-nba.appspot.com/ and will probably be updated every two weeks or so.
The site also contains data from the latest multi-year analysis, which included coaches and tried different lambdas for offense and defense for both players and coaches. They were found to be: Offense: 2500, Defense: 7500, Offense (Coach): 6000, Defense (Coach): 4500.
Unfortunately, the difference in error on the test sets between this method and using players only with just one lambda is minimal.
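For context, here is a minimal, hypothetical sketch of the kind of ridge ("lambda") regression being described, assuming a stint-level design matrix has already been built; the data layout and function below are illustrative, not the actual code behind the site.
Code:
# Minimal sketch, not the site's code. Each row of X is one stint: +1 for the
# home players on the floor, -1 for the away players (an assumed encoding);
# y is the home point margin per 100 possessions for that stint.
import numpy as np
from sklearn.linear_model import Ridge

def approximate_rapm(X, y, lam=2500.0):
    # lam ~ 2500 is the single-season value mentioned above; the ridge penalty
    # shrinks every player's coefficient toward 0, which is what separates
    # RAPM from ordinary (unregularized) adjusted +/-.
    model = Ridge(alpha=lam, fit_intercept=True)
    model.fit(X, y)
    return model.coef_  # one rating per player, in points per 100 possessions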
DSMok1
PostPosted: Tue Jan 18, 2011 12:17 pm Post subject: Reply with quote
back2newbelf wrote:
I think we have fixed the error that made lambda so huge. At least it looks more sane now, at around 2500 for a single season.
Single-season approximated RAPM is now published on http://stats-for-the-nba.appspot.com/ and will probably be updated every two weeks or so.
The site also contains data from the latest multi-year analysis, which included coaches and tried different lambdas for offense and defense for both players and coaches. They were found to be: Offense: 2500, Defense: 7500, Offense (Coach): 6000, Defense (Coach): 4500.
Unfortunately, the difference in error on the test sets between this method and using players only with just one lambda is minimal.
Thanks a lot for this data!
Could you please post the standard errors for each estimate as well? The lack of standard errors makes it very difficult to use this data for additional research!
I find it interesting, though expected, that the lambdas broke down the way they did: players regress far more to the mean on defense, since that is a more unstable measure, while coaches have more of an impact on defense than on offense.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
deepak
PostPosted: Tue Jan 18, 2011 10:44 pm Post subject: Reply with quote
back2newbelf wrote:
I think we have fixed the error that made lambda so huge. At least it looks more sane now, at around 2500 for a single season.
Single-season approximated RAPM is now published on http://stats-for-the-nba.appspot.com/ and will probably be updated every two weeks or so.
The site also contains data from the latest multi-year analysis, which included coaches and tried different lambdas for offense and defense for both players and coaches. They were found to be: Offense: 2500, Defense: 7500, Offense (Coach): 6000, Defense (Coach): 4500.
Unfortunately, the difference in error on the test sets between this method and using players only with just one lambda is minimal.
Appreciate it.
I got the following correlation table between your current-season RAPM and various per-minute boxscore statistics:
Code:
        Age    MPG   GmSc    USG    ORB    DRB    PPR  BLK+STL    PTS    OFF    DEF
OFF   0.122  0.405  0.530  0.206 -0.049  0.039  0.243   -0.005  0.371  1.000 -0.009
DEF   0.139 -0.074 -0.025 -0.152  0.031  0.132 -0.017    0.171 -0.117 -0.009  1.000
all boxscore stats are per 40 minutes
GmSc = PTS + 0.4 * FG - 0.7 * FGA - 0.4*(FTA - FT) + 0.7 * ORB + 0.3 * DRB +
STL + 0.7 * AST + 0.7 * BLK - 0.4 * PF - TOV
USG = FGA + 0.44*FTA + TOV
PPR = 0.7*AST - TOV
Question: Are coaches overly biased towards offensive players, or does RAPM overrate the value of defensive players?
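A hypothetical sketch of how a table like the one above could be reproduced, assuming a merged per-player table of per-40-minute boxscore stats and the two RAPM columns already exists; the column names are illustrative, not deepak's actual ones.
Code:
# Hypothetical sketch; df holds one row per player with per-40-minute boxscore
# stats plus the OFF/DEF RAPM values. Column names are assumed.
import pandas as pd

def rapm_boxscore_correlations(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Formulas from the post (per 40 minutes):
    df["GmSc"] = (df.PTS + 0.4*df.FG - 0.7*df.FGA - 0.4*(df.FTA - df.FT)
                  + 0.7*df.ORB + 0.3*df.DRB + df.STL + 0.7*df.AST
                  + 0.7*df.BLK - 0.4*df.PF - df.TOV)
    df["USG"] = df.FGA + 0.44*df.FTA + df.TOV
    df["PPR"] = 0.7*df.AST - df.TOV
    df["BLK+STL"] = df.BLK + df.STL
    cols = ["Age", "MPG", "GmSc", "USG", "ORB", "DRB", "PPR", "BLK+STL", "PTS", "OFF", "DEF"]
    return df[cols].corr().loc[["OFF", "DEF"]]  # Pearson correlations, as in the table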
page 2
back2newbelf
PostPosted: Wed Jan 19, 2011 1:40 pm Post subject: Reply with quote
deepak wrote:
cool table! Could you do it for GmSc without DRebs, steals and blocks?
Ilardi
PostPosted: Wed Jan 19, 2011 5:44 pm Post subject: Reply with quote
back2newbelf wrote:
deepak wrote:
cool table! Could you do it for GmSc without DRebs, steals and blocks?
And PER?
Crow
PostPosted: Wed Jan 19, 2011 5:48 pm Post subject: Reply with quote
Deepak,
I'd also be interested in seeing the correlations for offensive and defensive splits of Game Score.
And I'd like to see how far you could push the correlation with multi-season Adjusted +/- by optimizing these linear weights, with an additional variable to capture shot defense, the residual not captured by Game Score. What is the average absolute value of that shot defense variable?
Adjusted +/- is not perfect; it is an estimate with error. But a Game Score optimized for maximum correlation with Adjusted +/- could be worth seeing as an intermediate product between the existing linear-weight boxscore metric and Adjusted +/-. The weights may or may not be stable through different periods, but they would suggest whether the linear weights should be changed to get closer over the long run (and maybe also where the Adjusted +/- errors might be higher). While I am suggesting doing it with the simple Game Score, ideally such a comparison would be done with newer, probably better boxscore (and play-by-play) based metrics.
Back2newbelf,
Might your friend be interested in doing a RAPM run that just looked at when the top 8-16 teams play in regular season games against each other? I think it would be useful to see where values from that split vary considerably from the league-wide RAPM.
Any interest in say a 3 season playoffs only run? Or a run where the playoff data was included with regular season data but had a somewhat higher or significantly higher weight?
Or how about preparing up to date multi-season Adjusted +/- splits down to the 4 Factor level?
acollard
PostPosted: Wed Jan 19, 2011 7:27 pm Post subject: Reply with quote
Crow wrote:
Adjusted +/- is not perfect; it is an estimate with error. But a Game Score optimized for maximum correlation with Adjusted +/- could be worth seeing as an intermediate product between the existing linear-weight boxscore metric and Adjusted +/-. The weights may or may not be stable through different periods, but they would suggest whether the linear weights should be changed to get closer over the long run (and maybe also where the Adjusted +/- errors might be higher). While I am suggesting doing it with the simple Game Score, ideally such a comparison would be done with newer, probably better boxscore (and play-by-play) based metrics.
This is pretty similar to DSMok1's suggestion of using ASPM as a Bayesian prior, and both seem pretty great. I think you could get a more consistent result if you used a statistical approach as a jumping-off point for Adjusted +/-.
On the other hand, one of the coolest things about Adjusted +/- is the guys who stand out for seemingly unclear reasons, without a lot of box score stats to back it up. Sometimes it's garbage, but other times it can point toward some surprising truth. This quality would be largely diminished if you somehow averaged or weighted Adjusted +/- with a statistical approach.
I'm also unsure how I feel about using coaches in the +/- formula. Does it make anyone else uneasy? I feel like there just isn't enough data for most or all of them for it to be useful. What's the highest number of coaches a team has had in the past five years? 3? And what about the most teams a coach has coached? 2? It seems like the coach variable, as it's used, may absorb a lot of other things about the team that correlate with the coach (team environment, chemistry, medical staff, player synergies, home crowd, etc.) that may or may not have much to do with the coach himself.
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Wed Jan 19, 2011 8:18 pm Post subject: Reply with quote
Any form of metric with Adjusted +/- weighted with or informed by "a statistical approach" could and I think should be a 3rd leg to the 2 "pure" approaches / metrics. You don't have to dispose of the originals and I wouldn't.
I'd be glad to see the data with coaches at least once, though I'd probably use the version without coaches more, given the concerns you raised. Having both would, again, allow the comparisons and help spot aberrant / potentially interesting stuff.
It also would be great to have, in public on a regular basis, player pair Adjusted +/-. Comparing the individual ratings with the pair would be suggestive about specific player to player interactions / impacts.
Conceivably you could have player / coach Adjusted +/- pairs too.
The errors might be too high for many to make much of player / opposing player or opposing coach pairs except maybe in a 4-6 season version, but it is also conceivable. Again, if you are searching for interesting things, it might be fun to at least see and maybe more than that.
Last edited by Crow on Thu Jan 20, 2011 2:18 pm; edited 1 time in total
back2newbelf
Joined: 21 Jun 2005
Posts: 236
PostPosted: Thu Jan 20, 2011 6:13 am Post subject: Reply with quote
Crow wrote:
Or how about preparing up to date multi-season Adjusted +/- splits down to the 4 Factor level?
I want to do this sometime with TS%, OReb% and Tov%
Quote:
It also would be great to have, in public on a regular basis, player pair Adjusted +/-. Comparing the individual ratings with the pair would be suggestive about specific player to player interactions / impacts.
Conceivably you could have player / coach Adjusted +/- pairs too.
This is also on my to-do-list
acollard wrote:
I'm also unsure how I feel about using coaches in the +/- formula. Does it make anyone else uneasy? I feel like there just isn't enough data for most or all of them for it to be useful. What's the highest number of coaches a team has had in the past five years? 3? And what about the most teams a coach has coached? 2?
Player trades help here. A coach might have coached just two teams, but there's a good chance he has coached 40+ players.
One big problem with including coaches is probably aging. Kuester has to work with an older (and probably worse) Ben Wallace and Hamilton than the coaches before him did, but the algorithm thinks they're the same players and punishes Kuester for it.
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Thu Jan 20, 2011 2:21 pm Post subject: Reply with quote
Appreciate the shared public data so far and look forward to the additional variations and extensions of Adjusted +/- that you and your friend decide to prepare.
acollard
Joined: 22 Sep 2010
Posts: 49
Location: MA
PostPosted: Thu Jan 20, 2011 5:54 pm Post subject: Reply with quote
back2newbelf wrote:
One big problem with including coaches is probably aging. Kuester has to work with an older (and probably worse) Ben Wallace and Hamilton than the coaches before him did, but the algorithm thinks they're the same players and punishes Kuester for it.
Isn't this the same problem for player adjusted +/- over long periods? Players who play with aging superstars or soon-to-be superstars are devalued, and perhaps players who played with a now-declining superstar back when he was still in his prime are overvalued?
I wonder whether adjusted +/- could use aging curves in some way to help avoid errors like this? It seems like it could be useful.
back2newbelf
Joined: 21 Jun 2005
Posts: 236
PostPosted: Fri Jan 21, 2011 6:50 am Post subject: Reply with quote
Updated this year's ranking and added a 2-year ranking: http://stats-for-the-nba.appspot.com/2-year-ranking
George Hill looks surprisingly good in the one-year ranking. Going by 82games, he does have a good On/Off rating, and, sorted by minutes, 2 of his top 3 five-man units do not involve Ginobili, who has the Spurs' best On/Off.
acollard wrote:
Isn't this the same problem for player adjusted +/- over long periods? Players who play with aging superstars or soon-to-be superstars are devalued, and perhaps players who played with a now-declining superstar back when he was still in his prime are overvalued?
Yes, that's true.
Quote:
I wonder whether adjusted +/- could use aging curves in some way to help avoid errors like this? It seems like it could be useful.
I'm sure it's possible, and I will probably do this sometime when I'm less busy.
DSMok1
Joined: 05 Aug 2009
Posts: 547
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jan 21, 2011 7:03 am Post subject: Reply with quote
You'd probably have to generate the aging curves ahead of time, apply them in a "preprocessing" phase, and then run the regression. I generated a fairly good aging curve for ASPM, which should look about the same as one for APM. It's at http://sonicscentral.com/apbrmetrics/vi ... php?t=2652 .
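A hypothetical sketch of that preprocessing idea, with an assumed toy aging curve and data layout (these are not DSMok1's actual numbers or code): remove the expected age effect from each stint's margin before regressing, so the resulting coefficients are age-neutral.
Code:
# Hypothetical sketch of age-curve "preprocessing" before the (R)APM regression.
import numpy as np

def age_effect(age, peak=27.0, slope=0.25):
    # toy curve in points per 100 possessions; a real ASPM/APM aging curve
    # would replace this assumption
    return -slope * abs(age - peak)

def age_adjusted_margin(y, X_signed, ages):
    # X_signed: stints x players, +1/-1 for players on the floor; ages: one per player
    expected = X_signed @ np.array([age_effect(a) for a in ages])
    return y - expected  # regress this adjusted margin on X_signed as usual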
_________________
GodismyJudgeOK.com/DStats
EvanZ
Joined: 22 Nov 2010
Posts: 188
PostPosted: Fri Jan 21, 2011 8:38 am Post subject: Reply with quote
I put the 2yr data up as a .csv file on GoogleDocs. Added rank as the first column:
https://spreadsheets.google.com/pub?key ... output=csv
_________________
http://www.thecity2.com
http://www.ibb.gatech.edu/evan-zamir
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Fri Jan 21, 2011 3:25 pm Post subject: Reply with quote
Thanks for the 2 season RAPM.
It would be handy to have the team identifiers in the file too for sorting, though users can cobble it together.
I can't recall for sure if Joe Sill used age curves in his RAPM. I think he might have. I also think I recall Steve Ilardi talking about doing it in his next private version of Adjusted +/-.
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Fri Jan 21, 2011 4:37 pm Post subject: Reply with quote
I matched up this 2-season RAPM with basketballvalue's 2-season traditional APM for the 328 players with values in both. The r2 was .56, lower than I'd hoped to see.
I also looked at the r2 for just the guys at +4 or better, and it was .25. Between +4 and -4, it was .18. Below -4, .04.
What do you make of this?
What should be done?
back2newbelf
Joined: 21 Jun 2005
Posts: 236
PostPosted: Fri Jan 21, 2011 5:02 pm Post subject: Reply with quote
Crow wrote:
What should be done?
About what? It's not exactly my goal to produce numbers that correlate well with traditional APM
Ilardi
Joined: 15 May 2008
Posts: 257
Location: Lawrence, KS
PostPosted: Fri Jan 21, 2011 5:47 pm Post subject: Reply with quote
Crow wrote:
I matched up this 2-season RAPM with basketballvalue's 2-season traditional APM for the 328 players with values in both. The r2 was .56, lower than I'd hoped to see.
I also looked at the r2 for just the guys at +4 or better, and it was .25. Between +4 and -4, it was .18. Below -4, .04.
What do you make of this?
What should be done?
Crow, your R^2 of .56 implies a zero-order correlation (r) between RAPM and APM of .75, with both estimates derived from only 1.5 years of data. That's surprisingly high, imho.
The lower R^2 numbers at higher/lower values of APM are most likely a "truncated range" phenomenon.
page 3
Crow
Joined: 20 Jan 2009
Posts: 821
PostPosted: Fri Jan 21, 2011 5:48 pm Post subject: Reply with quote
back2newbelf,
I accept that it is not exactly your goal to produce numbers that correlate well with traditional APM. That is not the goal; the goal is estimating true impact.
Nonetheless I wanted to ask a few simple open-ended questions to possibly hear further thoughts on the metric comparison from whomever wanted to share them.
Currently I'll look at and weigh the estimates from 2 year or longer traditional APM or RAPM and preferably RAPM.
That these versions can vary a fair amount is something I've been noting case by case for awhile when I find it. It is to be expected to a degree but I do think there was value to checking the correlation.
If an even better version of APM can be constructed with comparison and discussion, I'd say great.
Steve,
Yes, the r was .747. I reported the r2 because I had gotten the impression that was preferred. Maybe I have some previous r's and r2's for metric comparisons scrambled in my mind, but I was under the impression that an r2 of .56 was pretty good but not real strong, and since these metrics are at the core the same type of method, I thought it might be higher. Maybe my expectations were too high, and I will consider your reaction. Perhaps the correlation would be even higher over a longer time period, as you suggest, or if other authors' versions of traditional APM or RAPM were used. I don't fully understand the lambda value issue, but that may be part of it, as is the minute cutoff choice.
Just noting what I see, since I don't recall a recent comparison of traditional APM and RAPM values, especially at the 2-year level, and any comparative discussion at Joe's site is gone. Maybe there is some dialog here that could be dug up. But there probably is still room for it to continue.
I was thinking it might not be surprising for the truncated ranges to have lower correlations, but thanks for the reinforcement. I am not surprised that the correlation was stronger in the top segment than in the middle or the bottom, but I thought that might be worth noting too.
Crow
Joined: 20 Jan 2009
Posts: 821
PostPosted: Sat Jan 22, 2011 3:09 pm Post subject: Reply with quote
Players whose 1.5 season traditional APM is 5 or more points higher than this RAPM
Fields, Landry 11.5
Collins, Jason 10.1
Nash, Steve 9.6
Aldridge, LaMarcus 9.4
Dooling, Keyon 8.2
Bass, Brandon 8.2
Nowitzki, Dirk 8.0
Wallace, Gerald 8.0
Gasol, Pau 7.3
Rose, Derrick 7.1
Johnson, Amir 7.0
West, David 6.8
Lopez, Brook 6.4
Fesenko, Kyrylo 6.2
Favors, Derrick 5.9
Chandler, Tyson 5.9
Dunleavy, Mike 5.8
Carter, Vince 5.7
Paul, Chris 5.4
Dorsey, Joey 5.4
Brockman, Jon 5.3
Johnson, Wesley 5.2
Young, Thaddeus 5.1
Gordon, Ben 5.0
Players whose 1.5 season traditional APM is 5 or more points lower than this RAPM
Belinelli, Marco -5.0
Cousins, DeMarcus -5.0
Jones, Solomon -5.0
Wall, John -5.1
Landry, Carl -5.2
Forbes, Gary -5.2
Bayless, Jerryd -5.3
Vasquez, Greivis -5.3
Nocioni, Andres -5.3
Jack, Jarrett -5.4
Jackson, Stephen -5.6
Outlaw, Travis -5.6
Krstic, Nenad -5.8
Williams, Terrence -5.8
Andersen, Chris -5.8
Splitter, Tiago -5.9
Livingston, Shaun -6.0
Boykins, Earl -6.1
Collison, Darren -6.2
Ridnour, Luke -6.3
Matthews, Wes -6.3
Marion, Shawn -6.3
Moon, Jamario -6.3
Chalmers, Mario -6.4
Williams, Jawad -6.4
Milicic, Darko -6.5
Graham, Stephen -6.5
Bledsoe, Eric -7.1
Carney, Rodney -7.2
Maggette, Corey -7.2
Sanders, Larry -7.3
Armstrong, Hilton -7.5
Bryant, Kobe -7.5
Dragic, Goran -7.6
Evans, Maurice -7.6
Webster, Martell -7.7
Powell, Josh -7.7
Bell, Raja -7.8
McGuire, Dominic -7.8
Gortat, Marcin -7.9
Telfair, Sebastian -7.9
Douglas-Roberts, Chris -8.2
Ellington, Wayne -8.5
Hill, Jordan -8.5
Richardson, Jason -8.8
Mason, Roger -9.9
Erden, Semih -9.9
Brown, Kwame -10.0
Law, Acie -10.1
Arroyo, Carlos -10.4
Monroe, Greg -11.1
Batum, Nicolas -11.4
Felton, Raymond -11.8
Thabeet, Hasheem -11.8
House, Eddie -12.1
Hayward, Gordon -12.4
About 25% of the players compared are in one of the 2 groups, so 75% of the estimates are within 5 points of each other. More than 50% were within 3 points of each other. About 40% within 2 points.
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Sun Jan 23, 2011 7:38 am Post subject: Reply with quote
What's not accounted for in adjusted +/-?
Forcing a good opposing player to the bench because he just fouled you!
I split all players into 3 groups according to their 2-year rating (above +1.0, between +1.0 and -1.0, below -1.0), then used basketballgeek.com's 2009/2010 data to compute how many times a player was fouled by players from each group:
http://stats-for-the-nba.appspot.com/fouling
minimum 50 possessions.
The analysis is far from perfect. One problem is that garbage time players can, for the most part, only be fouled by other garbage time players. Thus they will never look good in the "being fouled by >+1.0" category.
One other problem is that all fouls get treated the same, when in reality it's probably better to make someone pick up his second foul in the 1st quarter, rather than make him pick up his fourth foul with 10 seconds to play in the game
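A hypothetical sketch of the grouping-and-counting step, assuming a play-by-play table of fouls and a dict of 2-year ratings already exist (the column names and layout are made up, not the site's actual code).
Code:
import pandas as pd

def rating_group(rapm):
    if rapm > 1.0:
        return ">+1.0"
    if rapm < -1.0:
        return "<-1.0"
    return "+1.0 to -1.0"

# fouls: one row per personal foul, with columns "fouler" and "fouled_player"
# ratings: dict of player -> 2-year RAPM
def fouls_drawn_by_group(fouls: pd.DataFrame, ratings: dict) -> pd.DataFrame:
    grp = fouls["fouler"].map(lambda p: rating_group(ratings.get(p, 0.0)))
    fouls = fouls.assign(fouler_group=grp)
    return fouls.pivot_table(index="fouled_player", columns="fouler_group",
                             values="fouler", aggfunc="count", fill_value=0)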
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Jan 24, 2011 4:42 pm Post subject: Reply with quote
Would it be possible to do this at a lineup level?
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Mon Jan 24, 2011 5:10 pm Post subject: Reply with quote
DSMok1 wrote:
Would it be possible to do this at a lineup level?
You need to be a little more clear. Are you talking about fouling? Do you want the lineups that get fouled by certain players, or the players that get fouled by certain lineups? Something else?
_________________
http://stats-for-the-nba.appspot.com/
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Jan 24, 2011 5:32 pm Post subject: Reply with quote
back2newbelf wrote:
DSMok1 wrote:
Would it be possible to do this at a lineup level?
You need to be a little more clear. Are you talking about fouling? Do you want the lineups that get fouled by certain players, or the players that get fouled by certain lineups? Something else?
No, sorry, I meant the RAPM at the lineup level, like Basketball Value does, but with the lambda-based ridge regression applied.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Tue Jan 25, 2011 11:53 am Post subject: Reply with quote
Now with approximated Euroleague RAPM at http://stats-for-the-nba.appspot.com/euroleague-ranking (last season and this season combined). Optimal lambda was, again, 3000.
Rubio looks pretty good
Thanks to http://www.in-the-game.org for providing the data
DSMok1 wrote:
No, sorry, I meant the RAPM at the lineup level, like Basketball Value does, but with the lambda-based ridge regression applied.
Certainly possible, but not exactly at the top of my to-do list (that would be adj. four factors and player pairs)
_________________
http://stats-for-the-nba.appspot.com/
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Tue Jan 25, 2011 12:10 pm Post subject: Reply with quote
Excellent once again! I'm sure we're getting repetitive saying that over and over...
:D
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Crow
Joined: 20 Jan 2009
Posts: 821
PostPosted: Tue Jan 25, 2011 12:41 pm Post subject: Reply with quote
Rubio is unimpressive to me on individual statistical measures (70th in the hoopsstats ranking), but this RAPM has him at +1.9, 11th best in the Euroleague.
He is helping optimize his teammates further on offense and also contributing a bit on defense.
Would he optimize NBA teammates on offense at the same level, or more, or less? I'd think his defensive impact would be smaller in the NBA than in the Euroleague, but that is a surface reaction. Not that the test is coming that soon or will be that important, but I wanted to touch on it given recent articles.
If RAPM were done for additional earlier Euroleague seasons, then some Euroleague-RAPM-to-NBA-RAPM comparisons could be done now for guys who came over here. I guess it could be done the other way too. I'm not sure how heavily I'd weight general league-to-league translation projections in a specific player's evaluation, but it would be good to see the averages and to gather as many examples as possible: see what it says, what you think it says, and what results you (and others) get with one approach or another over time.
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Fri Jan 28, 2011 5:43 am Post subject: Reply with quote
I think I found a way to compute standard errors via bootstrapping. Not 100% sure if this is correct though.
Bootstrap sample: from our n observations, take n independent draws with replacement.
Then use a Monte Carlo algorithm:
(1) using a random number generator, independently draw a large number (B) of bootstrap samples
(2) for each bootstrap sample, evaluate the statistic of interest
(3) calculate the standard deviation of the B values
Right now it's only available for the 2-year ranking http://stats-for-the-nba.appspot.com/2-year-ranking
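A minimal sketch of that bootstrap, assuming the stint matrix X and response y already exist and that the statistic of interest is each player's ridge coefficient (the layout and lambda value are assumptions, not the site's code).
Code:
import numpy as np
from sklearn.linear_model import Ridge

def bootstrap_se(X, y, lam=3000.0, B=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    coefs = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)      # n draws with replacement
        model = Ridge(alpha=lam).fit(X[idx], y[idx])
        coefs.append(model.coef_)             # statistic of interest: each player's RAPM
    return np.std(coefs, axis=0)              # per-player standard error estimate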
_________________
http://stats-for-the-nba.appspot.com/
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jan 28, 2011 10:27 am Post subject: Reply with quote
I don't think it's working right... :(
The standard errors should be highest on the players with the least data... but it is reversed here. The players that have the least data have their results dominated by the lambda, and thus return a low stdev on the bootstrapping.
What should happen is that the players with essentially no data should have a standard error equal to the standard deviation of the overall distribution of NBA players, or, I should say, one based on the lambda. It's a Bayesian deal: the prior is 0, and the lambda defines the spread (I'm not sure how to convert lambda to a standard deviation). Then the player data is applied, narrowing the standard error of the estimate.
That's about all I know... Oh, the errors should probably be in the range of 1.5 for the best-known players, ranging up to something like 6 or 7 for players with no data.
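One way to make the lambda-to-spread conversion concrete, using the standard Bayesian reading of ridge regression (a textbook identity, not something stated in the thread; sigma here is the per-observation noise of whatever response the regression actually uses, which the posts don't specify):
Code:
# Ridge as a Gaussian prior: minimizing ||y - Xb||^2 + lam*||b||^2 gives the MAP
# estimate when each coefficient has prior b_i ~ N(0, tau^2) and the noise is
# N(0, sigma^2), with lam = sigma^2 / tau^2. The implied prior spread is then
#     tau = sigma / sqrt(lam)
# A player with essentially no data keeps roughly that prior spread as his
# standard error; more data shrinks it.
import math

def implied_prior_sd(sigma, lam):
    return sigma / math.sqrt(lam)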
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
PostPosted: Tue Feb 01, 2011 2:12 pm Post subject: Reply with quote
Are you resampling with replacement, or without?
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Wed Feb 02, 2011 3:43 pm Post subject: Reply with quote
gabefarkas wrote:
Are you resampling with replacement, or without?
With replacement. Why do you ask?
_________________
http://stats-for-the-nba.appspot.com/
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
PostPosted: Wed Feb 02, 2011 9:45 pm Post subject: Reply with quote
back2newbelf wrote:
gabefarkas wrote:
Are you resampling with replacement, or without?
With replacement. Why do you ask?
That's how bootstrapping is supposed to be done, from what I remember. I initially thought maybe that was the issue you were facing, but I guess not.
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Wed Feb 09, 2011 7:14 am Post subject: Reply with quote
I did a test of how many years one should use to get the best prediction results.
I split this season's data into several (N) parts and computed player values on N-1 parts, N times, always leaving out just one part. Then, using the computed player values, I computed the error on the part that was left out (N times, because N parts were left out).
Then I did the same thing but included data from prior seasons. All of this older data is used to compute player values, combined with the parts from the current season, always removing one part from the current season as described above.
If I use just this season, the error on out-of-sample 2010/2011 data is bigger than if I include 2009/2010. Including 2008/2009 on top of 09/10 improves the error further, and it's actually best when I include 07/08 too. From there on, it always gets worse as I include older data.
From best to worst:
3.x year
4.x year
2.x year
5.x year
1.x year
0.x year
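A rough sketch of the leave-one-part-out test described above, with an assumed data layout (this is not the actual script): older seasons are always fully included, and only the current season is split into N parts.
Code:
import numpy as np
from sklearn.linear_model import Ridge

def cv_error(X_cur, y_cur, X_old, y_old, lam=2500.0, n_parts=10, seed=0):
    rng = np.random.default_rng(seed)
    folds = rng.integers(0, n_parts, size=len(y_cur))  # assign current-season rows to parts
    errs = []
    for k in range(n_parts):
        train = folds != k
        X_train = np.vstack([X_old, X_cur[train]])     # all older data + N-1 current parts
        y_train = np.concatenate([y_old, y_cur[train]])
        model = Ridge(alpha=lam).fit(X_train, y_train)
        pred = model.predict(X_cur[~train])            # error on the held-out part
        errs.append(np.mean((pred - y_cur[~train]) ** 2))
    return np.mean(errs)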
_________________
http://stats-for-the-nba.appspot.com/
PostPosted: Wed Dec 15, 2010 3:50 pm Post subject: Reply with quote
back2newbelf wrote:
This season only:
http://www.docdroid.net/5ka/2011.xls.html
Format: Offense(100 possessions)|Defense(100 possessions)|Sum
That early in the season it is obviously of limited use and will produce funny results
So far we have:
offensive player of the year: Hedo Turkoglu (he is also the worst defender though)
defensive player of the year: Darrell Arthur
While it kind of agrees with the media on the MVP race, Nowitzki/Garnett/Ginobli/super-friends all look good, it couldn't disagree more on the Rookie-of-the-year-race, putting Jeff Adrian, Landry Fields and Evan Turner at the top. John Wall is supposed to be 9th worst of all players, Griffin 7th worst
From looking at the top rated players one would think the best basketball age is 35
Also, Shane Battier is suddenly listed as a horrible defender and Chuck Hayes, who used to rock this rating, is the 10th worst player in the league
I haven't figured out how to do this, but would it be possible to use ASPM ratings as a Bayesian prior?
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Back to top
View user's profile Send private message Send e-mail Visit poster's website
mathayus
PostPosted: Sun Dec 26, 2010 4:16 pm Post subject: Reply with quote
back2newbelf wrote:
While it kind of agrees with the media on the MVP race, Nowitzki/Garnett/Ginobli/super-friends all look good, it couldn't disagree more on the Rookie-of-the-year-race, putting Jeff Adrian, Landry Fields and Evan Turner at the top. John Wall is supposed to be 9th worst of all players, Griffin 7th worst
In my experience, star rookies very rarely look impressive by +/- metrics. This has led me to conclude that if we truly gave the ROY to the MVP of rookies, in most years it would go to a player who happened to fill a niche on a successful team instead of the big name rookies.
While there would be nothing inherently wrong with that, the big name rookies are actually the ones who tend to go on and become stars by +/- metrics, so focusing on the volume statistics instead of +/- statistics for rookies does serve a useful purpose.
Back to top
View user's profile Send private message
back2newbelf
PostPosted: Sun Dec 26, 2010 6:08 pm Post subject: Reply with quote
mathayus wrote:
In my experience, star rookies very rarely look impressive by +/- metrics. This has led me to conclude that if we truly gave the ROY to the MVP of rookies, in most years it would go to a player who happened to fill a niche on a successful team instead of the big name rookies.
While there would be nothing inherently wrong with that, the big name rookies are actually the ones who tend to go on and become stars by +/- metrics, so focusing on the volume statistics instead of +/- statistics for rookies does serve a useful purpose.
Good point.
I think what also needs to be done is to never use single-season (R)APM to judge rookies. The top rookies will usually play heavy minutes on very bad teams. When RAPM "doesn't know" that the players the rookie is currently playing with already sucked the year before it puts part of the blame on him. This can be avoided with multi-season (R)APM
Back to top
View user's profile Send private message
back2newbelf
PostPosted: Tue Jan 18, 2011 12:01 pm Post subject: Reply with quote
I think we have the error fixed that made lambda so huge. At least it looks more sane now, being around ~2500 for a single season.
Single season approximated RAPM now gets published on http://stats-for-the-nba.appspot.com/ and probably updated every two weeks or so.
The site also contains data from the latest multiyear analysis which included coaches and tried different lambdas for offense and defense for both players and coaches. They were found to be: Offense: 2500, Defense: 7500, Offense(Coach): 6000, Defense(Coach): 4500.
Unfortunately the difference in error on the test sets between this method and using players only with just one lamdba is minimal.
Back to top
View user's profile Send private message
DSMok1
PostPosted: Tue Jan 18, 2011 12:17 pm Post subject: Reply with quote
back2newbelf wrote:
I think we have the error fixed that made lambda so huge. At least it looks more sane now, being around ~2500 for a single season.
Single season approximated RAPM now gets published on http://stats-for-the-nba.appspot.com/ and probably updated every two weeks or so.
The site also contains data from the latest multiyear analysis which included coaches and tried different lambdas for offense and defense for both players and coaches. They were found to be: Offense: 2500, Defense: 7500, Offense(Coach): 6000, Defense(Coach): 4500.
Unfortunately the difference in error on the test sets between this method and using players only with just one lamdba is minimal.
Thanks a lot for this data!
Could you please post the standard errors for each estimate as well? The lack of standard errors makes it very difficult to use this data for additional research!
I find it interesting and expected that the Lambdas broke down the way they did: players regress far more to the mean on defense, as that is a more unstable measure, while coaches have more of an impact on defense than offense.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Back to top
View user's profile Send private message Send e-mail Visit poster's website
deepak
PostPosted: Tue Jan 18, 2011 10:44 pm Post subject: Reply with quote
back2newbelf wrote:
I think we have the error fixed that made lambda so huge. At least it looks more sane now, being around ~2500 for a single season.
Single season approximated RAPM now gets published on http://stats-for-the-nba.appspot.com/ and probably updated every two weeks or so.
The site also contains data from the latest multiyear analysis which included coaches and tried different lambdas for offense and defense for both players and coaches. They were found to be: Offense: 2500, Defense: 7500, Offense(Coach): 6000, Defense(Coach): 4500.
Unfortunately the difference in error on the test sets between this method and using players only with just one lamdba is minimal.
Appreciate it.
I got the following correlation table between your current season RAPM and some various per-minute boxscore statistics:
Code:
Age MPG GmSc USG ORB DRB PPR BLK+STL PTS OFF DEF
OFF 0.122 0.405 0.530 0.206 -0.049 0.039 0.243 -0.005 0.371 1.000 -0.009
DEF 0.139 -0.074 -0.025 -0.152 0.031 0.132 -0.017 0.171 -0.117 -0.009 1.000
all boxscore stats are per 40 minutes
GmSc = PTS + 0.4 * FG - 0.7 * FGA - 0.4*(FTA - FT) + 0.7 * ORB + 0.3 * DRB +
STL + 0.7 * AST + 0.7 * BLK - 0.4 * PF - TOV
USG = FGA + 0.44*FTA + TOV
PPR = 0.7*AST - TOV
Question: Are coaches overly biased towards offensive players, or does RAPM overrate the value of defensive players?
page 2
Author Message
back2newbelf
PostPosted: Wed Jan 19, 2011 1:40 pm Post subject: Reply with quote
deepak wrote:
cool table! Could you do it for GmSc without DRebs, steals and blocks?
Back to top
View user's profile Send private message
Ilardi
PostPosted: Wed Jan 19, 2011 5:44 pm Post subject: Reply with quote
back2newbelf wrote:
deepak wrote:
cool table! Could you do it for GmSc without DRebs, steals and blocks?
And PER?
Back to top
View user's profile Send private message
Crow
PostPosted: Wed Jan 19, 2011 5:48 pm Post subject: Reply with quote
Deepak,
I'd also be interested in seeing the correlations for offensive and defensive splits of Game Score.
And seeing how far you could possibly up the correlation with multi-season Adjusted +/- by optimizing these linear weights, with an additional variable for capturing the residual uncaptured in Game Score shot defense. What is the average absolute value of that shot defense variable?
Adjusted +/- is not perfect, it is an estimate with error. But an optimized to max correlation with Adjusted +/- Game Score could be worth seeing as an intermediate product between the existing linear weight boxscore based metric and Adjusted +/-. The weights may or may not be stable thru different periods but it would be suggestive about whether the linear weights should be changed to try to get closer over the long-run (and maybe also where the Adjusted +/- errors might be higher?). While I am suggesting here doing it with the simple Game Score, ideally such a comparison would be done with newer / probably better boxscore (and play by play) based metrics.
Back2newbelf,
Might your friend be interested in doing a RAPM run that just looked at when the top 8-16 teams play in regular season games against each other? I think it would be useful to see where values from that split vary considerably from the league-wide RAPM.
Any interest in say a 3 season playoffs only run? Or a run where the playoff data was included with regular season data but had a somewhat higher or significantly higher weight?
Or how about preparing up to date multi-season Adjusted +/- splits down to the 4 Factor level?
Back to top
View user's profile Send private message
acollard
PostPosted: Wed Jan 19, 2011 7:27 pm Post subject: Reply with quote
Crow wrote:
Adjusted +/- is not perfect, it is an estimate with error. But an optimized to max correlation with Adjusted +/- Game Score could be worth seeing as an intermediate product between the existing linear weight boxscore based metric and Adjusted +/-. The weights may or may not be stable thru different periods but it would be suggestive about whether the linear weights should be changed to try to get closer over the long-run (and maybe also where the Adjusted +/- errors might be higher?). While I am suggesting here doing it with the simple Game Score, ideally such a comparison would be done with newer / probably better boxscore (and play by play) based metrics.
This is a pretty similar suggestion to DSMOK1's suggestion of using ASPM as a Bayesian prior, and both seem pretty great. I think you could get a more consistent result if you used a statistical approach as a jumping off point for Adjusted +/-.
On the other hand, one of the coolest and best things about Adjusted +/- is the guys who stand out for seemingly unclear reasons, without a lot of box score stats to back it up. Sometimes, its garbage, but other times, it can point toward some surprising truth. This quality would be largely diminished if you somehow averaged or weighted Adjusted +/- with a statistical approach.
I'm also unsure how I feel about using coaches in the +/- formula. Does it make anyone else uneasy? I feel like there just isn't enough data with most or all of them for it to be useful. What's the highest number of coaches a team has had in the past five years? 3? And what about the most amount of teams a coach has coached with? 2? It seems like the coach variable as its used may have a lot of other things about the team he coaches that correlate with him like team environment, chemistry, medical staff, player synergies, home crowd, etc. that may or may not have a lot to do with the coach himself.
Back to top
View user's profile Send private message
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Wed Jan 19, 2011 8:18 pm Post subject: Reply with quote
Any form of metric with Adjusted +/- weighted with or informed by "a statistical approach" could and I think should be a 3rd leg to the 2 "pure" approaches / metrics. You don't have to dispose of the originals and I wouldn't.
I'd glad to see the data with coaches at least once, though I'd probably use without coaches more given the concerns you raised. Having both would, again, allow the comparisons and help spot aberrant / potentially interesting stuff.
It also would be great to have, in public on a regular basis, player pair Adjusted +/-. Comparing the individual ratings with the pair would be suggestive about specific player to player interactions / impacts.
Conceivably you could have player / coach Adjusted +/- pairs too.
The errors might be too high for many to make much of player / opposing player or opposing coach pairs except maybe in a 4-6 season version, but it is also conceivable. Again, if you are searching for interesting things, it might be fun to at least see and maybe more than that.
Last edited by Crow on Thu Jan 20, 2011 2:18 pm; edited 1 time in total
Back to top
View user's profile Send private message
back2newbelf
Joined: 21 Jun 2005
Posts: 236
PostPosted: Thu Jan 20, 2011 6:13 am Post subject: Reply with quote
Crow wrote:
Or how about preparing up to date multi-season Adjusted +/- splits down to the 4 Factor level?
I want to do this sometime with TS%, OReb% and Tov%
Quote:
It also would be great to have in public on a regular basis would be player pair Adjusted +/-. Comparing the individual ratings with the pair would be suggestive about specific player to player interactions / impacts.
Conceivably you could have player / coach Adjusted +/- pairs too.
This is also on my to-do-list
allocard wrote:
I'm also unsure how I feel about using coaches in the +/- formula. Does it make anyone else uneasy? I feel like there just isn't enough data with most or all of them for it to be useful. What's the highest number of coaches a team has had in the past five years? 3? And what about the most amount of teams a coach has coached with? 2?
Player trades help here. A coach might have just coached two teams but there's a good possibility he coached 40+ players.
One big problem with including coaches is probably aging. Kuester has to work with an older (and probably worse) Ben Wallace and Hamilton than coaches before him, but the algorithm thinks they're the same player and punishes Kuester for it
Back to top
View user's profile Send private message
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Thu Jan 20, 2011 2:21 pm Post subject: Reply with quote
Appreciate the shared public data so far and look forward to the additional variations and extensions of Adjusted +/- that you and your friend decide to prepare.
Back to top
View user's profile Send private message
acollard
Joined: 22 Sep 2010
Posts: 49
Location: MA
PostPosted: Thu Jan 20, 2011 5:54 pm Post subject: Reply with quote
[quote=back2newbelf]One big problem with including coaches is probably aging. Kuester has to work with an older (and probably worse) Ben Wallace and Hamilton than coaches before him, but the algorithm thinks they're the same player and punishes Kuester for it[/quote]
Isn't this the same problem for player adjusted +/- over long periods? Players who play with aging superstars or soon to be superstars are devalued, and perhaps players who played with a now declining superstar who was still in his prime are overvalued?
I wonder if adjusted +/- would or could be able to use aging curves in any way, to help avoid errors like this? It seems like it could be useful.
Back to top
View user's profile Send private message
back2newbelf
Joined: 21 Jun 2005
Posts: 236
PostPosted: Fri Jan 21, 2011 6:50 am Post subject: Reply with quote
Updated this years' ranking and added 2 year ranking http://stats-for-the-nba.appspot.com/2-year-ranking
George Hill looks suprisingly good in the one-year-ranking. Going by 82games, he does have a good On/Off rating and, sorted by minutes, 2 of his top 3 5-man-units do not involve Ginobili who has the Spurs' best On/Off
acollard wrote:
Isn't this the same problem for player adjusted +/- over long periods? Players who play with aging superstars or soon to be superstars are devalued, and perhaps players who played with a now declining superstar who was still in his prime are overvalued?
yes that's true
Quote:
I wonder if adjusted +/- would or could be able to use aging curves in any way, to help avoid errors like this? It seems like it could be useful.
I'm sure it's possible and I will probably do this sometime when I'm less busy
Back to top
View user's profile Send private message
DSMok1
Joined: 05 Aug 2009
Posts: 547
Location: Where the wind comes sweeping down the plains
PostPosted: Fri Jan 21, 2011 7:03 am Post subject: Reply with quote
You'd probably have to generate the aging curves ahead of time, apply them in a "preprocessing" phase, and then run the regression. I generated a fairly good aging curve for ASPM, which should look the same as APM. It's at http://sonicscentral.com/apbrmetrics/vi ... php?t=2652 .
_________________
GodismyJudgeOK.com/DStats
Back to top
View user's profile Send private message Visit poster's website
EvanZ
Joined: 22 Nov 2010
Posts: 188
PostPosted: Fri Jan 21, 2011 8:38 am Post subject: Reply with quote
I put the 2yr data up as a .csv file on GoogleDocs. Added rank as the first column:
https://spreadsheets.google.com/pub?key ... output=csv
_________________
http://www.thecity2.com
http://www.ibb.gatech.edu/evan-zamir
Back to top
View user's profile Send private message
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Fri Jan 21, 2011 3:25 pm Post subject: Reply with quote
Thanks for the 2 season RAPM.
It would be handy to have the team identifiers in the file too for sorting, though users can cobble it together.
I can't recall for sure if Joe Sill used age curves in his RAPM. I think he might have. I also think I recall Steve Ilardi talking about doing it in his next private version of Adjusted +/-.
Back to top
View user's profile Send private message
Crow
Joined: 20 Jan 2009
Posts: 746
PostPosted: Fri Jan 21, 2011 4:37 pm Post subject: Reply with quote
I matched up this 2 season RAPM with basketballvalue's 2 season traditional APM for 328 players with values in both. The r2 was .56, lower than I'd hoped to see.
I also look at the r2 from just guys +4 or better and it was .25. Between +4 and -4, it was .18. Less than -4, .04.
What do you make of this?
What should be done?
Back to top
View user's profile Send private message
back2newbelf
Joined: 21 Jun 2005
Posts: 236
PostPosted: Fri Jan 21, 2011 5:02 pm Post subject: Reply with quote
Crow wrote:
What should be done?
About what? It's not exactly my goal to produce numbers that correlate well with traditional APM
Back to top
View user's profile Send private message
Ilardi
Joined: 15 May 2008
Posts: 257
Location: Lawrence, KS
PostPosted: Fri Jan 21, 2011 5:47 pm Post subject: Reply with quote
Crow wrote:
I matched up this 2 season RAPM with basketballvalue's 2 season traditional APM for 328 players with values in both. The r2 was .56, lower than I'd hoped to see.
I also look at the r2 from just guys +4 or better and it was .25. Between +4 and -4, it was .18. Less than -4, .04.
What do you make of this?
What should be done?
Crow, your R^2 of .56 implies a zero-order correlation (r) between RAPM and APM of .75, with both estimates derived from only 1.5 years of data. That's surprisingly high, imho.
The lower R^2 numbers at higher/lower values of APM is most likely a "truncated range" phenomenon.
page 3
Author Message
Crow
Joined: 20 Jan 2009
Posts: 821
PostPosted: Fri Jan 21, 2011 5:48 pm Post subject: Reply with quote
back2newbelf,
I accept that it is not exactly your goal to produce numbers that correlate well with traditional APM. That is not the goal, the goal is estimating true impact.
Nonetheless I wanted to ask a few simple open-ended questions to possibly hear further thoughts on the metric comparison from whomever wanted to share them.
Currently I'll look at and weigh the estimates from 2 year or longer traditional APM or RAPM and preferably RAPM.
That these versions can vary a fair amount is something I've been noting case by case for awhile when I find it. It is to be expected to a degree but I do think there was value to checking the correlation.
If an even better version of APM can be constructed with comparison and discussion, I'd say great.
Steve,
Yes the r was .747. I reported the r2 because I had gotten the impression that was preferred. Maybe I have some previous r's and r2's reported for metric comparisons scrambled in my mind but I was under the impression that an r2 of .56 was pretty good but not real strong and since these metrics are at the core the same type method I thought it might be higher. Maybe my expectations were too high and I will consider your reaction. Perhaps the correlation would be even higher for a longer time period as you suggest or if other authored versions of traditional APM or RAPM are used. I don't fully understand the lamba value issue but that may be part of it as is the minute cutoff choice.
Just noting what I see since I don't recall a recent comparison of traditional and RAPM values, especially at the 2 year level, and any comparative discussion at Joe's site is gone. Maybe there is some dialog here that could be dug up. But there probably is still room for it to continue.
I was thinking it might not be surprising for the truncated ranges to have lower correlations for the segments but thanks for the reinforcement. I am not surprised that the correlations were stronger in the top segment than the middle or the bottom, but I thought that might be worthing noting too.
Back to top
View user's profile Send private message
Crow
Joined: 20 Jan 2009
Posts: 821
PostPosted: Sat Jan 22, 2011 3:09 pm Post subject: Reply with quote
Players whose 1.5 season traditional APM is 5 or more points higher than this RAPM
Fields, Landry 11.5
Collins, Jason 10.1
Nash, Steve 9.6
Aldridge, LaMarcus 9.4
Dooling, Keyon 8.2
Bass, Brandon 8.2
Nowitzki, Dirk 8.0
Wallace, Gerald 8.0
Gasol, Pau 7.3
Rose, Derrick 7.1
Johnson, Amir 7.0
West, David 6.8
Lopez, Brook 6.4
Fesenko, Kyrylo 6.2
Favors, Derrick 5.9
Chandler, Tyson 5.9
Dunleavy, Mike 5.8
Carter, Vince 5.7
Paul, Chris 5.4
Dorsey, Joey 5.4
Brockman, Jon 5.3
Johnson, Wesley 5.2
Young, Thaddeus 5.1
Gordon, Ben 5.0
Players whose 1.5 season traditional APM is 5 or more points lower than this RAPM
Belinelli, Marco -5.0
Cousins, DeMarcus -5.0
Jones, Solomon -5.0
Wall, John -5.1
Landry, Carl -5.2
Forbes, Gary -5.2
Bayless, Jerryd -5.3
Vasquez, Greivis -5.3
Nocioni, Andres -5.3
Jack, Jarrett -5.4
Jackson, Stephen -5.6
Outlaw, Travis -5.6
Krstic, Nenad -5.8
Williams, Terrence -5.8
Andersen, Chris -5.8
Splitter, Tiago -5.9
Livingston, Shaun -6.0
Boykins, Earl -6.1
Collison, Darren -6.2
Ridnour, Luke -6.3
Matthews, Wes -6.3
Marion, Shawn -6.3
Moon, Jamario -6.3
Chalmers, Mario -6.4
Williams, Jawad -6.4
Milicic, Darko -6.5
Graham, Stephen -6.5
Bledsoe, Eric -7.1
Carney, Rodney -7.2
Maggette, Corey -7.2
Sanders, Larry -7.3
Armstrong, Hilton -7.5
Bryant, Kobe -7.5
Dragic, Goran -7.6
Evans, Maurice -7.6
Webster, Martell -7.7
Powell, Josh -7.7
Bell, Raja -7.8
McGuire, Dominic -7.8
Gortat, Marcin -7.9
Telfair, Sebastian -7.9
Douglas-Roberts, Chris -8.2
Ellington, Wayne -8.5
Hill, Jordan -8.5
Richardson, Jason -8.8
Mason, Roger -9.9
Erden, Semih -9.9
Brown, Kwame -10.0
Law, Acie -10.1
Arroyo, Carlos -10.4
Monroe, Greg -11.1
Batum, Nicolas -11.4
Felton, Raymond -11.8
Thabeet, Hasheem -11.8
House, Eddie -12.1
Hayward, Gordon -12.4
About 25% of the players compared are in one of the 2 groups, so 75% of the estimates are within 5 points of each other. More than 50% were within 3 points of each other. About 40% within 2 points.
Back to top
View user's profile Send private message
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Sun Jan 23, 2011 7:38 am Post subject: Reply with quote
What's not accounted for in adjusted +/-?
Forcing good opponent players to the bench because he just fouled you!
I split all players into 3 groups according to their 2-year rating: [>+1.0, +1.0> & >-1.0, <-1.0], then used basketballgeek.com's 2009/2010 data to compute how many times a player was fouled by players of the different groups
http://stats-for-the-nba.appspot.com/fouling
minimum 50 possessions.
The analysis is far from perfect. One problem is that garbage time players can, for the most part, only be fouled by other garbage time players. Thus they will never look good in the "being fouled by >+1.0" category.
One other problem is that all fouls get treated the same, when in reality it's probably better to make someone pick up his second foul in the 1st quarter, rather than make him pick up his fourth foul with 10 seconds to play in the game
Back to top
View user's profile Send private message
DSMok1
Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains
PostPosted: Mon Jan 24, 2011 4:42 pm Post subject: Reply with quote
Would it be possible to do this at a lineup level?
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Back to top
View user's profile Send private message Send e-mail Visit poster's website
back2newbelf
Joined: 21 Jun 2005
Posts: 274
PostPosted: Mon Jan 24, 2011 5:10 pm Post subject: Reply with quote
DSMok1 wrote:
Would it be possible to do this at a lineup level?
You need to be a little more clear. Are you talking about fouling? Do you want the lineups that get fouled by certain players, or the players that get fouled by certain lineups? Something else?
_________________
http://stats-for-the-nba.appspot.com/
DSMok1
PostPosted: Mon Jan 24, 2011 5:32 pm Post subject: Reply with quote
back2newbelf wrote:
DSMok1 wrote:
Would it be possible to do this at a lineup level?
You need to be a little more clear. Are you talking about fouling? Do you want the lineups that get fouled by certain players, or the players that get fouled by certain lineups? Something else?
No, sorry, I meant the RAPM at the lineup level, like Basketball Value does, but with the lambda-based ridge regression applied.
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
back2newbelf
PostPosted: Tue Jan 25, 2011 11:53 am Post subject: Reply with quote
Euroleague approximated RAPM is now up at http://stats-for-the-nba.appspot.com/euroleague-ranking (last season and this season combined). Optimal lambda was, again, 3000.
Rubio looks pretty good
Thanks to http://www.in-the-game.org for providing the data
DSMok1 wrote:
No, sorry, I meant the RAPM at the lineup level, like Basketball Value does, but with the lambda-based ridge regression applied.
Certainly possible, but not exactly at the top of my to-do list (that would be adj. four factors and player pairs)
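For what it's worth, a lineup-level version would mostly just change the design matrix: one indicator column per five-man unit instead of one per player, with the same ridge penalty. A generic sketch of the ridge solve (an assumed setup, not the site's actual code):

Code:
# Ridge regression at the lineup level: beta = (X'X + lambda*I)^-1 X'y
import numpy as np

def lineup_rapm(X, y, lam=3000.0):
    """X: (observations x lineups) matrix, +1 for the offensive unit, -1 for the defensive unit.
    y: points scored per possession (or per 100 possessions) for each observation."""
    n_cols = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_cols), X.T @ y)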
_________________
http://stats-for-the-nba.appspot.com/
DSMok1
PostPosted: Tue Jan 25, 2011 12:10 pm Post subject: Reply with quote
Excellent once again! I'm sure we're getting repetitive saying that over and over...
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
Crow
PostPosted: Tue Jan 25, 2011 12:41 pm Post subject: Reply with quote
Rubio is unimpressive to me on individual statistical measures (70th in the hoopsstats ranking), but this RAPM has him at +1.9, 11th best in the Euroleague.
He is helping optimize his teammates on offense and also contributing a bit on defense.
Would he optimize NBA teammates on offense at the same level, or more, or less? I'd think his defensive impact would be smaller in the NBA than in the Euroleague, but that's a surface reaction. Not that the test is coming that soon or will be that important, but I wanted to touch on it given recent articles.
If RAPM were computed for additional earlier Euroleague seasons, then Euroleague-RAPM-to-NBA-RAPM comparisons could be done now for the guys who have already come over. I guess it could be done the other way too. I'm not sure how heavily I'd weight general league-to-league translation projections in a specific player's evaluation, but it would be good to see the averages and to gather as many examples as possible: see what the data says, what you think it says, and what results you (and others) get with one approach or another over time.
back2newbelf
PostPosted: Fri Jan 28, 2011 5:43 am Post subject: Reply with quote
I think I found a way to compute standard errors via bootstrapping. Not 100% sure if this is correct though.
Bootstrap sample: from our n observations, take n independent draws with replacement.
Then use a Monte Carlo algorithm:
(1) Using a random number generator, independently draw a large number (B) of bootstrap samples.
(2) For each bootstrap sample, evaluate the statistic of interest.
(3) Calculate the standard deviation of the resulting values.
Right now it's only available for the 2-year ranking http://stats-for-the-nba.appspot.com/2-year-ranking
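A minimal sketch of the whole procedure, with fit_rapm(X, y, lam) assumed to be the existing ridge solver that returns one coefficient per player (a placeholder name):

Code:
# Bootstrap standard errors: resample the n observations with replacement,
# refit the ridge regression B times, take the per-player standard deviation.
import numpy as np

def bootstrap_se(X, y, fit_rapm, lam=3000.0, B=200, seed=0):
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    estimates = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)   # n independent draws with replacement
        estimates.append(fit_rapm(X[idx], y[idx], lam))
    return np.std(np.vstack(estimates), axis=0)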
_________________
http://stats-for-the-nba.appspot.com/
DSMok1
PostPosted: Fri Jan 28, 2011 10:27 am Post subject: Reply with quote
I don't think it's working right...
The standard errors should be highest for the players with the least data, but it's reversed here. The players with the least data have their results dominated by the lambda, and thus return a low stdev under the bootstrapping.
What should happen is that the players with essentially no data should have a standard error equal to the standard deviation of the overall distribution of NBA players, or, I should say, one based on the lambda. It's a Bayesian deal: the prior is 0, and the lambda defines the spread (I'm not sure how to convert lambda to a standard deviation). Then the player data is applied, narrowing the standard error of the estimate.
That's about all I know... Oh, the errors should probably range from about 1.5 for the best-known players up to something like 6 or 7 for players with no data.
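For reference, one standard way to read the conversion: ridge regression with penalty lambda is the MAP estimate under a N(0, tau^2) prior on each coefficient when the per-observation noise standard deviation is sigma and lambda = sigma^2 / tau^2, so the implied prior spread is tau = sigma / sqrt(lambda). A tiny illustration, with sigma only a rough guess:

Code:
# Implied prior standard deviation from the ridge penalty.
import math

def prior_sd(lam, sigma):
    return sigma / math.sqrt(lam)

# e.g. prior_sd(3000, 110) -> about 2.0, if sigma is roughly 110 on the
# per-possession points-per-100 scale (a guess, not a measured value).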
_________________
GodismyJudgeOK.com/DStats
Twitter.com/DSMok1
gabefarkas
PostPosted: Tue Feb 01, 2011 2:12 pm Post subject: Reply with quote
Are you resampling with replacement, or without?
back2newbelf
PostPosted: Wed Feb 02, 2011 3:43 pm Post subject: Reply with quote
gabefarkas wrote:
Are you resampling with replacement, or without?
With replacement. Why do you ask?
_________________
http://stats-for-the-nba.appspot.com/
gabefarkas
PostPosted: Wed Feb 02, 2011 9:45 pm Post subject: Reply with quote
back2newbelf wrote:
gabefarkas wrote:
Are you resampling with replacement, or without?
With replacement. Why do you ask?
That's how bootstrapping is supposed to be done, from what I remember. I initially thought maybe that was the issue you were facing, but I guess not.
back2newbelf
PostPosted: Wed Feb 09, 2011 7:14 am Post subject: Reply with quote
I did a test on how many years one should use to get best prediction results.
I split this season's data into several (N) parts and computed player values on N-1 parts, N times, always leaving out one part. Then, using the computed player values, I computed the error on the part that was left out (N times, because N parts were left out).
Then I did the same thing but included data from prior seasons. All of this older data is used to compute player values, combined with the parts from the running season, always removing one part from the running season as described above.
If I use just this season, the error on out-of-sample 2010/2011 data is bigger than if I include 2009/2010. Including 2008/2009 on top of 09/10 reduces the error further, and it's actually best when I include 07/08 too. From there on, it always gets worse when I include older data.
From best to worst:
3.x year
4.x year
2.x year
5.x year
1.x year
0.x year
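For reference, a sketch of that leave-one-part-out scheme, with fit_rapm and mse as assumed helper functions (placeholders, not the actual code):

Code:
# Hold out one chunk of the running season at a time, train on the rest plus any
# number of full prior seasons, and average the out-of-sample error.
import numpy as np

def season_cv_error(current_parts, prior_seasons, fit_rapm, mse, lam=3000.0):
    """current_parts: list of (X, y) chunks from the running season.
    prior_seasons: list of (X, y) pairs for whole earlier seasons to include in training."""
    errors = []
    for i, (X_test, y_test) in enumerate(current_parts):
        train = [p for j, p in enumerate(current_parts) if j != i] + list(prior_seasons)
        X_train = np.vstack([X for X, _ in train])
        y_train = np.concatenate([y for _, y in train])
        beta = fit_rapm(X_train, y_train, lam)
        errors.append(mse(X_test @ beta, y_test))
    return float(np.mean(errors))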
_________________
http://stats-for-the-nba.appspot.com/