APBRmetrics

The discussion of the analysis of basketball through objective evidence, especially basketball statistics.
All times are UTC




PostPosted: Thu Nov 15, 2012 6:13 pm 
Offline

Joined: Fri Apr 15, 2011 8:28 am
Posts: 802
This new metric consisting of recursively informed RAPM+BoxScoreRating is now on my website as "xRAPM" and has replaced RAPM

There are full lists of players for each year, like here http://stats-for-the-nba.appspot.com/ratings/2007.html
and team pages with player ratings of '13, like this http://stats-for-the-nba.appspot.com/teams/OKC.html

_________________
http://stats-for-the-nba.appspot.com/


PostPosted: Thu Nov 15, 2012 7:20 pm 
Offline

Joined: Thu Apr 14, 2011 11:18 pm
Posts: 813
Location: Maine
Great!

Have you yet applied an aging curve to the priors for each player between seasons? That's critical in my opinion.

_________________
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1


PostPosted: Fri Nov 16, 2012 1:26 am 
Offline

Joined: Fri Nov 16, 2012 12:57 am
Posts: 5
J.E. wrote:
This new metric consisting of recursively informed RAPM+BoxScoreRating is now on my website as "xRAPM" and has replaced RAPM

There are full lists of players for each year, like here http://stats-for-the-nba.appspot.com/ratings/2007.html
and team pages with player ratings of '13, like this http://stats-for-the-nba.appspot.com/teams/OKC.html


JE, seems like I'm the only downer here, which made me decide to re-register.

I guess first and foremost: Can you please keep RAPM data available on your site in addition to the xRAPM data?

I understand that you are focused on using RAPM-based techniques to make a single stat that gets closer to the holy grail, and I'm not against that. However, for those of us who are focused primarily on using various tools together to analyze a situation, the beauty of +/- data is in its unbiased orthogonality relative to the box score. So from my perspective right now, the site with the single best source of data free of box-score bias on the internet has gone and tossed that out and replaced it with something that really isn't the same thing.

If people here believe I'm not understanding the situation properly, please set me straight. Clearly though, box scores are being used, and now all of a sudden I see guys like McGrady & Paul who were loved by PER take a leap forward. Hard to imagine this is a coincidence.


PostPosted: Fri Nov 16, 2012 11:36 am 
Offline

Joined: Sat Oct 27, 2012 12:30 pm
Posts: 408
I feel like pointing out that RAPM is biased by design, so whilst it may not suffer the same lack of completeness as the box-score it is biased in another way. I think the aim should be to minimize our errors in explaining future outcomes, and not worry so much about the 'purity' of our results. A good hack is a very valuable thing in stats, it can reveal a lot. However I can understand you wanting RAPM as well as xRAPM.

And obviously, whilst +/- is informationally complete it is far from accurate.


PostPosted: Fri Nov 16, 2012 12:14 pm 
Offline

Joined: Mon Apr 18, 2011 10:09 am
Posts: 470
J.E. wrote:
Here are the weights I found for offense and defense. Everything's scaled to "influence of height on offense"


Looking through your coefficients makes me wonder how "useful" they are. I have a big issue with results, which are implying that a FGA is more harmful than a turnover, because that is per se not true. The rules of the game force teams to take a shot within 24 seconds; if they do not take that shot, they are penalized with a turnover. According to your results, it is better to throw the ball away or let the shot clock expire than to take a difficult shot with a success rate below 29%.

Also, converting your FT with a higher FT% is something to be considered negative, especially on defense? I have the impression that some of the variables you picked are statistically not significant.

I also agree with johannesdesilentio, that having pure RAPM would be useful.

_________________
http://bbmetrics.wordpress.com/


PostPosted: Fri Nov 16, 2012 12:28 pm 
Offline

Joined: Fri Apr 15, 2011 8:28 am
Posts: 802
Quote:
Have you yet applied an aging curve to the priors for each player between seasons? That's critical in my opinion.
I will definitely add that to the site at some point, together with coach rating. Although I do believe that by including BoxScore stats the problem of not having an aging curve built in got a little smaller (but I'm not saying an aging curve isn't necessary anymore).

Quote:
I guess first and foremost: Can you please keep RAPM data available on your site in addition to the xRAPM data?

I understand that you are focused on using RAPM-based techniques to make a single stat that gets closer to the holy grail, and I'm not against that. However, for those of us who are focused primarily on using various tools together to analyze a situation, the beauty of +/- data is in its unbiased orthogonality relative to the box score. So from my perspective right now, the site with the single best source of data free of box-score bias on the internet has gone and tossed that out and replaced it with something that really isn't the same thing.

If people here believe I'm not understanding the situation properly, please set me straight. Clearly though, box scores are being used, and now all of a sudden I see guys like McGrady & Paul who were loved by PER take a leap forward. Hard to imagine this is a coincidence.
My main goal is to build a metric that is best at forecasting future offensive efficiency; it is my belief that this is equivalent to building a metric that gives the most accurate player ratings. My goal is not to build a metric that further needs to be combined with other metrics. Especially because I don't really believe in most other metrics except ASPM, ORtg/DRtg and LambdaPm (which is very similar to xRAPM).
As was already mentioned, RAPM was already biased. And it was biased in the direction of a worse/less accurate prior. The ratings were less accurate, and thus unfair to certain players. If some players took a leap forward through xRAPM it most likely means they were unfairly underrated in RiRAPM, and in turn, their teammates were overrated. Further, all ratings are estimates and never represent actual truth. I'm just trying to improve those estimates. If you want hard-fact +/- stats you should probably look at simple +/- and On/Off.

xRAPM just continues with the thought of improving out-of-sample prediction, which (probably) was the reason for RAPM being built as a replacement for APM.

Also, please realize that an at least top 3 BoxScore metric is helping with building the priors, and that estimates given by RiRAPM were estimates that were most often further away from the truth compared to xRAPM estimates

Quote:
Also, converting your FT with a higher FT% is something to be considered negative, especially on defense?
I'd have no problem with saying that shooting FTs at a bad % is certainly an indicator for being a good defender.
Quote:
I have the impression that some of the variables you picked are statistically not significant.
I've taken measures to avoid overfitting. Those coefficients that might not be significant should at least be close to 0
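For readers wondering what "close to 0" can look like in practice, here is a minimal sketch. It assumes the measures resemble a ridge penalty (the thread does not say which measures were actually used), and the data and penalty values are made up. For a single centered predictor the ridge slope is Sxy / (Sxx + lambda), so a larger lambda pulls the coefficient toward 0.

```python
def ridge_slope(x, y, lam):
    """Ridge estimate for a single centered predictor: Sxy / (Sxx + lam)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / (sxx + lam)

# Made-up data, illustration only
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.1, 2.9, 4.0]

# lam = 0 is ordinary least squares; larger lam shrinks the slope toward 0
slopes = [ridge_slope(x, y, lam) for lam in (0.0, 10.0, 100.0)]
```

Note that the shrinkage depends on Sxx and lambda, not on a significance test, so a penalty like this pulls every coefficient toward 0 rather than singling out the non-significant ones.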

_________________
http://stats-for-the-nba.appspot.com/


PostPosted: Fri Nov 16, 2012 12:39 pm 
Offline

Joined: Mon Apr 18, 2011 10:09 am
Posts: 470
J.E. wrote:
I'd have no problem with saying that shooting FTs at a bad % is certainly an indicator for being a good defender.


Well, your coefficients are actually saying the opposite overall. That wasn't really the point, just that I think FT% is redundant, when you have FT and FTA in it as well.

J.E. wrote:
I've taken measures to avoid overfitting. Those coefficients that might not be significant should at least be close to 0


I don't think that this is true at all. I can probably show you multiple examples of regressions in which a non-significant variable actually had a value substantially different from 0. No idea what kind of "measures to avoid overfitting" were used, but looking through your chosen variables and their coefficients, they are rather close to the coefficients you would get if you ran a regression on overall team boxscore totals. And when I do that, I get multiple variables that are not significant for offense, and others for defense, while having a bigger coefficient than other, significant variables.
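A small sketch of that point, with made-up numbers rather than anything from the actual xRAPM regression: the size of a coefficient depends on the scale of the variable, while significance depends on the ratio of the coefficient to its standard error. Variable A below moves in a narrow range (think of a percentage), variable B in a wide one.

```python
import math

def ols_slope_and_t(x, y):
    """Simple linear regression: return (slope, t-statistic of the slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)   # standard error of the slope
    return slope, slope / se

# Made-up data, illustration only
x_a = [0.70, 0.72, 0.74, 0.76, 0.78, 0.80]   # narrow range, noisy response
y_a = [3.0, -2.0, 4.0, -1.0, 5.0, 0.0]
x_b = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]   # wide range, clean response
y_b = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

coef_a, t_a = ols_slope_and_t(x_a, y_a)
coef_b, t_b = ols_slope_and_t(x_b, y_b)
```

Here A ends up with the larger coefficient but a t-statistic near 0, while B has a small coefficient and a large t-statistic.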

Including the height is a good thing, but I would guess (at least that is my result) that height over average height for position gives a better prediction. 6'9" is bigger than league average, but 6'9" as a center is actually below average for that position. But that would mean that you would need to assign each player a certain position, which can be a tough job, especially for those who are essentially playing different positions on offense and defense.
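In code, the position-relative height could look roughly like this. The roster, positions and heights are hypothetical, and the position labels are assumed to be given, which is exactly the hard part mentioned above.

```python
from collections import defaultdict

# Hypothetical roster: (player, position, height in inches); 81 in = 6'9"
roster = [
    ("P1", "C", 81),
    ("P2", "C", 85),
    ("P3", "PG", 73),
    ("P4", "PG", 75),
    ("P5", "C", 83),
]

# Average height per position
heights_by_pos = defaultdict(list)
for _, pos, height in roster:
    heights_by_pos[pos].append(height)
pos_avg = {pos: sum(hs) / len(hs) for pos, hs in heights_by_pos.items()}

# Height over (or under) the average for the player's own position
rel_height = {name: height - pos_avg[pos] for name, pos, height in roster}
```

With these made-up numbers the 6'9" center P1 comes out below the average for centers, even though he is taller than every guard in the list.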

_________________
http://bbmetrics.wordpress.com/


PostPosted: Fri Nov 16, 2012 1:51 pm 
Offline

Joined: Fri Apr 15, 2011 12:02 am
Posts: 3828
Location: Asheville, NC
mystic wrote:
... implying that a FGA is more harmful than a turnover, because that is per se not true. ..

You'll always get counter-intuitive correlations when you double-count FG and FGA; or triple-count, with FG%.

Using discrete events should always be preferred: FG made and FG missed, for example.
Don't worry about the fact that FGX aren't explicitly in the box score. It's just FGA - FG, y'know.
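
To make that concrete, a small sketch. The box-score lines and the weights 3 and -1 are made up for illustration and are not taken from xRAPM. It splits attempts into made and missed, and checks that a model with weights on FG and FGA is the same linear model as one with weights on FG made and FG missed.

```python
# Hypothetical box-score lines: (player, FG made, FG attempted)
box = [("A", 8, 17), ("B", 5, 9), ("C", 2, 11)]

def split_shots(fg, fga):
    """Split attempts into mutually exclusive events: (made, missed)."""
    if not 0 <= fg <= fga:
        raise ValueError("need 0 <= FG <= FGA")
    return fg, fga - fg

def reparametrize(w_fg, w_fga):
    """Map weights on (FG, FGA) to equivalent weights on (made, missed).

    w_fg*FG + w_fga*FGA == (w_fg + w_fga)*made + w_fga*missed
    """
    return w_fg + w_fga, w_fga

events = {name: split_shots(fg, fga) for name, fg, fga in box}

w_fg, w_fga = 3, -1                      # made-up weights, illustration only
w_made, w_missed = reparametrize(w_fg, w_fga)

# Both parametrizations give the same score for every player
for name, fg, fga in box:
    made, missed = events[name]
    assert w_fg * fg + w_fga * fga == w_made * made + w_missed * missed
```

Read this way, a weight on FGA sitting next to FG in the same model is really the weight on a miss, which is one reason the raw FGA number can look harsher than expected.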


PostPosted: Fri Nov 16, 2012 2:04 pm 
Offline

Joined: Mon Apr 18, 2011 10:09 am
Posts: 470
Mike, I agree that using exclusive events like 2pt field goals made, 2pt fg missed, 3pt fg made, etc. is the way to go. But not using them is not what causes the "counter-intuitive" results. It is also not just "counter-intuitive" that turnovers are more harmful than even a missed field goal; it is simply a fact. The results are not showing that, because players, coaches, etc. are aware of the shot clock, thus we don't have an unbiased sample anymore. If nobody knew about the shot clock while it still existed, we would have more turnover events and fewer "bad" shots. In that case the results of the regression would shift. It is just the rules of the game which force players to take late bad shots on purpose, while nobody is actually letting the shot clock expire on purpose. The latter is what the results of the regression are implying to be the better strategy in some cases.

_________________
http://bbmetrics.wordpress.com/


PostPosted: Fri Nov 16, 2012 2:44 pm 
Offline

Joined: Thu Apr 21, 2011 8:25 pm
Posts: 218
Location: Boone, NC
WOWSA. This is a game-changer.


PostPosted: Fri Nov 16, 2012 2:57 pm 
Offline

Joined: Thu Apr 21, 2011 8:25 pm
Posts: 218
Location: Boone, NC
Re: Mystic.

It is simply impossible to pull out two variables from a multivariable regression and compare them without taking into context the other variables.

In fact, even looking at one variable and saying it "costs x" is a fallacy. Statistics correlate with other statistics.

The easiest proof of this is Turnovers. If your only input in a regression against +/- was Turnovers and Rebounds, for example, Turnovers would likely be positive! This is because players with higher turnover rate tend to score more points and assist the ball more. It doesn't mean that a player who turns the ball over 10 times per 100 possessions is necessarily better than one who does it at 5 times per 100 possessions. It just means that while we are holding other things equal, players with a tendency to turn it over also tend to be more active in helping put points on the board.
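
A toy simulation of the mechanism, on purely synthetic data: the effect sizes are invented, rebounds are left out to keep it short, and whether real NBA data behaves like this is a separate empirical question. Turnovers truly cost 1.0 here, but they travel together with an unobserved "creation" skill that helps more.

```python
import random

random.seed(0)
n = 5000

# Synthetic players: creation is the skill left out of the short regression
creation = [random.gauss(0, 1) for _ in range(n)]
turnovers = [0.8 * c + random.gauss(0, 1) for c in creation]
plus_minus = [3.0 * c - 1.0 * t + random.gauss(0, 1)
              for c, t in zip(creation, turnovers)]

def cov(x, y):
    """Sample covariance of two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# Regression of plus_minus on turnovers alone
slope_alone = cov(turnovers, plus_minus) / cov(turnovers, turnovers)

# Regression on turnovers and creation (two-predictor normal equations)
s_tt, s_cc = cov(turnovers, turnovers), cov(creation, creation)
s_tc = cov(turnovers, creation)
s_ty, s_cy = cov(turnovers, plus_minus), cov(creation, plus_minus)
slope_controlled = (s_cc * s_ty - s_tc * s_cy) / (s_tt * s_cc - s_tc ** 2)
```

In this setup the turnover coefficient is positive when creation is left out and negative once it is included, even though every turnover hurts by construction.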

This is also precisely why Wins Produced doesn't make any sense. :)


PostPosted: Fri Nov 16, 2012 4:17 pm 
Offline

Joined: Thu Apr 14, 2011 11:10 pm
Posts: 4605
FWIW, Nick Collison is not super-elite on this measure as he was on other versions of APM and RAPM. In the +2 range in the 2010 and 2011 seasons, in the +1 range in 2012 and 2013.


PostPosted: Fri Nov 16, 2012 4:26 pm 
Offline

Joined: Mon Apr 18, 2011 10:09 am
Posts: 470
bbstats wrote:
The easiest proof of this is Turnovers. If your only input in a regression against +/- was Turnovers and Rebounds, for example, Turnovers would likely be positive!


Have you tested that? Well, likely not, because otherwise you wouldn't have brought up that example.

Correlation between PTS, AST and TO-R:

Code:
Correlations

                              TO-R      PTS      AST
Pearson Correlation   TO-R   1.000    -.262    -.052
                      PTS    -.262    1.000     .237
                      AST    -.052     .237    1.000
Sig. (1-tailed)       TO-R       .     .000     .136
                      PTS     .000        .     .000
                      AST     .136     .000        .
N                     TO-R     452      452      452
                      PTS      452      452      452
                      AST      452      452      452



There is even a negative correlation between points and turnover rate, as well as a negative one between assists and turnover rate. (Data from the 2010/11 season.)
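
The significance values in the table can be roughly re-derived from r and N alone. This is a sketch using the usual t = r * sqrt((N - 2) / (1 - r^2)) statistic with a normal approximation to the t distribution, which should be reasonable for N = 452; the function name is my own.

```python
import math

def one_tailed_p(r, n):
    """Approximate one-tailed p-value for a Pearson r from n observations.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) and a normal approximation
    to the t distribution (fine for n in the hundreds).
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 0.5 * math.erfc(abs(t) / math.sqrt(2))

p_pts = one_tailed_p(-0.262, 452)   # TO-R vs PTS
p_ast = one_tailed_p(-0.052, 452)   # TO-R vs AST
```

The TO-R vs AST value lands close to the 0.136 reported in the table, so that correlation is negative in sign but not distinguishable from zero at the usual levels, while the TO-R vs PTS value is far below 0.001.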

Also, you are wrong about the rest as well, because I compare the coefficients within the context. ;)

bbstats wrote:
This is also precisely why Wins Produced doesn't make any sense. :)


That is also wrong. WP doesn't make sense because it uses team formulas to calculate the marginal values while at the same time treating the game as if it were five distinct 1-on-1 games rather than one 5-on-5 game.

In that sense Jerry is falling into a similar trap here. That's why a FGA comes out as being more harmful than a turnover, even though the chance for a team to score in a possession after a missed field goal is higher than 0, while it is exactly 0 if a turnover occurs.
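
A back-of-the-envelope version of that argument, as a sketch: the 25% offensive rebound rate and the 1.0 points after an offensive rebound are made-up inputs, and the function names are my own. Only the 29% shot comes from the earlier discussion.

```python
def expected_points_shot(p_make, shot_value=2.0, oreb_rate=0.0, pts_after_oreb=0.0):
    """Expected points for the possession if the team shoots.

    A miss is not the end: with probability oreb_rate the offense rebounds
    and goes on to score pts_after_oreb on average.
    """
    return p_make * shot_value + (1.0 - p_make) * oreb_rate * pts_after_oreb

def expected_points_turnover():
    """A turnover ends the possession with no points."""
    return 0.0

# Made-up inputs: a 29% two-point shot, 25% offensive rebound rate,
# 1.0 points on average after an offensive rebound.
bad_shot = expected_points_shot(0.29, oreb_rate=0.25, pts_after_oreb=1.0)

# Even a shot that never goes in keeps some value through the rebound
no_hope_shot = expected_points_shot(0.0, oreb_rate=0.25, pts_after_oreb=1.0)
```

Under these assumptions any shot with a nonzero chance of going in or being rebounded beats the turnover for the team's possession, which is the sense in which the turnover is the worse event.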

_________________
http://bbmetrics.wordpress.com/


PostPosted: Fri Nov 16, 2012 11:58 pm 
Offline

Joined: Thu Oct 18, 2012 2:44 pm
Posts: 10
Mystic,

It seems to me that you are drawing too many conclusions from the BOX weights. These weights, from what I can tell, do not say anything about the VALUE of a turnover or a missed field goal, and I do not believe you can extrapolate anything about coaching strategy either. It certainly would not make any sense to say that a minute played has a negative offensive value or that missing free throws will somehow turn you into Ben Wallace. Obviously a turnover is worse than a missed field goal; but, that is beside the point. The value of an event and its predictive power for the value of a player in a small range on the margins are two completely different concepts.


PostPosted: Sat Nov 17, 2012 1:04 am 
Offline

Joined: Sat Oct 27, 2012 12:30 pm
Posts: 408
It's actually very easy to call into question whether a turnover should be considered as bad as a missed FGA for an individual player, whilst not for a team. Maybe often turnovers occur when there is little to no chance of a decent shot coming off, and the guy turning it over isn't close to the only reason it's being turned over, just the guy who touches it last. Now contrast that with the idea that often players take shots when passes for possibly superior shots are available, suddenly a missed FGA is worse than merely the loss of an average opportunity.

I'm not saying this is the case, only that trying to theoretically derive box-score weights is, as far as I'm concerned, a fool's errand that I gave up some time ago.

