APBRmetrics

Posted: **Sat Nov 17, 2012 1:26 am**

J.E. wrote:My main goal is to build a metric that is best at forecasting future offensive efficiency; it is my belief that this is equivalent to building a metric that gives the most accurate player ratings. My goal is not to build a metric that further needs to be combined with other metrics. Especially because I don't really believe in most other metrics except ASPM, ORtg/DRtg and LambdaPm (which is very similar to xRAPM).
As was already mentioned, RAPM was already biased. And it was biased in the direction of a worse/less accurate prior. The ratings were less accurate, and thus unfair to certain players. If some players took a leap forward through xRAPM it most likely means they were unfairly underrated in RiRAPM, and in turn, their teammates were overrated. Further, all ratings are estimates and never represent actual truth. I'm just trying to improve those estimates. If you want hard-fact +/- stats you should probably look at simple +/- and On/Off.

xRAPM just continues with the thought of improving out-of-sample prediction, which (probably) was the reason RAPM was built as a replacement for APM

Also, please realize that an at least top 3 BoxScore metric is helping with building the priors, and that estimates given by RiRAPM were estimates that were most often further away from the truth compared to xRAPM estimates

I understand that this is your primary goal and this is an admirable goal. I also certainly don't expect you to devote large amounts of time to something you personally don't see a lot of value in, but this is why my first statement was simply a request for you to not remove (or to un-remove) the content you had already placed up on your site. It didn't seem like this would be terribly time-consuming for you to keep that other data available. If it is just too inconvenient, then I'd certainly understand why you'd refuse.

I have to admit I'm not sure what you mean by "biased". I'd appreciate an explanation there. Are you talking about the regularization itself or something else? Which types of player situations are you talking about as getting an unfair advantage or disadvantage?

Re: If you want hard-fact +/- stats. I use those other stats as well, but there's simply no doubt that the lack of precision in those stats is something that cries out desperately for an improvement, so I used APM and RAPM too.

Lastly I'll just speak to your philosophy, having already said I understand why you want to do what you're trying to do:

One stat will never be enough to properly evaluate a player in basketball. No matter what you do, no matter how close you come to your statistician's stone, a player's basketball being needs to be viewed through a multi-faceted lens which lets us see many different attributes, both in conjunction and in isolation. Even if we could capture a player's actual instantaneous impact exactly, that would only hold for a single context, and much of basketball analysis is done to figure out what that player will do in a novel situation.

Perhaps you or others will say "we get it, but you shouldn't have been using RAPM as part of that process in the first place", but right now it seems like you are abandoning a niche without realizing it.

Posted: **Sat Nov 17, 2012 1:33 am**

v-zero wrote:I feel like pointing out that RAPM is biased by design, so whilst it may not suffer the same lack of completeness as the box-score it is biased in another way. I think the aim should be to minimize our errors in explaining future outcomes, and not worry so much about the 'purity' of our results. A good hack is a very valuable thing in stats, it can reveal a lot. However I can understand you wanting RAPM as well as xRAPM.

And obviously, whilst +/- is informationally complete it is far from accurate.

It seems like what I'm getting from people here is that imprecision IS bias. If something measures Player X incorrectly then it is biased against him. Am I understanding you right?

That's bizarre. Would you say that a coin is biased simply because, when you flip it to determine who will win the Super Bowl, it's wrong half the time? I'm not saying imprecision isn't something to fight against, but bias in my experience means a clear systematic favoring of certain agents over others. For example, we might say PER is biased in favor of volume scorers. What type of player gets unfair treatment as a rule by any form of +/- that doesn't use box score stats as part of its creation?

Thank you though for acknowledging that you understand why I want both types of stats here. I'm not against xRAPM, but it is not a replacement for RAPM, unless I'm truly not understanding RAPM.

Posted: **Sat Nov 17, 2012 2:04 am**

johannesdesilentio wrote:
v-zero wrote:I feel like pointing out that RAPM is biased by design, so whilst it may not suffer the same lack of completeness as the box-score it is biased in another way. I think the aim should be to minimize our errors in explaining future outcomes, and not worry so much about the 'purity' of our results. A good hack is a very valuable thing in stats, it can reveal a lot. However I can understand you wanting RAPM as well as xRAPM.

And obviously, whilst +/- is informationally complete it is far from accurate.
It seems like what I'm getting from people here is that imprecision IS bias. If something measures Player X incorrectly then it is biased against him. Am I understanding you right?

That's bizarre. Would you say that a coin is biased simply because, when you flip it to determine who will win the Super Bowl, it's wrong half the time? I'm not saying imprecision isn't something to fight against, but bias in my experience means a clear systematic favouring of certain agents over others. For example, we might say PER is biased in favour of volume scorers. What type of player gets unfair treatment as a rule by any form of +/- that doesn't use box score stats as part of its creation?

Thank you though for acknowledging that you understand why I want both types of stats here. I'm not against xRAPM, but it is not a replacement for RAPM, unless I'm truly not understanding RAPM.

It's not that imprecision is bias, but rather that relative inaccuracy of weights causes bias. If you try to think your way to the right weights you are basically ignoring what the data is trying to tell you, and instead making assumptions that will inevitably cause bias. Ideally, given your data, you want to create the least biased, most predictive set of weights possible, and in the case of box-score data this can be done if you build a suitable model and suitable methods to find weights using out-of-sample tests.
So yes, that agrees with your definition of what bias is, and with an incomplete dataset (box-score) you can only make the best of what you have. APM on the other hand has all the data, if you like, but such a huge amount of collinearity that it can't discern between players adequately for good out-of-sample prediction. RAPM attempts to tackle this problem with ridge regression, but depending on the chosen Tikhonov matrix there is a certain level of bias introduced into the estimates, so that whilst out-of-sample prediction is improved, the inherent bias means that fair player comparison becomes tricky.

By choosing improved priors (e.g. built from box-score data) prediction accuracy can be improved and bias reduced, and hence xRAPM is an improvement over RAPM in terms of prediction. The bias still exists, but is changed and should be smaller.
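
For concreteness, the ridge setup with a prior can be written out as below. This is only a sketch in notation assumed here rather than taken from any of the posts: X is the stint-by-player design matrix, y the observed margins, Gamma the Tikhonov matrix, and beta_0 the vector of priors. Vanilla RAPM corresponds to beta_0 = 0 and Gamma = sqrt(lambda) I; a prior-informed variant swaps in a non-zero beta_0.

```latex
\hat{\beta}
  = \arg\min_{\beta}\;
    \lVert y - X\beta \rVert_2^2
    + \lVert \Gamma(\beta - \beta_0) \rVert_2^2
  = \beta_0
    + \left( X^{\top}X + \Gamma^{\top}\Gamma \right)^{-1}
      X^{\top}\left( y - X\beta_0 \right)
```

The second form makes the source of the bias visible: the data only move a rating away from its prior through the residual y - X beta_0, and the penalty term damps that movement most for players with little data.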

Posted: **Sat Nov 17, 2012 3:06 am**

v-zero wrote: RAPM attempts to tackle this problem with ridge regression, but depending on the chosen Tikhonov matrix there is a certain level of bias introduced into the estimates, so that whilst out-of-sample prediction is improved, the inherent bias means that fair player comparison becomes tricky.

By choosing improved priors (e.g. built from box-score data) prediction accuracy can be improved and bias reduced, and hence xRAPM is an improvement over RAPM in terms of prediction. The bias still exists, but is changed and should be smaller.

I may be on the same page with you, or not, let me ask:

The ridge regression effectively downplays outlier data, so that if in the small amount of time a player is on the bench something ridiculous happens, we don't end up with a ridiculous rating for the player in question, correct? So if we have a Garnett in late Minny scenario, he'll get underrated by RAPM if the apparent "they can't really be THAT bad!" raw data relating to his teammates actually is a fair representative of them?

If this is what you mean, then I agree that that is a bias, but it's not the same type of bias box scores provide. Where I stand is that anywhere I see an injected bias in a particular stat, I really want a cousin stat without that bias. Similarly, wherever I see ugly imprecision in a particular stat, I really want a cousin stat with better precision. Hence I welcome xRAPM as a cousin to RAPM, but the notion that it would serve as a replacement is simply not reasonable.

Posted: **Sat Nov 17, 2012 3:13 am**

johannesdesilentio wrote:
v-zero wrote: RAPM attempts to tackle this problem with ridge regression, but depending on the chosen Tikhonov matrix there is a certain level of bias introduced into the estimates, so that whilst out-of-sample prediction is improved, the inherent bias means that fair player comparison becomes tricky.

By choosing improved priors (e.g. built from box-score data) prediction accuracy can be improved and bias reduced, and hence xRAPM is an improvement over RAPM in terms of prediction. The bias still exists, but is changed and should be smaller.
I may be on the same page with you, or not, let me ask:

The ridge regression effectively downplays outlier data, so that if in the small amount of time a player is on the bench something ridiculous happens, we don't end up with a ridiculous rating for the player in question, correct? So if we have a Garnett in late Minny scenario, he'll get underrated by RAPM if the apparent "they can't really be THAT bad!" raw data relating to his teammates actually is a fair representative of them?

If this is what you mean, then I agree that that is a bias, but it's not the same type of bias box scores provide. Where I stand is that anywhere I see an injected bias in a particular stat, I really want a cousin stat without that bias. Similarly, wherever I see ugly imprecision in a particular stat, I really want a cousin stat with better precision. Hence I welcome xRAPM as a cousin to RAPM, but the notion that it would serve as a replacement is simply not reasonable.

I agree with you completely. If you buy into RAPM (I don't, for my own reasons, but I recognise that many do and that it has predictive value), then having both should be the way forward; however, I would always trust a RAPM measure informed by a decent prior over pure RAPM for most purposes. Studying the differences between the two may well be revealing though, so having both available is a must.

Posted: **Sat Nov 17, 2012 12:15 pm**

KAN wrote:These weights, from what I can tell, do not say anything about the VALUE of a turnover or a missed field goal, and I do not believe you can extrapolate anything about coaching strategy either.

First of all, a regression analysis is used to quantify the relationship between a dependent and one or more independent variables. That's the whole purpose of that thing. If we are not getting a value for those turnovers or missed field goals out of this, the whole analysis is useless. Obviously the values are relative to each other; that's what bbstats wanted to say and he is right about that. But we should still be able to draw conclusions from this. What bbstats was missing is that I did not question whether something is positive or negative; my point was that the relation of the values to each other is not correct. That points to an error in the setup of the regression, not to a bad interpretation.

KAN wrote:Obviously a turnover is worse than a missed field goal; but, that is beside the point.

No, it is exactly the whole point of a regression. If, as bbstats implied, multicollinearity is a big issue, the values we derive are worthless, because we can't draw anything out of them. And Jerry wants to derive a player value out of those values. If the values are not correct, the player value is incorrect as well. Better predictive power is not proof that the values are correct; we may simply be seeing "right for the wrong reasons" here. That's what I meant by a similar trap to the one Berri ran into with his WP, where Berri does not realize where his "correct predictions" are coming from.

v-zero wrote:It's actually very easy to call into question whether a turnover should be considered as bad as a missed FGA for an individual player, whilst not for a team. Maybe often turnovers occur when there is little to no chance of a decent shot coming off, and the guy turning it over isn't close to the only reason it's being turned over, just the guy who touches it last. Now contrast that with the idea that often players take shots when passes for possibly superior shots are available, suddenly a missed FGA is worse than merely the loss of an average opportunity.

You are in essence describing the issue, but you are wrong about the reasons in reality. Most turnovers are not caused by the lack of a decent shot, but by bad passes, ball-handling mistakes, or simply because the defender was better at stealing the ball than the offensive player was at taking care of it. Also, shot-clock violations are not assigned to individual players, but counted as team turnovers anyway.
On the other end, about 25% of shots are late shots taken because no better shot was available. It is one of the fallacies of stats people that they think missed shots just have the opportunity cost of a better shot. The real opportunity costs have to include turnovers, because every shot attempt is in essence also a "missed turnover".
Obviously, the regression does not "know" this at all, because nobody told the algorithm that a shot was taken because a turnover would otherwise have been the consequence. And that is the real reason why turnovers are estimated as being less harmful here.

If your results are in disagreement with physical laws, it is likely that the setup was chosen badly, not that the physical laws are wrong. We have to deal with a biased sample and we have to take that into account; otherwise the results and the conclusions we draw from them are simply wrong.

Posted: **Sat Nov 17, 2012 1:19 pm**

mystic wrote:You are in essence describing the issue, but you are wrong about the reasons in reality. Most turnovers are not caused by the lack of a decent shot, but by bad passes, ball-handling mistakes, or simply because the defender was better at stealing the ball than the offensive player was at taking care of it. Also, shot-clock violations are not assigned to individual players, but counted as team turnovers anyway.
On the other end, about 25% of shots are late shots taken because no better shot was available. It is one of the fallacies of stats people that they think missed shots just have the opportunity cost of a better shot. The real opportunity costs have to include turnovers, because every shot attempt is in essence also a "missed turnover".
Obviously, the regression does not "know" this at all, because nobody told the algorithm that a shot was taken because a turnover would otherwise have been the consequence. And that is the real reason why turnovers are estimated as being less harmful here.

If your results are in disagreement with physical laws, it is likely that the setup was chosen badly, not that the physical laws are wrong. We have to deal with a biased sample and we have to take that into account; otherwise the results and the conclusions we draw from them are simply wrong.

Yes, it was a thought experiment; if you hadn't chosen to cut off the end of my post you would see that I do not actually claim that to be the case.

I don't make any assumptions about what the box-score data will tell me, I just try to give it a reasonable framework to work with and then let the numbers speak. I don't think all missed shots are the same, but the box-score does, and so you can build in usage and see if it improves the predictive ability of your ratings, but that's not proof it does. I don't use standard regression, so I'm not suffering a retrodiction rather than prediction issue.

As for your bit on Physics, well, that's an odd thing to say, because the 'laws' of Physics are tested by experiment and changed when needed - we build models in Physics, we test models, we improve models, we make no assumptions but hope to make decent guesses as to a good framework.

Lastly, bias doesn't make your conclusions wrong, it only limits the certainty with which you can make conclusions, and the range of situations over which they are valid. Depending on the question you ask and the way you ask it you will suffer different biases, but always biases, with an incomplete dataset like the box-score.

Posted: **Sat Nov 17, 2012 9:48 pm**

v-zero wrote: As for your bit on Physics, well, that's an odd thing to say, because the 'laws' of Physics are tested by experiment and changed when needed - we build models in Physics, we test models, we improve models, we make no assumptions but hope to make decent guesses as to a good framework.

First of all, physical laws have a range of validity. In almost all cases they are not changed; rather, you leave the range in which the law was valid. In that case you need to add something or use a law which is valid there. For most problems Newton is completely fine, but if the velocity is big enough the error will be substantial and you need to use STR (special relativity). That doesn't make Newton wrong, just not valid for the circumstances. And yes, overall in physics we try to find the best approximation of reality.
But what I meant was rather students running experiments and using regression to determine the wanted values. In such cases I have seen people argue that the physical law might actually be "wrong", but in most (almost all) cases such results are much more an indication of a wrongly applied regression than a sign of a "different" physics.

I just wanted to clarify that point, nothing more.

Posted: **Sat Nov 17, 2012 10:26 pm**

mystic wrote:
v-zero wrote: As for your bit on Physics, well, that's an odd thing to say, because the 'laws' of Physics are tested by experiment and changed when needed - we build models in Physics, we test models, we improve models, we make no assumptions but hope to make decent guesses as to a good framework.
First of all, physical laws have a range of validity. In almost all cases they are not changed; rather, you leave the range in which the law was valid. In that case you need to add something or use a law which is valid there. For most problems Newton is completely fine, but if the velocity is big enough the error will be substantial and you need to use STR (special relativity). That doesn't make Newton wrong, just not valid for the circumstances. And yes, overall in physics we try to find the best approximation of reality.
But what I meant was rather students running experiments and using regression to determine the wanted values. In such cases I have seen people argue that the physical law might actually be "wrong", but in most (almost all) cases such results are much more an indication of a wrongly applied regression than a sign of a "different" physics.

I just wanted to clarify that point, nothing more.

Yes, of course, if you get an unintuitive result from experiment, from regression, you should check your methods and make sure you haven't done anything stupid. However, there are cases where that unintuitive result isn't wrong, but as you say implies the need for a new model under new circumstances, e.g. Massive particles in the Standard Model, or less recently the Ultraviolet Catastrophe...

Posted: **Mon Nov 19, 2012 2:09 pm**

RE: Mystic

Yes, I have tested it (TOs + REBs ~ +/-. )

Your correlation chart:
a) did not involve +/-
b) was not multi-variate.

But perhaps I should have been more specific. Turnovers & Offensive rebounds, versus Offensive RAPM (the 7-yr dataset) shows:

R^2 = 0.085
Intercept = -1.22 (p=0.002)
TOV per 100 = 0.51 (p=4.11*10^-5)
ORB per 100 = -0.22 (p=0.0002)

This doesn't mean that a player with a billion turnovers is better than one with zero. It just means that players who turn the ball over, ONLY COMPARED to players that grab rebounds, tend to help their team offensively (if only slightly, as shown by the low R^2).
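
To make the fitted line concrete, here is a small sketch that just plugs numbers into the coefficients quoted above. The two stat lines are made up for illustration and are not real players; the function name is likewise only a placeholder.

```python
# Coefficients exactly as reported above (offensive RAPM, 7-year dataset).
INTERCEPT = -1.22
COEF_TOV = 0.51    # per turnover per 100 possessions
COEF_ORB = -0.22   # per offensive rebound per 100 possessions
R_SQUARED = 0.085


def fitted_orapm(tov_per100: float, orb_per100: float) -> float:
    """Fitted offensive RAPM from the two-variable regression above."""
    return INTERCEPT + COEF_TOV * tov_per100 + COEF_ORB * orb_per100


# Hypothetical stat lines, for illustration only.
ball_handler = fitted_orapm(tov_per100=3.0, orb_per100=1.0)
rebounder = fitted_orapm(tov_per100=1.5, orb_per100=4.0)

# Share of the variance in offensive RAPM that these two stats leave unexplained.
unexplained_share = 1.0 - R_SQUARED

print(f"ball handler: {ball_handler:+.3f}")
print(f"rebounder:    {rebounder:+.3f}")
print(f"unexplained:  {unexplained_share:.1%}")
```

With an R^2 this low, more than 90% of the variance is left unexplained, so the gap between the two fitted values says very little about any individual player.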

Posted: **Mon Nov 19, 2012 3:42 pm**

bbstats wrote:RE: Mystic

Yes, I have tested it (TOs + REBs ~ +/-. )

No, you didn't. And if you have, why don't you show the proof that a higher turnover rate is actually positive in terms of OVERALL RAPM while TOTAL rebounding is negative?

bbstats wrote:It just means that players who turn the ball over, ONLY COMPARED to players that grab rebounds, tend to help their team offensively (if only slightly, as shown by the low R^2).

Which then again is rather your fault, since you have used an insufficient setup for your regression. As I said, if someone comes up with a result which obviously contradicts reality, it is hardly reality which is wrong. Maybe that's something you missed during the conversation?

Posted: **Tue Nov 20, 2012 12:14 pm**

If people that want the old RAPM back had said so when I started working with priors I'd have an easier time following their arguments. Or if they wanted the vanilla RAPM back, not RiRAPM.

The reason is this: Once you start working with priors you have to ask yourself what you want to use as a prior.
The answer is that you _always_ want to use whatever metric gave the best prediction results in earlier seasons as the prior for RAPM. That way we get the best starting point(s) for RAPM, which will lead to more accurate player ratings. Giving RAPM a worse set of priors would mean that RAPM would later have a tougher time finding the "true" rating of the player (not always, but in most cases)

johannesdesilentio wrote:I'd appreciate an explanation there. Are you talking about the regularization itself or something else? Which types of player situations are you talking about as getting an unfair advantage or disadvantage?

Vanilla RAPM starts out assuming everyone is a 0 and gets penalized for moving ratings away from 0 (0 is the prior). That's where the bias comes from. This has the side effect of not moving low-minute players far away from 0.
There are many problems that arise from that. One would be that high-minute players on very bad teams often get a lower rating than the low-minute players of those teams. That's because there's not enough data to convince RAPM to move the low-minute players' RAPM further down. Since someone has to "take the blame", the high-minute player will get a strong negative rating, even though there's a good chance that he's better than the low-minute player (or else he wouldn't get more minutes).
There are obviously more problems that come with it.
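
A toy sketch of the shrinkage effect, under strong simplifying assumptions: each player's stints are treated as if nobody else were on the court, so the ridge solution decouples into one closed form per player. The penalty value, the stint counts and the -3.0 prior are all made-up numbers for illustration, not values from the actual RAPM setup.

```python
def shrunk_rating(margins, prior=0.0, lam=50.0):
    """Ridge estimate for a single player with non-overlapping stints.

    Minimises sum((m - b)**2 for m in margins) + lam * (b - prior)**2,
    whose closed-form solution is prior + sum(m - prior) / (n + lam).
    """
    n = len(margins)
    residual_sum = sum(m - prior for m in margins)
    return prior + residual_sum / (n + lam)


# Both hypothetical players are "truly" -5 per 100 possessions, noise-free.
TRUE_LEVEL = -5.0
high_minute_stints = [TRUE_LEVEL] * 90
low_minute_stints = [TRUE_LEVEL] * 10

# Prior of 0 for everyone, as in vanilla RAPM.
vanilla_high = shrunk_rating(high_minute_stints)
vanilla_low = shrunk_rating(low_minute_stints)

# Same low-minute data, but starting from a hypothetical prior of -3.0.
informed_low = shrunk_rating(low_minute_stints, prior=-3.0)

print(f"vanilla, high minutes: {vanilla_high:.2f}")
print(f"vanilla, low minutes:  {vanilla_low:.2f}")
print(f"informed, low minutes: {informed_low:.2f}")
```

Even though the two players are identical by construction, the zero prior leaves the low-minute player looking much closer to average than the high-minute one. Real RAPM adds the collinearity between teammates on top of this, which the toy version leaves out.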

Posted: **Tue Nov 20, 2012 12:43 pm**

J.E. wrote: Vanilla RAPM starts out assuming everyone is a 0 and gets penalized for moving ratings away from 0 (0 is the prior). That's where the bias comes from. This has the side effect of not moving low-minute players far away from 0.

Have you ever tried to use a fixed value for the low minute players at the replacement level? Or just start everyone with the replacement level and see what comes out? In that way the low minute players would be assumed to be bad and not average. That might solve a couple of issues.

I would also prefer it if you generated a RAPM version which is not biased by the boxscore, especially when you have a height bias included. Just look at your defensive values: Al Jefferson is now considered a positive on defense for 2012, a player who is clearly not a good defender at all. You are obviously taking away defensive value from smaller players and assigning it to bigger players who don't warrant it.

To make another thing clear: I would also really like to see the RAPM results produced with the best predictor. But that is something I would like to see in addition to the previous RAPM version.

Posted: **Tue Nov 20, 2012 5:28 pm**

J.E. wrote:If people that want the old RAPM back had said so when I started working with priors I'd have an easier time following their arguments. Or if they wanted the vanilla RAPM back, not RiRAPM.

The reason is this: Once you start working with priors you have to ask yourself what you want to use as a prior.
The answer is that you _always_ want to use whatever metric gave the best prediction results in earlier seasons as the prior for RAPM. That way we get the best starting point(s) for RAPM, which will lead to more accurate player ratings. Giving RAPM a worse set of priors would mean that RAPM would later have a tougher time finding the "true" rating of the player (not always, but in most cases)

I'm not sure how much you're talking to me. Might be a lot, might be not at all since I've ceased being a regular on here. Clearly though me not being a regular right now is my issue here. I like using your site regularly, but since I'm not really dabbling with creating stats at this time (in part because folks like yourself can do it better than me) I don't find reason to come over to APBRmetrics all that much.

J.E. wrote:
johannesdesilentio wrote:I'd appreciate an explanation there. Are you talking about the regularization itself or something else? Which types of player situations are you talking about as getting an unfair advantage or disadvantage?
Vanilla RAPM starts out assuming everyone is a 0 and gets penalized for moving ratings away from 0 (0 is the prior). That's where the bias comes from. This has the side effect of not moving low-minute players far away from 0.
There are many problems that arise from that. One would be that high-minute players on very bad teams often get a lower rating than the low-minute players of those teams. That's because there's not enough data to convince RAPM to move the low-minute players' RAPM further down. Since someone has to "take the blame", the high-minute player will get a strong negative rating, even though there's a good chance that he's better than the low-minute player (or else he wouldn't get more minutes).
There are obviously more problems that come with it.

Thank you, and no doubt.

I just chafe so much at this change because I already have ways to deal with these inadequacies in my analysis process, but when you add this new box score bias in - while removing the previous version - you make the thing become so much of a black box that I don't know what I'll do with it.

Again though: I'm not against your new stat or the direction you're most interested in. I can see value in it, but really no matter how good you get it, I'll never use it exclusively. I need a cocktail of different metrics, so my selfish request is that you try not to remove your previous statistical insights from the public forum.

Posted: **Tue Nov 20, 2012 9:12 pm**

Long time lurker making first post. I thought this was a good time to do so, as I'm sort of the captain of a group that has done some work that is pertinent to this subject.

In our testing, the optimal predictive metric consisted of a regularized plus/minus model using a proprietary SPM model as an informed prior (mean-regressed, and with an aging/experience curve applied). For players with under 400 MP in the previous season, we use a constant (pretty negative) as a prior. For rookies in-season, the "prior" is also rather negative, but depends on MP (i.e. we let the prior "roll" over the course of the season based on MP and the historical performance of rookies in that MP "bin").

Our purpose (wagering on the NBA) is different from the purpose of most users on this forum, but a player evaluation metric emerged as somewhat of a consequence of our work, and it seems like what we found is quite similar to what is being discussed in this thread. I've avoided posting in the past as I don't want to discuss in depth any of the modeling work that we've done and/or are doing, but in this particular case the discussion is far removed from anything that we do on the wagering front (which is all at the lineup/matchup level, as individual player values are not very valuable for wagering purposes).

We've attempted various choices of priors for models such as this, including weighted past x season SPM, MPG, PER, and a few others. IMO, unless you start to explore really esoteric things like including data from CAC, there's probably not much more that can be done in terms of improving the "one number" plus/minus-style performance metrics, other than morphing over to the use of a SPM model (with aging curve) as a prior for RAPM.
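
Purely as an illustration of the prior-selection rule described above, and not the actual model: only the 400 MP cutoff comes from the post. The constant prior, the regression weight and the linear aging curve are placeholder assumptions, and the rolling in-season rookie prior is left out entirely.

```python
LOW_MINUTE_CUTOFF = 400    # from the post: under 400 MP in the previous season
LOW_MINUTE_PRIOR = -2.5    # placeholder; the post only says "pretty negative"
REGRESSION_WEIGHT = 0.8    # placeholder share of last season's SPM that is kept
LEAGUE_MEAN = 0.0          # placeholder target for mean regression
PEAK_AGE = 27              # placeholder peak for the toy aging curve


def aging_adjustment(age: int) -> float:
    """Toy linear aging curve: positive before PEAK_AGE, negative after."""
    return 0.1 * (PEAK_AGE - age)


def rapm_prior(prev_minutes: int, prev_spm: float, age: int) -> float:
    """Pick the prior that the regularized plus/minus model shrinks toward."""
    if prev_minutes < LOW_MINUTE_CUTOFF:
        # Too little data last season: fall back to a constant negative prior.
        return LOW_MINUTE_PRIOR
    # Regress last season's SPM toward the league mean, then apply aging.
    regressed = LEAGUE_MEAN + REGRESSION_WEIGHT * (prev_spm - LEAGUE_MEAN)
    return regressed + aging_adjustment(age)


# Hypothetical players, for illustration only.
bench_prior = rapm_prior(prev_minutes=300, prev_spm=4.0, age=25)
starter_prior = rapm_prior(prev_minutes=2000, prev_spm=4.0, age=25)
print(bench_prior, round(starter_prior, 2))
```

The resulting value would then take the place of 0 as the point that the regularization pulls each player's rating toward.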
