Crow wrote: Wins Produced also has regression based weighting of the team impact of individual boxscore performance. So does PER, based on Hollinger's report of using such regression based weighting.
Is this true for PER? The copy of his book that I have makes no mention at all of regression in the background for calculating PER.
Questions and Comments on ASPM
Re: More with RAPM
Re: Questions and Comments on ASPM
I recollect that he said it, probably here. I could be wrong but that is my memory. He could say, if he is still stopping by and reading here occasionally.
Re: Questions and Comments on ASPM
Daniel's split of this material from my "More with RAPM" thread was probably a good thing. I doubt I'll mess with doing this, but I was curious: was it a fairly simple one-step operation, did it require special coding or access privileges, or did you do it manually one post at a time?
Re: Questions and Comments on ASPM
Re: % of points scored:
Monta Ellis...Stoudemire...DeMar DeRozan...Josh Smith...Luis Scola...Rudy Gay...
No.
Re: Questions and Comments on ASPM
% of points scored is one factor.
Efficiency of points scored is another.
Usage% is % of possessions used -- mostly scoring attempts, for most players -- but with turnovers thrown in, whether or not they're from attempts to score.
You can subtract/extract TO% from Usg%, and multiply by TS%, to recreate %Pts.
Points = attempts X success%
Or you can bypass the conflation of shots+turnovers and go straight to Points.
After multiplying Pts/36 by 100/OppPPG, TS%/.525, and some other factors, this is my guess at the "best" scorers: What they'd score in 36 minutes for an average team. Not updated since the All-Star break.
Per-36 scoring rates
Sco Player Tm Eff% Sco Player Tm Eff%
31.4 James,Lebron Mia .611 22.3 Howard,Dwight Orl .534
31.2 Bryant,Kobe LAL .518 22.3 Ginobili,Manu SAS .694
29.2 Durant,Kevin Okl .598 22.1 Pierce,Paul Bos .534
28.9 Wade,Dwyane Mia .556 22.0 Granger,Danny Ind .501
27.1 Rose,Derrick Chi .548 21.6 Paul,Chris LAC .580
26.8 Westbrook,Russel Okl .538 21.2 Bosh,Chris Mia .544
26.8 Bargnani,Andrea Tor .566 21.1 Stoudemire,Amare NYK .498
25.8 Aldridge,Lamarcu Por .544 20.7 Jamison,Antawn Cle .481
25.6 Nowitzki,Dirk Dal .536 20.7 Johnson,Joe Atl .514
25.6 Anthony,Carmelo NYK .492 20.5 Garnett,Kevin Bos .550
24.8 Love,Kevin Min .556 20.5 Young,Nick Was .506
24.3 Lin,Jeremy NYK .566 20.4 Cousins,Demarcus Sac .486
23.4 Griffin,Blake LAC .540 20.4 Monroe,Greg Det .549
23.3 Irving,Kyrie Cle .561 20.4 Gay,Rudy Mem .501
23.1 Boozer,Carlos Chi .553 20.3 Jennings,Brandon Mil .498
22.9 Parker,Tony SAS .512 20.2 Millsap,Paul Uta .543
22.7 Williams,Deron NJN .531 20.2 Scola,Luis Hou .512
22.7 Jefferson,Al Uta .505 20.2 Bynum,Andrew LAL .565
22.6 Anderson,Ryan Orl .595 20.1 Billups,Chauncey LAC .543
22.5 Williams,Lou Phi .514 20.1 Gallinari,Danilo Den .583
22.4 Ellis,Monta GSW .514 20.1 Duncan,Tim SAS .506
22.4 Martin,Kevin Hou .557 20.0 Curry,Stephen GSW .593
No sign of DeRozan or Josh Smith. They're all "scorers", though.
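For anyone who wants to play with the adjustment described above, here is a minimal sketch of just the two stated factors (100/OppPPG and TS%/.525). The "some other factors" are not spelled out in the post, so they are left out, and the function name and sample inputs are made up for illustration; this will not reproduce the table.

```python
def adjusted_scoring_rate(pts_per36, opp_ppg, ts_pct, baseline_ts=0.525):
    """Scale Pts/36 by 100/OppPPG and by TS%/baseline_ts.

    Only the two factors named in the post are applied; any further
    adjustments are unknown and omitted here.
    """
    return pts_per36 * (100.0 / opp_ppg) * (ts_pct / baseline_ts)


# Hypothetical inputs, not taken from the table above.
neutral = adjusted_scoring_rate(20.0, opp_ppg=100.0, ts_pct=0.525)
tough_env = adjusted_scoring_rate(20.0, opp_ppg=90.0, ts_pct=0.525)
efficient = adjusted_scoring_rate(20.0, opp_ppg=100.0, ts_pct=0.600)

print(round(neutral, 1), round(tough_env, 1), round(efficient, 1))
```

A scorer in a 100-point environment at .525 TS% is left unchanged, while the same raw rate against stingier opposition, or at higher efficiency, gets bumped up.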
Re: Questions and Comments on ASPM
With context, I agree it can be beneficial...but you said it was the most important stat.
And why guess? Why not regress against adjusted +/- ?

Re: Questions and Comments on ASPM
I wrote: How can the most important team statistic not be one of the most important individual statistics?
Regressions will get you an average value, as if all contexts (team situations) were the same. They aren't the same: players start or come off the bench; they (and their teammates) have higher or lower TS%; their teams give up more and fewer points; etc.
We cannot know what Monta Ellis would do if you swapped him for Mario Chalmers, or Mo Williams, or Jason Terry. Regression won't reveal that.
Re: Questions and Comments on ASPM
Mike G wrote: Regression won't reveal that
Challenge accepted!

The biggest gap in offensive APM is its inability to understand usage, and that certain players are point "creators" (Collison) and some are "finishers" (Durant).
Re: Questions and Comments on ASPM
Mike is right: OLS regression will yield an average value. But who said we immediately should run to OLS regression? I'm not singling out Nathan here, just wondering why OLS is privileged so much when we think about modeling in a "causal" framework.
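A toy illustration of the "average value" point, under made-up assumptions: two hypothetical team contexts in which the same input is worth 1 and 3 units of output respectively, with identical input values in each. A single pooled OLS fit lands on the slope in between and says nothing about either context on its own.

```python
def ols_slope(xs, ys):
    """Slope of a one-predictor least-squares fit (with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up data: same x values observed in two different contexts.
x = [0.0, 1.0, 2.0, 3.0]
context_a = [1.0 * v for v in x]  # each unit of x is worth 1 here
context_b = [3.0 * v for v in x]  # each unit of x is worth 3 here

slope_a = ols_slope(x, context_a)
slope_b = ols_slope(x, context_b)
pooled_slope = ols_slope(x + x, context_a + context_b)

print(slope_a, slope_b, pooled_slope)
```

The pooled slope is the plain average only because the two contexts share the same x values here; with unbalanced data the weighting would differ.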
Re: Questions and Comments on ASPM
I agree whole-heartedly with the premise that the "average value" can be quite weak. For example, I've been working on a framework that looks only at traded players to find which statistics carry over best from team to team (based on Neil Paine's work on using advanced metrics to predict teams with a lot of trades etc...Alternate Win Score was king there).
We all know that, empirically:
Stoudemire without Nash = Not as good!
LeBron without Wade + Bosh = More production
And the "True Score" of a player is certainly regressed by any good measure (I suppose it's a Bayesian side-effect of rAPM).
I think we all understand that there are multiple interactions going on in basketball, but to say that "regression won't reveal that" feels a bit short-sighted. How not? I think Eli Witus and Jerry E have used regression very well in explaining diminishing returns, for example.
OLS is king for a number of pretty obvious reasons. The most important being that it is easy.
Fitting up OLS for per-possession statistics against rAPM is a fun gig because of
-the low noise (thousands and thousands and thousands of possessions)
-trend identification (true value of usage compared to efficiency, for example)
-much lower computational effort: predicting team ORTG with individual ORTG on the lineup level, for example (like Eli Witus has done), takes a lot of time and effort for us non-programmers, requires a lot of CPU (wish I had a quad-core...), and is not easily tweaked. I can make 20 different iterations of an OLS in 30 minutes and come up with something fun and predictive.
Although I suppose lineup prediction is an "OLS" as well...
I typically try to find non-linear interactions in data, and Daniel used to include non-linear terms in his ASPM, but he determined that it overfit.
What would you like to see more, rather than OLS?
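To make the "quick OLS iteration" idea concrete, here is a rough sketch of fitting two per-possession rates against a target rating. Everything in it is an assumption for illustration: the numbers are invented, the target is a noiseless stand-in for rAPM rather than real data, and the helper name is hypothetical. It solves the two-predictor normal equations directly so it needs no libraries.

```python
def ols_two_predictors(x1, x2, y):
    """Fit y = b0 + b1*x1 + b2*x2 by ordinary least squares."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12  # zero if the predictors are collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Invented usage and TS% values for six hypothetical players.
usg = [0.15, 0.20, 0.25, 0.30, 0.18, 0.22]
ts = [0.50, 0.55, 0.52, 0.58, 0.60, 0.48]
# Invented target built from known weights, standing in for rAPM.
target = [-3.0 + 10.0 * u + 4.0 * t for u, t in zip(usg, ts)]

b0, b1, b2 = ols_two_predictors(usg, ts, target)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

Because the target is constructed without noise, the fit simply recovers the weights it was built from; real rAPM targets would leave residuals, and swapping predictors in and out is the cheap part.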
Re: More with RAPM
DSMok1 wrote:
Mike G wrote: Why not % of points scored? ... scoring 20 in 90-85 games is surely bigger than scoring 20 in 115-110 games.
The scoring component is the most complex part of the regression. I'm using TS% and USG%, which are both per possession, which is better than % of points scored.
I guess this is the scoring component:
e*USG%*[TS%*2*(1-TO%) - f*TO% - g + h*AST% + i*USG%]
Usage includes both shots and turnovers, and it seems the turnovers are stripped out here [TS%*2*(1-TO%)], and then a different coefficient [-f] is applied to TO.
Then TS%*2*Shot% is actually % of a team's points scored, I believe.
Re: Questions and Comments on ASPM
Assuming the numbers come straight from bball-reference, and those numbers follow the equations in the glossary http://www.basketball-reference.com/about/glossary.html, the equations don't turn out quite that nicely. For ease of writing, let's say turnovers=TOV, points=PTS, FGA+.44*FTA=STS (for shots), and FGA+.44*FTA+TOV=POS. POS can also be at the team level for usage, TmPOS. We can also throw in minutes=Min and team minutes=TmMin.
The first term is e*USG*TS%*2*(1-TO%) = e*100*(POS/TmPOS)*(TmMin/Min)*PTS/(2*STS)*2*(1-TOV/POS). 1-TOV/POS should be the same as STS/POS, so the 2, POS, and STS cancel and you have e*100*(TmMin/Min)*PTS/TmPOS. That could be something like % of team points scored, but it isn't how I would calculate it since team points don't appear anywhere. Closer to player points per team possession?
The second term is e*f*USG*TO% = e*f*100*(POS/TmPOS)*(TmMin/Min)*TOV/POS = e*f*100*(TmMin/Min)*TOV/TmPOS. Again, maybe something like player turnovers per team possession.
The third term is just weighted usage and the fifth weighted usage squared, and the fourth term is some manner of usage-assist interaction that doesn't break down because assist% isn't calculated per-possession.
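The cancellations above are easy to check numerically. This is only a sanity check with an invented stat line, using the shorthand from this post (STS, POS, TmPOS, TmMin); as I read the glossary, the usage formula divides team minutes by five, so tm_min below is assumed to already be that quantity. The coefficients e and f are left out since they multiply both sides.

```python
# Invented season totals for one hypothetical player and team.
pts, fga, fta, tov = 1500.0, 1200.0, 400.0, 200.0
minutes = 2500.0
tm_min = 3960.0   # assumed: team-minutes term as used in the usage formula
tm_pos = 7800.0   # assumed: team FGA + 0.44*FTA + TOV

sts = fga + 0.44 * fta          # "shots"
pos = sts + tov                 # possessions used
usg = 100.0 * (pos / tm_pos) * (tm_min / minutes)
ts_pct = pts / (2.0 * sts)
to_pct = tov / pos              # as a fraction, matching 1 - TOV/POS

# First term: USG * TS% * 2 * (1 - TO%)
first_full = usg * ts_pct * 2.0 * (1.0 - to_pct)
first_simplified = 100.0 * (tm_min / minutes) * pts / tm_pos

# Second term: USG * TO%
second_full = usg * to_pct
second_simplified = 100.0 * (tm_min / minutes) * tov / tm_pos

print(round(first_full, 4), round(first_simplified, 4))
print(round(second_full, 4), round(second_simplified, 4))
```

Both pairs agree, so the first term really is player points per team possession (scaled by minutes share and 100), and the second is the same thing for turnovers.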
Re: More with RAPM
DSMok1 wrote: ... the value of points scored depends on how many possessions were used to generate them, and the defense on the other end is not related to the offense...
A couple of problems with this one phrase:
1) It's surely not a trivial interaction that the more energy a player expends at one end of the floor, the less he has on the other end. If you've played, you know this. It isn't even arguable.
2) Points per possession is not absolute in its value. In a defensive playoff series, for example, .480 shooting may be good enough to advance; in the next series, vs a high-powered offense, .580 may not be good enough.
Taking the TO out of Usg%, and then multiplying the 'scoring component' by TS%, you get a rather skewed version of "% of points scored". More straightforward, and surely more correlated to winning, would be to not conflate shots+turnovers to begin with.
I've never understood why "Scoring Rate" cannot be an "Advanced Stat". Is it just the old "scoring is overrated" cliché?
Re: Questions and Comments on ASPM
"Scoring Rate" can be considered an "Advanced Stat" because it is complex and ambitious. But like other advanced stats it may mis-assign value between players, in this case for the defensive component. Sometimes modestly, sometimes greatly.
Re: Questions and Comments on ASPM
Points by both teams may be defense, or it may be pace.
Pts/Poss * Poss/G = Pts/G
With or without a team/opp Pts/G adjustment, players on a given team would rank the same.
With the adjustment, players from high-paced (or weak D) teams wouldn't be overrated relative to those on low-paced (or strong D) teams.
Just as a team that averages 110 points is not favored to beat a team that averages 100 -- it depends just as much on their points allowed -- the players on the first team are not expected to outscore their counterparts on the 2nd team.
That also depends on the scoring they allow.
If in midseason your coach is fired, and he was laissez-faire about defense and also liked to run;
and the replacement is the opposite type, stressing D and more regulated offense;
you can expect fewer points from everyone after the coaching change.
But relative to the changes in scoring environment, you can expect everyone to be about the same in scoring. And everyone is the same player, so that sounds about right.
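A small sketch of the scoring-environment idea, with loud caveats: the environment here is taken as the average of team and opponent Pts/G against a 100-point baseline, which is my assumption for illustration and not necessarily the exact adjustment used earlier in the thread. Team and player numbers are invented.

```python
def environment_adjusted(player_ppg, team_ppg, opp_ppg, baseline_ppg=100.0):
    """Rescale a player's points to a common scoring environment.

    Assumption: environment = mean of team and opponent points per game.
    """
    environment = (team_ppg + opp_ppg) / 2.0
    return player_ppg * baseline_ppg / environment


# Hypothetical run-and-gun team (110 scored, 108 allowed).
fast_team = {"A": 22.0, "B": 15.0}
# Hypothetical grind-it-out team (92 scored, 90 allowed).
slow_team = {"C": 20.0, "D": 14.0}

adj_fast = {p: environment_adjusted(v, 110.0, 108.0) for p, v in fast_team.items()}
adj_slow = {p: environment_adjusted(v, 92.0, 90.0) for p, v in slow_team.items()}

print({p: round(v, 1) for p, v in adj_fast.items()})
print({p: round(v, 1) for p, v in adj_slow.items()})
```

Within each team the order is untouched, since every teammate is scaled by the same factor; across teams, player C's 20 in the low-scoring environment comes out ahead of player A's 22 in the high-scoring one.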