Questions and Comments on ASPM

xkonk · Post by **xkonk** » Mon Feb 27, 2012 3:08 pm

Crow wrote: Wins Produced also has regression-based weighting of the team impact of individual boxscore performance. So does PER, based on Hollinger's report of using such regression-based weighting.

Is this true for PER? The copy of his book that I have makes no mention at all of regression in the background for calculating PER.

Crow · Post by **Crow** » Tue Feb 28, 2012 1:41 am

I recollect that he said it, probably here. I could be wrong but that is my memory. He could say, if he is still stopping by and reading here occasionally.

Crow · Post by **Crow** » Wed Feb 29, 2012 12:53 am

Daniel's split of this material from my "More with RAPM" thread was probably a good thing. I doubt I'll mess with doing this myself, but I was curious: was it a fairly simple one-step operation, did it require special coding or access privileges, or did you do it manually, one post at a time?

bbstats · Post by **bbstats** » Thu Mar 01, 2012 7:15 pm

Re: % of points scored:

Monta Ellis...Stoudemire...DeMar DeRozan...Josh Smith...Luis Scola...Rudy Gay...

No.

Mike G · Post by **Mike G** » Thu Mar 01, 2012 8:27 pm

% of points scored is one factor.
Efficiency of points scored is another.

Usage% is % of possessions used -- mostly scoring attempts, for most players -- but with turnovers thrown in, whether or not they're from attempts to score.

You can subtract/extract TO% from Usg%, and multiply by TS%, to recreate %Pts.
Points = attempts X success%
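To make that arithmetic concrete, here is a minimal sketch in Python. The function name and the sample inputs are made up for illustration, and it assumes TO% is expressed as a fraction of the player's own possessions and TS% as PTS / (2 * attempts), which is why a factor of 2 appears.

```python
def points_per_100_team_poss(usg_pct, to_pct, ts_pct):
    """Usage with turnovers stripped out, times scoring efficiency.

    usg_pct: percent of team possessions the player uses (0-100)
    to_pct:  fraction of the player's own possessions ending in a turnover (0-1)
    ts_pct:  true shooting percentage as a fraction (0-1)
    """
    shot_share = usg_pct * (1.0 - to_pct)  # scoring attempts only
    return shot_share * 2.0 * ts_pct       # attempts x points per attempt


# Hypothetical player: 28% usage, 12% turnover rate, .550 TS%
print(round(points_per_100_team_poss(28.0, 0.12, 0.55), 1))
```

The result is on a "points per 100 team possessions while on the floor" scale rather than a literal share of team points, so it only becomes a percentage after dividing by the team's own points per 100 possessions.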

Or you can bypass the conflation of shots+turnovers and go straight to Points.
After multiplying Pts/36 by 100/OppPPG, TS%/.525, and some other factors, this is my guess at the "best" scorers: what they'd score in 36 minutes for an average team.

Sco    Player          Tm    Eff%      Sco    Player          Tm    Eff%
31.4   James,Lebron    Mia   .611      22.3   Howard,Dwight   Orl   .534
31.2   Bryant,Kobe     LAL   .518      22.3   Ginobili,Manu   SAS   .694
29.2   Durant,Kevin    Okl   .598      22.1   Pierce,Paul     Bos   .534
28.9   Wade,Dwyane     Mia   .556      22.0   Granger,Danny   Ind   .501
27.1   Rose,Derrick    Chi   .548      21.6   Paul,Chris      LAC   .580
26.8  Westbrook,Russel Okl   .538      21.2   Bosh,Chris      Mia   .544
26.8   Bargnani,Andrea Tor   .566      21.1  Stoudemire,Amare NYK   .498
25.8  Aldridge,Lamarcu Por   .544      20.7   Jamison,Antawn  Cle   .481
25.6   Nowitzki,Dirk   Dal   .536      20.7   Johnson,Joe     Atl   .514
25.6   Anthony,Carmelo NYK   .492      20.5   Garnett,Kevin   Bos   .550
24.8   Love,Kevin      Min   .556      20.5   Young,Nick      Was   .506
24.3   Lin,Jeremy      NYK   .566      20.4  Cousins,Demarcus Sac   .486
23.4   Griffin,Blake   LAC   .540      20.4   Monroe,Greg     Det   .549
23.3   Irving,Kyrie    Cle   .561      20.4   Gay,Rudy        Mem   .501
23.1   Boozer,Carlos   Chi   .553      20.3  Jennings,Brandon Mil   .498
22.9   Parker,Tony     SAS   .512      20.2   Millsap,Paul    Uta   .543
22.7   Williams,Deron  NJN   .531      20.2   Scola,Luis      Hou   .512
22.7   Jefferson,Al    Uta   .505      20.2   Bynum,Andrew    LAL   .565
22.6   Anderson,Ryan   Orl   .595      20.1  Billups,Chauncey LAC   .543
22.5   Williams,Lou    Phi   .514      20.1  Gallinari,Danilo Den   .583
22.4   Ellis,Monta     GSW   .514      20.1   Duncan,Tim      SAS   .506
22.4   Martin,Kevin    Hou   .557      20.0   Curry,Stephen   GSW   .593

Not updated since the All-Star break.
No sign of DeRozan or Josh Smith. They're all "scorers", though.
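For anyone who wants to reproduce the general shape of the Sco column, here is a rough sketch using only the two adjustments named above (100/OppPPG and TS%/.525). The "other factors" are not specified in the post, so they are left out; the function name and the sample inputs are illustrative, not real player lines, and the output will not match the table.

```python
LEAGUE_TS = 0.525  # efficiency baseline quoted in the post


def adjusted_scoring_rate(pts_per36, opp_ppg, ts_pct, league_ts=LEAGUE_TS):
    """Per-36 scoring scaled for scoring environment and efficiency.

    opp_ppg: points per game allowed by the player's team
    ts_pct:  player's true shooting percentage as a fraction
    """
    environment = 100.0 / opp_ppg    # normalize to a 100-point game
    efficiency = ts_pct / league_ts  # reward above-baseline efficiency
    return pts_per36 * environment * efficiency


# Hypothetical: 22.0 pts/36 on a team allowing 95 ppg, shooting .560 TS%
print(round(adjusted_scoring_rate(22.0, 95.0, 0.560), 1))
```

A player at exactly the baseline (100 points allowed, .525 TS%) comes out unchanged, which is a handy sanity check on the scaling.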

bbstats · Post by **bbstats** » Thu Mar 01, 2012 8:29 pm

With context, I agree it can be beneficial...but you said it was the most important stat.

And why guess? Why not regress against adjusted +/- ?

Mike G · Post by **Mike G** » Thu Mar 01, 2012 8:47 pm

I wrote:

How can the most important team statistic not be one of the most important individual statistics?

Regressions will get you an average value, as if all contexts (team situations) were the same. They aren't the same: players start or come off the bench; they (and their teammates) have higher or lower TS%; their teams give up more or fewer points; etc.

We cannot know what Monta Ellis would do if you swapped him for Mario Chalmers, or Mo Williams, or Jason Terry. Regression won't reveal that.

bbstats · Post by **bbstats** » Wed Mar 07, 2012 3:21 pm

Mike G wrote: Regression won't reveal that

Challenge accepted!

The biggest gap in offensive APM is its inability to understand usage, and that certain players are point "creators" (Collison) and some are "finishers" (Durant).

Chilltown · Post by **Chilltown** » Thu Mar 08, 2012 4:36 am

Mike is right: OLS regression will yield an average value. But who said we immediately should run to OLS regression? I'm not singling out Nathan here, just wondering why OLS is privileged so much when we think about modeling in a "causal" framework.

bbstats · Post by **bbstats** » Tue Mar 27, 2012 3:40 pm

I agree wholeheartedly with the premise that the "average value" can be quite weak. For example, I've been working on a framework that looks only at traded players to find which statistics carry over best from team to team (based on Neil Paine's work on using advanced metrics to predict teams with a lot of trades etc...Alternate Win Score was king there).

We all know that, empirically:
Stoudemire without Nash = Not as good!
LeBron without Wade + Bosh = More production

And the "True Score" of a player is certainly regressed by any good measure (I suppose it's a Bayesian side-effect of rAPM).
I think we all understand that there are multiple interactions going on in basketball, but to say that "regression won't reveal that" feels a bit short-sighted. How not? I think Eli Witus and Jerry E have used regression very well in explaining diminishing returns, for example.
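As a toy illustration of what "regressed" means here, consider the simplest possible ridge setup: a single player rated from a list of observed margins, with a penalty pulling the rating toward zero. This is a one-player simplification, not how rAPM is actually fit (that involves every player on the floor at once), and the penalty value and data are invented.

```python
def shrunk_rating(observed_margins, penalty):
    """Ridge estimate for one player: minimizes sum((y - b)^2) + penalty * b^2.

    The closed form is sum(y) / (n + penalty), so small samples are pulled
    toward zero much harder than large ones.
    """
    n = len(observed_margins)
    return sum(observed_margins) / (n + penalty)


small_sample = [5.0] * 10    # ten stints averaging +5
large_sample = [5.0] * 1000  # a thousand stints averaging +5

print(shrunk_rating(small_sample, penalty=50.0))
print(shrunk_rating(large_sample, penalty=50.0))
```

Both players have the same raw average, but the one with ten stints ends up rated well under +1 while the one with a thousand stays close to +5.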

OLS is king for a number of pretty obvious reasons, the most important being that it is easy.

Fitting OLS for per-possession statistics against rAPM is a fun gig because of
- the low noise (thousands and thousands and thousands of possessions)
- trend identification (true value of usage compared to efficiency, for example)
- much lower computational effort: predicting team ORTG from individual ORTG at the lineup level (as Eli Witus has done), by contrast, takes a lot of time and effort for us non-programmers, requires a lot of CPU (wish I had a quad-core...), and is not easily tweaked. I can make 20 different iterations of an OLS in 30 minutes and come up with something fun and predictive.
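For readers who haven't tried it, this kind of fit really is small. Below is a minimal sketch of OLS via the normal equations in plain Python, regressing a rating on TS% and USG%. The six data rows and the coefficients used to generate the targets are invented purely so the example can check itself; nothing here comes from real rAPM data.

```python
def ols_fit(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(k)]
    # Solve (X'X) b = X'y by Gauss-Jordan elimination with partial pivoting
    aug = [row + [rhs] for row, rhs in zip(xtx, xty)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Made-up (TS%, USG%) pairs; targets follow an invented linear rule
players = [(0.50, 18), (0.55, 25), (0.60, 30), (0.52, 22), (0.58, 15), (0.48, 28)]
ratings = [-12.0 + 15.0 * ts + 0.2 * usg for ts, usg in players]

intercept, ts_coef, usg_coef = ols_fit(players, ratings)
print(round(intercept, 3), round(ts_coef, 3), round(usg_coef, 3))
```

Swapping predictors in and out is just a matter of changing the tuples in `players`, which is the "20 iterations in 30 minutes" appeal.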

Although I suppose lineup prediction is an "OLS" as well...
I typically try to find non-linear interactions in data, and Daniel used to include non-linear terms in his ASPM, but he determined that it overfit.

What would you like to see more of, instead of OLS?

Mike G · Post by **Mike G** » Tue Mar 27, 2012 6:52 pm

DSMok1 wrote:
Mike G wrote: Why not % of points scored?
... scoring 20 in 90-85 games is surely bigger than scoring 20 in 115-110 games.
The scoring component is the most complex part of the regression. I'm using TS% and USG%, which are both per possession, which is better than % of points scored.

I guess this is the scoring component:

e*USG%*[TS%*2*(1-TO%) – f*TO% – g + h*AST% + i*USG%]

Usage includes both shots and turnovers, and it seems the turnovers are stripped out here [TS%*2*(1-TO%)], and then a different coefficient [-f] is applied to TO.

Then TS%*2*Shot%, with Shot% meaning USG%*(1-TO%), is actually % of a team's points scored, I believe.
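Written out as code, the guessed scoring component looks like the sketch below. The coefficients e, f, g, h and i are not given anywhere in this thread, so they are plain parameters here, and the values in the example call are placeholders chosen only to isolate the first term.

```python
def scoring_component(usg, ts, to, ast, e, f, g, h, i):
    """e*USG%*[TS%*2*(1-TO%) - f*TO% - g + h*AST% + i*USG%].

    usg, ast: percentages as reported (scale is whatever the regression used)
    ts, to:   treated here as fractions (an assumption)
    e..i:     regression coefficients, unknown in this thread
    """
    shooting = ts * 2.0 * (1.0 - to)  # points per possession used, turnovers removed
    return e * usg * (shooting - f * to - g + h * ast + i * usg)


# Placeholder coefficients: with f=g=h=i=0 and e=1 only USG%*TS%*2*(1-TO%) remains
print(scoring_component(usg=25.0, ts=0.55, to=0.10, ast=15.0,
                        e=1.0, f=0.0, g=0.0, h=0.0, i=0.0))
```

Because USG% multiplies the whole bracket, the i*USG% piece ends up as a usage-squared term and the h*AST% piece as a usage-assist interaction.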

xkonk · Post by **xkonk** » Wed Mar 28, 2012 12:48 am

Assuming the numbers come straight from bball-reference, and those numbers follow the equations in the glossary http://www.basketball-reference.com/about/glossary.html, the equations don't turn out quite that nicely. For ease of writing, let's say turnovers=TOV, points=PTS, FGA+.44*FTA=STS (for shots), and FGA+.44*FTA+TOV=POS. POS can also be at the team level for usage, TmPOS. We can also throw in minutes=Min and team minutes=TmMin.

The first term is e*USG*TS%*2*(1-TO%) = e*100*(POS/TmPOS)*(TmMin/Min)*PTS/(2*STS)*2*(1-TOV/POS). 1-TOV/POS should be the same as STS/POS, so the 2, POS, and STS cancel and you have e*100*(TmMin/Min)*PTS/TmPOS. That could be something like % of team points scored, but it isn't how I would calculate it since team points don't appear anywhere. Closer to player points per team possession?

The second term is e*f*USG*TO% = e*f*100*(POS/TmPOS)*(TmMin/Min)*TOV/POS = e*f*100*(TmMin/Min)*TOV/TmPOS. Again, maybe something like player turnovers per team possession.

The third term is just weighted usage and the fifth is weighted usage squared. The fourth term is some manner of usage-assist interaction that doesn't simplify the same way, because assist% isn't calculated per-possession.
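The cancellation in the first term is easy to check numerically. The sketch below computes it the long way and the short way using the shorthand above (STS, POS, TmPOS, TmMin); the stat line is invented. One assumption to flag: as I read the glossary, the usage formula uses team minutes divided by 5, so `tm_mins` here should be passed in already divided by 5.

```python
def first_term_long(pts, fga, fta, tov, mins, tm_pos, tm_mins):
    """USG * TS% * 2 * (1 - TO%), built from the component definitions."""
    sts = fga + 0.44 * fta                            # shooting possessions
    pos = sts + tov                                   # all possessions used
    usg = 100.0 * (pos / tm_pos) * (tm_mins / mins)
    ts = pts / (2.0 * sts)
    to = tov / pos
    return usg * ts * 2.0 * (1.0 - to)


def first_term_short(pts, mins, tm_pos, tm_mins):
    """The simplified form: 100 * (TmMin/Min) * PTS / TmPOS."""
    return 100.0 * (tm_mins / mins) * pts / tm_pos


# Invented single-game line: 20 pts on 15 FGA, 6 FTA, 3 TOV in 36 of 48 minutes
long_way = first_term_long(20, 15, 6, 3, mins=36, tm_pos=100, tm_mins=48)
short_way = first_term_short(20, mins=36, tm_pos=100, tm_mins=48)
print(abs(long_way - short_way) < 1e-9)
```

FGA, FTA and TOV drop out of the short form entirely, which is the point: the term depends only on the player's points, his minutes, and team possessions.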

Mike G · Post by **Mike G** » Fri Mar 30, 2012 1:22 pm

DSMok1 wrote:... the value of points scored depends on how many possessions were used to generate them, and the defense on the other end is not related to the offense...

A couple of problems with this one phrase:

1) It's surely not a trivial interaction that the more energy a player expends at one end of the floor, the less he has on the other end. If you've played, you know this. It isn't even arguable.

2) Points per possession is not absolute in its value. In a defensive playoff series, for example, .480 shooting may be good enough to advance; in the next series, vs a high-powered offense, .580 may not be good enough.

Taking the TO out of Usg%, and then multiplying the 'scoring component' by TS%, you get a rather skewed version of "% of points scored". More straightforward, and surely more correlated to winning, would be to not conflate shots+turnovers to begin with.

I've never understood why "Scoring Rate" cannot be an "Advanced Stat". Is it just the old "scoring is overrated" cliché?

Crow · Post by **Crow** » Sat Apr 07, 2012 3:39 am

"Scoring Rate" can be considered an "Advanced Stat" because it is complex and ambitious. But like other advanced stats it may mis-assign value between players, in this case for the defensive component. Sometimes modestly, sometimes greatly.

Mike G · Post by **Mike G** » Sat Apr 07, 2012 10:11 am

Points by both teams may be defense, or it may be pace.
Pts/Poss * Poss/G = Pts/G

With or without a team/opp Pts/G adjustment, players on a given team would rank the same.
With the adjustment, players from high-paced (or weak D) teams wouldn't be overrated relative to those on low-paced (or strong D) teams.

Just as a team that averages 110 points is not favored to beat a team that averages 100 -- it depends just as much on their points allowed -- the players on the first team are not expected to outscore their counterparts on the 2nd team.
That also depends on the scoring they allow.

If in midseason your coach is fired, and he was laissez-faire about defense and also liked to run;
and the replacement is the opposite type, stressing D and more regulated offense;
you can expect fewer points from everyone after the coaching change.
But relative to the changes in scoring environment, you can expect everyone to be about the same in scoring. And everyone is the same player, so that sounds about right.
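The coaching-change example can be sketched with a few lines of Python. The adjustment below scales a player's points by 100 over the average of team and opponent points per game; that particular averaging is my assumption for illustration (the earlier table used 100/OppPPG plus other factors), and all the numbers are invented.

```python
def environment_adjusted(pts_per_game, team_ppg, opp_ppg, baseline=100.0):
    """Scale raw scoring to a common scoring environment.

    The environment is taken as the average of points scored and allowed,
    which is an illustrative choice, not a formula from this thread.
    """
    environment = (team_ppg + opp_ppg) / 2.0
    return pts_per_game * baseline / environment


# Same player before and after a hypothetical midseason coaching change
before = environment_adjusted(22.0, team_ppg=110.0, opp_ppg=110.0)
after = environment_adjusted(19.0, team_ppg=95.0, opp_ppg=95.0)
print(round(before, 2), round(after, 2))
```

Raw scoring drops from 22 to 19, but relative to the scoring environment the player comes out the same, and because every teammate is scaled by the same factor the within-team ranking does not change.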

APBRmetrics
