2008-2009 Adjusted Plus-Minus Ratings (Low-Noise Version)

DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

2008-2009 Adjusted Plus-Minus Ratings (Low-Noise Version)

Post by DSMok1 »

This is a RECOVERED THREAD
Originally posted by Ilardi, Aug 09, 2009


Recovered Page 1 of 3

Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Sun Aug 09, 2009 6:30 pm Post subject: 2008-2009 Adjusted Plus-Minus Ratings (Low-Noise Version) Reply with quote
I've now had a chance to generate 2008-2009 APM ratings, using a full six-year (03-09) lineup dataset to greatly reduce estimation error.

Specifically, I've given each prior season a fractional weighting of the form:

weight = 1/(2^(YearsAgo + 1))

This generates the following season-by-season weighting scheme:

2008-2009 = 1
2007-2008 = 1/4
2006-2007 = 1/8
2005-2006 = 1/16
2004-2005 = 1/32
2003-2004 = 1/64

Note that the resulting model still accords nearly 70% of the overall weight to the 2008-2009 season, with much of the rest of the weight coming from the preceding season (and all weightings tapering off exponentially as a function of time).

Obviously, the ideal is an APM model based 100% on the target 08-09 season, but (as we've seen in the past - and as posted at bv.com), such a model yields estimation errors so high as to render the estimates of only limited value. By including prior seasons' data in the model (at reduced weight), we're able to dramatically reduce estimation error, and still allow the results to primarily reflect the target season.
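As a sanity check, the weighting scheme and the ~70% figure can be reproduced in a few lines (a sketch; the target season's weight of 1 sits outside the prior-season formula):

```python
# Sketch (not Ilardi's actual code): reproduce the season weights and
# verify that the target season carries roughly 70% of the total weight.
weights = [1.0] + [1 / 2 ** (years_ago + 1) for years_ago in range(1, 6)]
total = sum(weights)
shares = [w / total for w in weights]
seasons = ["2008-09", "2007-08", "2006-07", "2005-06", "2004-05", "2003-04"]
for season, w, s in zip(seasons, weights, shares):
    print(f"{season}: weight = {w:.6f}, share = {s:.1%}")
```

The 2008-09 share comes out to about 67.4%, matching the "nearly 70%" in the post.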

If interested, you can find the latest APM estimates posted at: http://spreadsheets.google.com/ccc?key= ... 4WkE&hl=en

- Steve
________________________________________________________
Ryan J. Parker



Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC

PostPosted: Sun Aug 09, 2009 6:38 pm Post subject: Reply with quote
Here's a top 20 list:

Code:

Wade, Dwyane   13.61
Garnett, Kevin   13.21
James, LeBron   13.19
Paul, Chris   12.71
Nash, Steve   8.83
Odom, Lamar   8.81
Iguodala, Andre   8.61
Lewis, Rashard   8.11
Ming, Yao   7.29
Kidd, Jason   6.66
Gasol, Pau   6.64
Nowitzki, Dirk   6.5
Young, Thaddeus   6.37
Bosh, Chris   6.19
Johnson, Amir   6.18
Artest, Ron   5.83
Parker, Tony   5.77
Bryant, Kobe   5.63
Jamison, Antawn   5.58
Duncan, Tim   5.4
_________________
I am a basketball geek.
________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Sun Aug 09, 2009 6:42 pm Post subject: Reply with quote
Ryan J. Parker wrote:
Here's a top 20 list:

Code:

Wade, Dwyane   13.61
Garnett, Kevin   13.21
James, LeBron   13.19
Paul, Chris   12.71
Nash, Steve   8.83
Odom, Lamar   8.81
Iguodala, Andre   8.61
Lewis, Rashard   8.11
Ming, Yao   7.29
Kidd, Jason   6.66
Gasol, Pau   6.64
Nowitzki, Dirk   6.5
Young, Thaddeus   6.37
Bosh, Chris   6.19
Johnson, Amir   6.18
Artest, Ron   5.83
Parker, Tony   5.77
Bryant, Kobe   5.63
Jamison, Antawn   5.58
Duncan, Tim   5.4

Face validity?
________________________________________________________
Ryan J. Parker



Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC

PostPosted: Sun Aug 09, 2009 6:48 pm Post subject: Reply with quote
Steve, you mentioned making predictions before, and I wanted to know what you think is the most important to predict. In the coming year, what would you want to predict? Individual player APM ratings, actual lineup efficiency, etc?
_________________
I am a basketball geek.
________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Sun Aug 09, 2009 9:20 pm Post subject: Reply with quote
Ryan J. Parker wrote:
Steve, you mentioned making predictions before, and I wanted to know what you think is the most important to predict. In the coming year, what would you want to predict? Individual player APM ratings, actual lineup efficiency, etc?


I guess I'm mostly interested in predicting:

(a) individual APM as a function of player history and age
(b) team W-L records/efficiency (especially for teams with major personnel change from preceding season)
________________________________________________________
Ryan J. Parker



Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC

PostPosted: Sun Aug 09, 2009 10:23 pm Post subject: Reply with quote
Cool. I'm interested in predicting team offensive and defensive efficiency, so I want to set a baseline for how a very basic model, like adjusted plus/minus or similar variant, predicts these team measures. Then the fun is figuring out how to improve on those predictions, and figuring out what models do and do not predict well.
_________________
I am a basketball geek.
________________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Mon Aug 10, 2009 8:40 am Post subject: Reply with quote
I just looked over the numbers... what I can see is that out of 341 players, there are 75 players that we can say with 95% confidence are above league average. (That is 2 StdDev, or the player's mean - 2*stderr still being above 0.) Is that interpretation correct? And there are 107 that we can say with 95% confidence are below league average (or 0 +/-). There are 9 players that are 95% likely above 5.0 APM, and 3 above 10 APM (Wade, Garnett, James).

And there are 158 players that we cannot say are below or above average with 95% certainty.
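The counting rule described above can be sketched as follows (the players and numbers here are invented for illustration; the real spreadsheet values differ):

```python
# Hypothetical (player, APM estimate, standard error) tuples -- not real data.
players = [
    ("Player A", 13.6, 1.5),
    ("Player B", 4.0, 2.5),
    ("Player C", -6.2, 2.0),
]

def classify(apm, se, z=2.0):
    """Two-sided 95% rule: the mean must clear zero by z standard errors."""
    if apm - z * se > 0:
        return "above average"
    if apm + z * se < 0:
        return "below average"
    return "uncertain"

for name, apm, se in players:
    print(name, "->", classify(apm, se))
```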
________________________________________________________
Ryan J. Parker



Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC

PostPosted: Mon Aug 10, 2009 9:12 am Post subject: Reply with quote
I think the only change I would make is:

DSMok1 wrote:
And there are 158 players that we cannot say are below or above average with 95% confidence.


Cool
_________________
I am a basketball geek.
________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Mon Aug 10, 2009 9:23 am Post subject: Reply with quote
DSMok1 wrote:
I just looked over the numbers... what I can see is that out of 341 players, there are 75 players that we can say with 95% confidence are above league average. (That is 2 StdDev, or the player's mean - 2*stderr still being above 0.) Is that interpretation correct? And there are 107 that we can say with 95% confidence are below league average (or 0 +/-). There are 9 players that are 95% likely above 5.0 APM, and 3 above 10 APM (Wade, Garnett, James).

And there are 158 players that we cannot say are below or above average with 95% certainty.


Yes, that interpretive stance is basically correct, although probably a bit too conservative, as it's based on a "two-tailed" test (which yields a 95% interval of +/- 1.96 se) for all coefficient estimates: it's permissible instead to use one-tailed tests (95% at 1.65 se) when we're only interested in testing directional hypotheses (e.g., "I've observed a player's estimate at +1.70 with an se of 1.0, and want to know if he's truly above-average"), as opposed to constructing a two-tailed confidence interval each time.
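The two critical values can be checked with Python's standard library (a sketch of the distinction, using the +1.70 / se = 1.0 example from the post):

```python
from statistics import NormalDist

z_two = NormalDist().inv_cdf(0.975)  # two-tailed 95% cutoff, ~1.96
z_one = NormalDist().inv_cdf(0.95)   # one-tailed 95% cutoff, ~1.645

# The example from the post: estimate +1.70 with se = 1.0
z = 1.70 / 1.0
print(z > z_one)  # True: "truly above average" passes the one-tailed test
print(z > z_two)  # False: but not the stricter two-tailed test
```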

It actually gets a little more complicated than that, though, because the se's I've provided apply only to the offensive and defensive APM components. To make a long story short, there are two different ways we can come up with the full set of Offensive, Defensive, and Total APM estimates, either: (a) directly estimate Total APM together with an offense-defense adjustment/offset parameter (OffDiff), and then derive Offensive and Defensive APMs based on linear combinations of the aforementioned (Rosenbaum's approach); or (b) directly estimate Offensive and Defensive APM, and then add them together to derive Total APM estimates (my approach*).

In Dan Rosenbaum's approach, you wind up with larger se's for the Offensive and Defensive APM components (since they're not directly estimated in the model, but derived indirectly from other estimates - a process that involves adding se terms), whereas with my approach you wind up with lower se's for Offensive and Defensive APM but a higher se for the Total APM estimate (for the same reason). My approach/model does, however, have the advantage of adjusting each offensive lineup to account for the defensive efficiency of each opposing lineup, and vice versa, which is the main reason I prefer it.

Now, how much higher is the se for the Total APM in my model? It's equal to roughly 1.41 (square root of 2) times the se for the offensive/defensive APM estimates. (And, yes, for technical reasons, the se for each player's offensive APM is always nearly identical to the se for his defensive APM).
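The sqrt(2) factor is just standard propagation of error for a sum (a sketch with an illustrative se value, assuming the offensive and defensive errors are roughly independent):

```python
import math

se_off = 2.5  # illustrative value, not taken from the spreadsheet
se_def = 2.5  # per the post, nearly identical to the offensive se

# Summing two independently estimated components adds their variances:
se_total = math.sqrt(se_off**2 + se_def**2)
print(se_total / se_off)  # sqrt(2) ~ 1.414 when the two se's are equal
```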

If enough people are interested, I could always go back and do a direct estimate for Total APM in order to bring down those se's . . . though the resulting APM estimates would, of course, be very close to those already provided.


*developed in collaboration with Aaron Barzilai (who also came up with the same basic idea at roughly the same time).

Last edited by Ilardi on Mon Aug 10, 2009 9:30 am; edited 1 time in total
________________________________________________________
Ryan J. Parker



Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC

PostPosted: Mon Aug 10, 2009 9:30 am Post subject: Reply with quote
Good catch Steve. One tailed tests FTW.

This part confused me though:

Quote:
Now, how much higher is the se for the Total APM in my model? It's equal to roughly 1.41 (square root of 2) times the se for the offensive/defensive APM estimates. (And, yes, the se for each player's offensive APM is always identical to the se for his defensive APM).


Why is the SE for each player's offensive APM always identical to the SE for his defensive APM?
_________________
I am a basketball geek.
________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Mon Aug 10, 2009 9:36 am Post subject: Reply with quote
Ryan J. Parker wrote:
Good catch Steve. One tailed tests FTW.

This part confused me though:

Quote:
Now, how much higher is the se for the Total APM in my model? It's equal to roughly 1.41 (square root of 2) times the se for the offensive/defensive APM estimates. (And, yes, the se for each player's offensive APM is always identical to the se for his defensive APM).


Why is the SE for each player's offensive APM always identical to the SE for his defensive APM?


Yeah, that puzzled me at first, too. It seems to be based on the fact that se's in APM modeling are driven by the intercorrelatedness of player oncourt minutes - a phenomenon which will be virtually identical for any given set of players whether on offense or defense. (For example, if Duncan shares 71% of his oncourt offensive minutes with Parker, he will also share almost exactly 71% of his oncourt defensive minutes with him, as well.)
________________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Mon Aug 10, 2009 9:58 am Post subject: Reply with quote
Quote:
If enough people are interested, I could always go back and do a direct estimate for Total APM in order to bring down those se's . . . though the resulting APM estimates would, of course, be very close to those already provided.


That could be interesting--how hard would it be to have both shown next to each other...your current and then this elaboration?

Thank you very much for your discussion of two-tailed vs. one-tailed confidence intervals... For some reason I missed that, though it should have been obvious!

EDIT:
So now there are, at 95% confidence, 60 above-average offensive players, 79 above-average defensive players, and (because of the 1.41*stderr) 65 above-average overall players.
________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Mon Aug 10, 2009 1:29 pm Post subject: Reply with quote
Quote:
EDIT:
So now there are, at 95% confidence, 60 above-average offensive players, 79 above-average defensive players, and (because of the 1.41*stderr) 65 above-average overall players.


That certainly sounds plausible . . . Also, if you go back and look at the 6-year average ratings I posted yesterday, you'll find even lower se terms, leading to an even larger number of players about whom you can derive inferences regarding above-average and below average performance at a 95% confidence level.
________________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Mon Aug 10, 2009 1:50 pm Post subject: Reply with quote
Ilardi wrote:
Quote:

So now there are, at 95% confidence, 60 above-average offensive players, 79 above-average defensive players, and (because of the 1.41*stderr) 65 above-average overall players.


That certainly sounds plausible . . . Also, if you go back and look at the 6-year average ratings I posted yesterday, you'll find even lower se terms, leading to an even larger number of players about whom you can derive inferences regarding above-average and below average performance at a 95% confidence level.


Oddly enough, no, that is not the case. Perhaps due to regression to the mean over that period (fewer players being good over the WHOLE six years), the number above average (95% confidence) in each of the 3 categories is between 57 and 59.
________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Mon Aug 10, 2009 1:52 pm Post subject: Reply with quote
Fascinating. Thanks for the clarification!
________________________________________________________
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

Re: 2008-2009 Adjusted Plus-Minus Ratings (Low-Noise Version)

Post by DSMok1 »

Recovered page 2 of 3

DJE09



Joined: 05 May 2009
Posts: 148


PostPosted: Tue Aug 11, 2009 11:45 pm Post subject: Reply with quote
from the Spreadsheet:
Quote:
Playoffs for each season are accorded double the regular-season weighting


I am not sure I like this mixing when you are scoring relative to zero. All 16 teams in the playoffs are average or above, but the teams that lose in the playoffs will be accorded a net negative weighting in terms of +/-?

How about the teams / players that don't make the playoffs - you score them nothing, i.e. zero for their playoff contribution, which implies they are better than the players who did play... but then their team didn't make the playoffs.

Now I understand you are controlling for the line-up people play in and face, but I don't understand why you can compare people who didn't play in the playoffs against players who had to go 7 games against the Lakers, e.g. Houston.
_________________
Stats are only OK if they agree with what I think :)
_________________________________________________________________________________________
Mike G



Joined: 14 Jan 2005
Posts: 3617
Location: Hendersonville, NC

PostPosted: Wed Aug 12, 2009 4:57 am Post subject: Reply with quote
DJE09 wrote:

Now I understand you are controlling for the line-up people play in and face, but I don't understand why you can compare people who didn't play in the playoffs against players who had to go 7 games against the Lakers eg.

Unless you fail to 'cover the spread' vs your opposition, you can still have a net positive, I would think. If the Lakers have beaten average opposition by 7 Pts/48, and you lose by 4, you shouldn't net a negative.
_________________
36% of all statistics are wrong
_________________________________________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Wed Aug 12, 2009 8:27 am Post subject: Reply with quote
DJE09 wrote:
from the Spreadsheet:
Quote:
Playoffs for each season are accorded double the regular-season weighting


I am not sure I like this mixing when you are scoring relative to zero. All 16 teams in the playoffs are average or above, but the teams that lose in the playoffs will be accorded a net negative weighting in terms of +/-?

How about the teams / players that don't make the playoffs - you score them nothing, i.e. zero for their playoff contribution, which implies they are better than the players who did play... but then their team didn't make the playoffs.

Now I understand you are controlling for the line-up people play in and face, but I don't understand why you can compare people who didn't play in the playoffs against players who had to go 7 games against the Lakers, e.g. Houston.


I believe your concerns are based on a misunderstanding (or partial understanding) of the APM model:

1) Because the model explicitly controls for the strength of each player's teammates and opponents during each possession, there is no risk of unfairly penalizing playoff losers for having been outperformed by stronger teams. In fact, it is quite possible for a player's APM rating to increase when playing on a losing team.

2) Those who play for teams that miss the playoffs are not penalized in the APM model - at least not in the sense of lowering their APM ratings. The only detriment for them is the mere fact of having fewer observations (data points) available to derive their APM estimates, with the result that the standard errors for their respective ratings will be slightly higher.

The basic premise of APM modeling is to use all data at our disposal to derive estimates of each player's impact on the game's bottom line; thus, we certainly want to incorporate the valuable information inherent in playoff performance.
_________________________________________________________________________________________
DJE09



Joined: 05 May 2009
Posts: 148


PostPosted: Wed Aug 12, 2009 7:19 pm Post subject: Reply with quote
Ilardi wrote:
1) ... In fact, it is quite possible for a player's APM rating to increase when playing on a losing team.

2) Those who play for teams that miss the playoffs are not penalized in the APM model - at least not in the sense of lowering their APM ratings. The only detriment for them is the mere fact of having fewer observations (data points) available to derive their APM estimates, with the result that the standard errors for their respective ratings will be slightly higher.

I am certain I don't properly understand how APM is obtained, but I was working from the assumption that
Quote:
Adjusted +/- ratings indicate how many additional points are contributed to a team's scoring margin by a given player in comparison to the league-average player over the span of a typical game (100 offensive and defensive possessions).

I appreciate that an individual player's APM can increase, but I can't see how, for all players on a TEAM, their APM can increase when they lose a 7-game series.

OK, I am making this assumption:

The Minute Weighted Sum of a Team's Members APM is an approximation of their expected margin of victory over a "League-average" team.

So for teams like Orlando and Cleveland we should expect generally for their players to have positive APM (I know some can be negative, it just means others are more positive).

I can't see how adding in a series of games against tough opposition, where your team's margin of victory is less (even negative), is not penalising their APM score relative to the teams who don't play those games (and so get the proxy zero return).

Example time, from http://basketballvalue.com we have for Orlando in 1 year APM (P=Playoff, RS=Regular Season):
Code:

Player     Min   P+/-   SE   RS+/-   SE   Delta
Lewis      986    9.7   4.2   10.2   4.8    -0.5
Turkoglu   934    3.5   3.9    3.2   4.5     0.3
Howard     903   -0.3   5.6    1.0   6.1    -1.3
Alston     740   -8.3   4.2   -4.3   4.7    -4.1
Pietrus    618   -1.2   4.0   -4.7   4.9     3.5
C. Lee     550   -1.3   3.9   -1.2   4.6    -0.2
Redick     327   -3.3   4.3   -1.3   5.0    -2.0
Johnson    280   -9.3   5.0   -3.1   5.8    -6.2
Gortat     271   -1.0   6.0   -8.1   6.7     7.1
Battie     128   -2.8   4.5    0.7   4.9    -3.5
Nelson      90   -1.0   5.7    6.7   6.3    -7.6

As I understand this, the Playoff value is calculated from the same regular season data, plus doubly-weighted playoff data. So based on their regular season data, Orlando as a team were +8, but when you include 2x the playoff data they are about 4 points worse as a team? For the team that managed to make the NBA finals?

I am sure I am not understanding something, but I can't help but feel that it is due to the fact that they had 18 games against the three top teams (I readily concede that they didn't spank Philly in the first round when they should've, and Boston was playing without Garnett).

Please help me understand.
_________________________________________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Thu Aug 13, 2009 10:01 am Post subject: Reply with quote
DJE,

Perhaps an apt analogy will help make things clearer . . .

Think about how team "power ratings" (e.g., Sagarin) are generated: in essence, such ratings derive an estimate of each team's "strength" based on its performance (scoring margin) vis-a-vis each opponent - adjusting statistically for the strength of each opponent.

Thus, if Team A has a power rating of 90 and Team B has a rating of 100 (i.e., Team B would be expected to beat Team A by 10 pts on a neutral court), and Team A then loses to Team B by only 2 points, this better-than-expected performance by Team A will actually increase the team's overall rating, despite the fact that Team A lost.

Similar considerations apply to APM estimates . . .
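A toy version of the power-rating logic (illustrative only; real systems like Sagarin's fit all games jointly rather than updating one game at a time, and the learning rate here is arbitrary):

```python
# Toy power-rating update sketch, not any real system's algorithm.
rating_a, rating_b = 90.0, 100.0
expected_margin_a = rating_a - rating_b         # -10: A expected to lose by 10
actual_margin_a = -2.0                          # A actually lost by only 2
surprise = actual_margin_a - expected_margin_a  # +8: better than expected
learning_rate = 0.1                             # arbitrary step size
rating_a += learning_rate * surprise
print(rating_a)  # 90.8 -- the rating rises despite the loss
```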
_________________________________________________________________________________________
DJE09



Joined: 05 May 2009
Posts: 148


PostPosted: Thu Aug 13, 2009 7:23 pm Post subject: Reply with quote
I get how you can lose and improve your rating, what I don't get is how strength is generated.

I thought you were using a weighted regression model over all instances of a player appearing in a 10-man unit. Once the model has run, the B_x represents Player X's contribution to the team's winning margin. Perhaps that is a big oversimplification and I am missing something important.

I guess what I am thinking is that when we incorporate playoff data, and we re-run the model, we seem to get different values for the player's APM for the players who played in the playoffs, but not for the other players in the league who didn't?

So we seem to have re-evaluated the relative strength of Orlando based on their playoff performance, and we have an excellent opportunity to evaluate them against top opposition (9-9 against the top 3). Given the 'net' winning margin of ORL was less, I would expect the 'best estimate' to decrease, BUT we haven't had the same opportunity to further evaluate a team like, say, the Nets, so their 'best estimate' stays the same, i.e. 0 contribution from the playoffs, or a net gain over teams that exhibit a higher losing margin than expected in the playoffs.

Concrete question: Does Kevin Garnett's APM change when we include Boston's playoff data?
_________________________________________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Thu Aug 13, 2009 9:10 pm Post subject: Reply with quote
DJE09 wrote:
I get how you can lose and improve your rating, what I don't get is how strength is generated.

I thought you were using a weighted regression model over all instances of a player appearing in a 10-man unit. Once the model has run, the B_x represents Player X's contribution to the team's winning margin. Perhaps that is a big oversimplification and I am missing something important.

I guess what I am thinking is that when we incorporate playoff data, and we re-run the model, we seem to get different values for the player's APM for the players who played in the playoffs, but not for the other players in the league who didn't?

So we seem to have re-evaluated the relative strength of Orlando based on their playoff performance, and we have an excellent opportunity to evaluate them against top opposition (9-9 against the top 3). Given the 'net' winning margin of ORL was less, I would expect the 'best estimate' to decrease, BUT we haven't had the same opportunity to further evaluate a team like, say, the Nets, so their 'best estimate' stays the same, i.e. 0 contribution from the playoffs, or a net gain over teams that exhibit a higher losing margin than expected in the playoffs.

Concrete question: Does Kevin Garnett's APM change when we include Boston's playoff data?


Every player's APM values will change somewhat by virtue of incorporating playoff data into the model (whether or not any given player actually appeared in the playoffs), inasmuch as the playoffs simply represent additional data points (lineups) that help the model reduce noise (i.e., lowering estimation error) by further disentangling the effects of heavily intercorrelated players - a process that affects every parameter in the model.

Does that help?
_________________________________________________________________________________________
DJE09



Joined: 05 May 2009
Posts: 148


PostPosted: Thu Aug 13, 2009 10:24 pm Post subject: Reply with quote
Don't teams go to more concentrated lineups in the playoffs, so the player intercorrelations are more strongly represented?

I think I understand what you are saying is the APM calculations are done for the whole season, and playoff games, OK.

Now that I look at the Boston page I can see a recalculated APM post-playoffs, where KG's APM changes (increases by 2!) based on 0 playoff minutes - but I can't just go and see the GSW APM post-playoffs (presumably since they weren't in the playoffs)... I guess this is what has made me assume that the recalculation was only done for playoff teams.

So I need to interpret the changes as a "re-evaluation" of the previously noisy figure in light of new evidence. For some of the players, e.g. Ray Allen, Peja (negative), Chuck Hayes and Carl Landry (positive), the magnitude of the change (Playoff - RS) is greater than the standard error.

What should I understand that to show (I want to say something about role in team, but I'm not sure I understand it properly yet).

Given the playoffs are such a small sample compared to the regular season, about 80 games, I am surprised at how much it reduces the SE.
_________________________________________________________________________________________
Crow



Joined: 20 Jan 2009
Posts: 825


PostPosted: Mon Sep 14, 2009 11:56 pm Post subject: Re: 2008-2009 Adjusted Plus-Minus Ratings (Low-Noise Version) Reply with quote
Ilardi wrote:


Specifically, I've given each prior season a fractional weighting of the form:

weight = 1/(2^(YearsAgo + 1))

This generates the following season-by-season weighting scheme:

2008-2009 = 1
2007-2008 = 1/4
2006-2007 = 1/8
2005-2006 = 1/16
2004-2005 = 1/32
2003-2004 = 1/64

Note that the resulting model still accords nearly 70% of the overall weight to the 2008-2009 season, with much of the rest of the weight coming from the preceding season (and all weightings tapering off exponentially as a function of time).




Choices have to be made and I am not sure any is the perfect one, but this weighting translates to

2008-9 67.4%
2007-8 16.8%
2006-7 8.4%
2005-6 4.2%
2004-5 2.1%
2003-4 1.1%

I like it pretty well, but in another thread I suggested perhaps at least considering a blend of 2/3 one-year stabilized and 1/3 six-year average.

That would translate to

2008-9 50.5%
2007-8 16.8%
2006-7 11.2%
2005-6 8.4%
2004-5 7.0%
2003-4 6.3%

Which is "better" or preferable?

Probably a matter of taste.

The blend isn't that different, but it does put years 4, 5 and 6 on closer to the same footing and much closer to year 3. That makes some sense to me.

I'll check the actual blended results later, but if you have any feedback I'd be interested in hearing it, Steve.

Even halfway between my proposed blend and your original weight set might be worth considering if you feel mine goes too far away from year 1.

2008-9 58.9%
2007-8 16.8%
2006-7 9.8%
2005-6 6.3%
2004-5 4.5%
2003-4 3.7%

It could be an alternate roll-up column on your spreadsheet. Or I can just add it to my copy if I am the only one interested in it.
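The blended percentages above can be reproduced numerically (a sketch; "1 year stabilized" is read here as the normalized weights of the original scheme, and the 6-year average as an equal 1/6 per season):

```python
# Reproduce Crow's blend arithmetic from the original season weights.
orig = [1, 1/4, 1/8, 1/16, 1/32, 1/64]
one_year = [w / sum(orig) for w in orig]       # 67.4%, 16.8%, 8.4%, ...
six_year_avg = [1 / 6] * 6                     # equal weight per season
blend = [2/3 * a + 1/3 * b for a, b in zip(one_year, six_year_avg)]
halfway = [(a + b) / 2 for a, b in zip(one_year, blend)]
print([round(100 * w, 1) for w in blend])    # [50.5, 16.8, 11.2, 8.4, 7.0, 6.3]
print([round(100 * w, 1) for w in halfway])  # [58.9, 16.8, 9.8, 6.3, 4.5, 3.7]
```

Both lists match the percentages quoted in the post.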
_________________________________________________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Tue Sep 15, 2009 8:42 am Post subject: Reply with quote
It is not too hard to create a "correct" weighting for the best estimate of true talent level at the end of last season. The primary issue is that the "correct" weighting would vary for each player, depending on the minutes played. See Toward an Adjusted +/- Projection System; the fourth post there details the Bayesian statistics necessary to generate the correct values.

Essentially, if one knows the variances for each player-year (based on minutes played, or more accurately possessions played), and the transformation curve from one year to the next is known (based on propagation of uncertainty--additive form) then calculating the precise weights is not very hard.

I intend to do that fairly soon.
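The Bayesian machinery being referred to boils down to inverse-variance (precision) weighting; a minimal sketch with made-up numbers:

```python
# Sketch of the inverse-variance weighting behind a Bayesian update:
# combine a prior estimate with a new observation, each weighted by
# 1/variance. Numbers are illustrative, not taken from the spreadsheet.
def combine(prior_mean, prior_se, obs_mean, obs_se):
    w_prior = 1 / prior_se**2
    w_obs = 1 / obs_se**2
    mean = (w_prior * prior_mean + w_obs * obs_mean) / (w_prior + w_obs)
    se = (w_prior + w_obs) ** -0.5
    return mean, se

# A player whose prior-years estimate is +2.0 (se 3.0) and whose current
# season alone says +6.0 (se 4.0):
mean, se = combine(2.0, 3.0, 6.0, 4.0)
print(round(mean, 2), round(se, 2))  # 3.44 2.4 -- pulled toward the better-known prior
```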
_________________________________________________________________________________________
Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Tue Sep 15, 2009 9:05 am Post subject: Reply with quote
DSMok1 wrote:
It is not too hard to create a "correct" weighting for the best estimate of true talent level at the end of last season. The primary issue is that the "correct" weighting would vary for each player, depending on the minutes played. See Toward an Adjusted +/- Projection System; the fourth post there details the Bayesian statistics necessary to generate the correct values.

Essentially, if one knows the variances for each player-year (based on minutes played, or more accurately possessions played), and the transformation curve from one year to the next is known (based on propagation of uncertainty--additive form) then calculating the precise weights is not very hard.

I intend to do that fairly soon.


What makes this tricky is the fact that 'weight' within my APM model is assigned to each lineup as a complete unit: i.e., all 10 players on the court receive the same weight based on the number of possessions each given lineup appears for. I know of no way to parse apart a lineup within the model to assign differential weighting to each individual player.
_________________________________________________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Tue Sep 15, 2009 9:28 am Post subject: Reply with quote
Ilardi wrote:


What makes this tricky is the fact that 'weight' within my APM model is assigned to each lineup as a complete unit: i.e., all 10 players on the court receive the same weight based on the number of possessions each given lineup appears for. I know of no way to parse apart a lineup within the model to assign differential weighting to each individual player.


Ah yes, good point.

I was thinking in terms of Bayesian Statistics, where we use a "transformation" function to bring data up to the current year (age adjustment or just propagation of error), while your model combines all years directly. It is weighted by the number of possessions played each year... I think that I could derive an appropriate weighting per year.

Time to break out MathCAD...
_________________________________________________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Tue Sep 15, 2009 11:30 am Post subject: Reply with quote
Ilardi wrote:
DSMok1 wrote:
It is not too hard to create a "correct" weighting for the best estimate of true talent level at the end of last season. The primary issue is that the "correct" weighting would vary for each player, depending on the minutes played. See Toward an Adjusted +/- Projection System; the fourth post there details the Bayesian statistics necessary to generate the correct values.

Essentially, if one knows the variances for each player-year (based on minutes played, or more accurately possessions played), and the transformation curve from one year to the next is known (based on propagation of uncertainty, additive form), then calculating the precise weights is not very hard.

I intend to do that fairly soon.


What makes this tricky is the fact that 'weight' within my APM model is assigned to each lineup as a complete unit: i.e., all 10 players on the court receive the same weight based on the number of possessions each given lineup appears for. I know of no way to parse apart a lineup within the model to assign differential weighting to each individual player.


Okay, Steve, here we go:

I don't have the full adjusted +/- 1 year ratings, but here is the process and my approximate results.

First, an approximate distribution of NBA players' 1-year APM ratings is generated (average, stdev). (I got average 0, stdev ~ 5.6.)

Next, each year's data is regressed to the mean according to the standard error of the player (using the Bayesian system; see my other thread for the formulas).

Using player pairs from year to year, each player's change (+ or -) was calculated, using the regressed numbers to damp the error.

Using players with low standard error for both the year and year+1, a normal distribution curve of year-to-year change was created. This is the key to the model. I used (Year+1)-(Year), an additive change, rather than a multiplicative one, since this distribution is centered around 0 (and our data points are as well), which would yield spurious results with a multiplicative change. The result: year-to-year change ~0, standard deviation of change = 3.4. This means that about 68% of players changed by between -3.4 and +3.4 from one year to the next.

Next, I calculated an approximate weighted average for the average player's 1 year APM standard error: 4.4

Finally, I used the Propagation of Error theorems to calculate the average player's standard error for Year, Year-1, Year-2, etc. The result: 4.4, 5.56, 6.5, 7.4, etc....

The weighting is equal to 1/(stderr)^2. I then normalized that so Year=1. The weighting becomes:
1.000
0.626
0.456
0.358
0.295
0.251
0.218
0.193
0.173
0.157

I didn't run the actual derivation (it's recursive and I'm rusty), but an approximate best-fit line is:
Code:

1.156
------------------- - .156
(YearsAgo+1)^.578


A note: this derivation did not account for aging effects. However, the spread in its YtY transformation did implicitly...
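The propagation-of-error steps above can be sketched in a few lines. This is a minimal sketch assuming only the values stated in this post (baseline 1-year standard error 4.4, year-to-year change standard deviation 3.4); the function names are mine.

```python
# Sketch of the weight derivation described above, using the stated
# estimates: average 1-year APM standard error = 4.4, and year-to-year
# "true change" standard deviation = 3.4.
BASE_SE = 4.4   # average standard error of a 1-year APM estimate
YTY_SD = 3.4    # s.d. of a player's true year-to-year change

def stderr_years_ago(k):
    """Standard error of a season k years back, brought forward to the
    target season by adding k years of change variance (additive
    propagation of uncertainty)."""
    return (BASE_SE**2 + k * YTY_SD**2) ** 0.5

def weight_years_ago(k):
    """Inverse-variance weight (1/stderr^2), normalized so the target
    season's weight = 1."""
    return BASE_SE**2 / stderr_years_ago(k)**2

for k in range(10):
    print(f"{k} years ago: stderr = {stderr_years_ago(k):.2f}, "
          f"weight = {weight_years_ago(k):.3f}")
```

This reproduces the standard-error sequence 4.4, 5.56, 6.52, 7.35, ... and the normalized weights 1.000, 0.626, 0.456, 0.358, ... listed above.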

Comments?
Back to top
_________________________________________________________________________________________
Crow



Joined: 20 Jan 2009
Posts: 825


PostPosted: Tue Sep 15, 2009 12:02 pm Post subject: Reply with quote
If I understand it right, with a 6-year data set these weights would work out as

33.5% (2008-09)
21.0% (2007-08)
15.3% (2006-07)
12.0% (2005-06)
9.9% (2004-05)
8.4% (2003-04)


Different from what I first suggested above, which neither of you chose to respond to. I would have appreciated some feedback then, or now if you wish.

But if you flip to 1/3rd 1 stabilized and 2/3rds 6 year average you get

33.6%
16.7%
13.9%
12.5%
11.8%
11.5%

That would be pretty darn close to the set above it. And at least provides one way of understanding DSMok1's weights in relation to what Steve had provided in his 2 previous sets.
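As a quick check, both lists can be reproduced from the weights given earlier in the thread. This is a sketch: the variable names are mine, and Steve's stabilized weights are taken as 1, 1/4, ..., 1/64 over the six-year window, as listed in the opening post.

```python
# Normalize DSMok1's six most recent weights to percentage shares.
dsm = [1.000, 0.626, 0.456, 0.358, 0.295, 0.251]
shares = [100 * w / sum(dsm) for w in dsm]
print([round(s, 1) for s in shares])   # [33.5, 21.0, 15.3, 12.0, 9.9, 8.4]

# Steve's stabilized scheme over the six-year window:
# 1 for the target season, then 1/4, 1/8, 1/16, 1/32, 1/64 for prior years.
steve = [1, 1/4, 1/8, 1/16, 1/32, 1/64]
steve_shares = [w / sum(steve) for w in steve]

# Blend: 1/3 of the stabilized shares plus 2/3 of an equal six-year average.
blend = [100 * (s / 3 + (2 / 3) * (1 / 6)) for s in steve_shares]
print([round(b, 1) for b in blend])    # [33.6, 16.7, 13.9, 12.5, 11.8, 11.5]
```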

Last edited by Crow on Wed Sep 16, 2009 2:26 am; edited 1 time in total
Back to top
_________________________________________________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Tue Sep 15, 2009 12:20 pm Post subject: Reply with quote
Crow wrote:
If I understand it right, if you used a 6 year data set these weights would work out as

33.5%
21.0%
15.3%
12.0%
9.9%
8.4%


Different from what I first suggested above, which neither of you chose to respond to. I would have appreciated some feedback then, or now if you wish.

But if you flip to 1/3rd 1 stabilized and 2/3rds 6 year average you get

33.6%
16.7%
13.9%
12.5%
11.8%
11.5%

That would be pretty darn close to the set above it. And at least provides one way of understanding DSMok1's weights in relation to what Steve had provided in his 2 previous sets.


Yep, that looks right. I will note, though, that my numbers were approximate because I didn't compile the full APM 1-year for the last few years to do it completely.

I still wish there were a way to include aging effects explicitly in the model... that looks difficult.

Hmmmm..... If I know the aging curve, and I know the weights accorded to each year, and I know the minutes each player played in each year, I think an adjustment factor could be applied! I must ponder.
Back to top
_________________________________________________________________________________________
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

Re: 2008-2009 Adjusted Plus-Minus Ratings (Low-Noise Version)

Post by DSMok1 »

Recovered page 3 of 3

Ilardi



Joined: 15 May 2008
Posts: 265
Location: Lawrence, KS

PostPosted: Tue Sep 15, 2009 4:30 pm Post subject: Reply with quote
Crow wrote:
If I understand it right, if you used a 6 year data set these weights would work out as

33.5%
21.0%
15.3%
12.0%
9.9%
8.4%


Different from what I first suggested above, which neither of you chose to respond to. I would have appreciated some feedback then, or now if you wish.

But if you flip to 1/3rd 1 stabilized and 2/3rds 6 year average you get

33.6%
16.7%
13.9%
12.5%
11.8%
11.5%

That would be pretty darn close to the set above it. And at least provides one way of understanding DSMok1's weights in relation to what Steve had provided in his 2 previous sets.


Hey Crow: I think your suggested weighting scheme has merit, but I'm particularly interested in maximizing the weight accorded to the target season. My current set of weightings has the current season at just under 68%, whereas yours has it just over 50%; if anything, I'd like to be going in the other direction.

On the other hand, DSM's Bayesian approach to the weighting issue is intriguing - I love the fact that it's actually non-arbitrary! - and it accords much heavier weightings than I do to more recent seasons . . .
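For reference, the "just under 68%" figure follows directly from the exponential scheme in the opening post. A sketch using the listed weights (1, 1/4, ..., 1/64 across the six seasons):

```python
# Share of total weight accorded to each season under the exponential scheme.
weights = [1, 1/4, 1/8, 1/16, 1/32, 1/64]
total = sum(weights)                        # 1.484375
shares = [100 * w / total for w in weights]
print(round(shares[0], 1))                  # 67.4 -> "just under 68%"
```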
Back to top
_______________________________________________________
Crow



Joined: 20 Jan 2009
Posts: 825


PostPosted: Wed Sep 16, 2009 2:14 am Post subject: Reply with quote
Thanks for both replies. It helps.

I hear your preference, Steve, for more weight in the most recent season. My initial preference was in line with that. But I wonder how well the 1-year stabilized version can correct for an abnormal up or down year in the target year, and so I moved to considering blends.

Now DSMok1's weights give less in the most recent season than yours and more weight to the most distant past seasons than yours. I didn't include the years on my list of his rankings and probably should have.

You said: "On the other hand, DSM's Bayesian approach to the weighting issue is intriguing - I love the fact that it's actually non-arbitrary! - and it accords much heavier weightings than I do to more recent seasons."

Was that just a slip or am I misunderstanding?


I'll note that my original idea of 2/3 1-year stabilized and 1/3 6-year adjusted was just about exactly in the middle between Steve's 1-year stabilized weights and DSMok1's weights, at least for the most recent years.
Back to top
_______________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Wed Sep 16, 2009 8:10 am Post subject: Reply with quote
Crow wrote:
You said: "On the other hand, DSM's Bayesian approach to the weighting issue is intriguing - I love the fact that it's actually non-arbitrary! - and it accords much heavier weightings than I do to more recent seasons."

Was that just a slip or am I misunderstanding?


He meant more weight to the most recent seasons prior to the current season.

Once I compile a few seasons of 1-year APM (or does someone already have them available?) I'll be able to do the math more exactly.
Back to top
_______________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Wed Sep 16, 2009 10:08 am Post subject: Reply with quote
Ilardi wrote:


On the other hand, DSM's Bayesian approach to the weighting issue is intriguing - I love the fact that it's actually non-arbitrary! - and it accords much heavier weightings than I do to more recent seasons . . .


You know, I think it is possible to incorporate aging effects into your model.

You're using this formulation, approximately, right?
MARGIN = b0 + b1X1 + b2X2 + . . . + bKXK + e
(That's what Rosenbaum originally stated)

To add an approximate aging adjustment (which would actually decrease the error of the model), replace b1 (etc.) with (b1 + a1), where a1 is the age-adjustment additive value (see http://sonicscentral.com/apbrmetrics/vi ... =1359#1359).

This would allow greater weights to be used for the "older" data, because the YtY transformation would have a narrower spread. That said, I don't have the data available to verify the aging curve, using the latest methods (for instance, always dropping each player's last year--and perhaps regressing the data beforehand to account for the error of the 1-year APM).
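One way to implement the (b1 + a1) substitution: if the age-adjustment values a_k are known, their contribution can be subtracted from the margin before fitting, so the regression recovers the age-neutral coefficients. This is my reading of the idea, not code from the thread; the data, ratings, and age offsets below are all hypothetical.

```python
import numpy as np

# Sketch of the aging adjustment described above:
#   MARGIN = b0 + sum((b_k + a_k) * X_k) + e,
# with a_k a known additive age adjustment. Moving the known part left,
#   MARGIN - sum(a_k * X_k) = b0 + sum(b_k * X_k) + e,
# so ordinary least squares recovers the age-neutral b_k.
rng = np.random.default_rng(0)
n_obs, n_players = 200, 6
X = rng.choice([-1.0, 0.0, 1.0], size=(n_obs, n_players))  # toy on/off indicators
b_true = np.array([2.0, -1.0, 0.5, 0.0, 1.5, -0.5])        # hypothetical ratings
a = np.array([0.4, -0.2, 0.0, 0.3, -0.5, 0.1])             # hypothetical age offsets

margin = X @ (b_true + a) + rng.normal(0, 0.1, n_obs)      # simulated margins

# Subtract the known aging contribution, then solve for b by least squares.
adjusted = margin - X @ a
design = np.column_stack([np.ones(n_obs), X])              # intercept + players
b_hat, *_ = np.linalg.lstsq(design, adjusted, rcond=None)
print(np.round(b_hat[1:], 2))                              # close to b_true
```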
Back to top
_______________________________________________________
Crow



Joined: 20 Jan 2009
Posts: 825


PostPosted: Wed Sep 16, 2009 1:14 pm Post subject: Reply with quote
DSMok1 wrote:


He meant more weight to the most recent seasons prior to the current season.

Once I compile a few seasons of 1-year APM (or does someone already have them available?) I'll be able to do the math more exactly.



Ok, your explanation in the first sentence makes sense of Steve's statement for me now.


On the second, I assume you mean already compiled in one spreadsheet. Steve suggests he'll provide newly calculated low-noise 1-year stabilized ratings in the future. A spreadsheet of pure 1-year estimates, albeit from different authors over time, would also be useful. If you compile one, it would be good to post a link.



P.S. Will you be looking at player pair adjusted at all, using it in your model or releasing it separately?

Does a blend of statistical +/- and adjusted +/-, whether combined after each is calculated or produced jointly as part of the same model, make sense to you or hold your interest? Or do you think it is contradictory, unnecessary, or too complex?
Back to top
_______________________________________________________
DSMok1



Joined: 05 Aug 2009
Posts: 611
Location: Where the wind comes sweeping down the plains

PostPosted: Wed Sep 16, 2009 1:35 pm Post subject: Reply with quote
Crow wrote:

On the second, I assume you mean already compiled in one spreadsheet. Steve suggests he'll provide in the future newly calculated low-noise 1 year stabilized. A spreadsheet of pure 1 year estimates, albeit from different authors over time would also be useful. If you compile it, that would be good to post a link to.

P.S. Will you be looking at player pair adjusted at all, using it in your model or releasing it separately?


I mean anywhere. I haven't seen pure 1-year estimates compiled anywhere but Basketball Value, and I'm not totally sure how to interpret the "playoff" vs. "non playoff" stuff there (because I thought adding more games for the playoff teams would change the ratings for the non-playoff teams also). David Lewin's results on 82Games.com do not include the requisite standard error terms.

Player pairs would add more complexity than they would bring accuracy, I suspect.
Back to top
_______________________________________________________
Crow



Joined: 20 Jan 2009
Posts: 825


PostPosted: Wed Sep 16, 2009 2:01 pm Post subject: Reply with quote
DSMok1 wrote:


I mean anywhere. I haven't seen pure 1-year estimates compiled anywhere but Basketball Value, and I'm not totally sure how to interpret the "playoff" vs. "non playoff" stuff there (because I thought adding more games for the playoff teams would change the ratings for the non-playoff teams also). David Lewin's results on 82Games.com do not include the requisite standard error terms.

Player pairs would add more complexity than they would bring accuracy, I suspect.


Good points on the first part.

On the second, adjusted player pair would add more complexity and might or might not go in the model but it is a complex story and I think they deserve to be seen and thought about at some point.
Back to top
_______________________________________________________
Crow



Joined: 20 Jan 2009
Posts: 825


PostPosted: Thu Sep 17, 2009 3:00 pm Post subject: Reply with quote
Wayne Winston confirmed he produces adjusted player pairs for Dallas.

http://tinyurl.com/lo68h3
Back to top
_______________________________________________________
John Hollinger



Joined: 14 Feb 2005
Posts: 175


PostPosted: Fri Sep 18, 2009 10:42 pm Post subject: Reply with quote
Interestingly, he had me rating Telfair much higher than I actually did. His PER was only 10.86 last year.

More bizarrely, the top three APM players on last year's T'wolves were Telfair, Mark Madsen and Brian Cardinal. Not sure where to start with that one ...
Back to top
_______________________________________________________
Crow



Joined: 20 Jan 2009
Posts: 825


PostPosted: Sat Sep 19, 2009 12:18 am Post subject: Reply with quote
The APM I see from Aaron and from Steve has Telfair, Cardinal, Foye and Jefferson as the positive guys last season and Madsen with too few minutes to get a ranking.

If Winston has Foye and Jefferson lower and Madsen rated that is a bit different. Does he give his error estimates?

Steve's 1-year stabilized adjusted has Telfair doing some good on offense at +3 (with his more frequent, average 3-point shot), but it also has him almost neutral on defense at -1. I assume that's a big change from the old days. I'm not sure which side of the ball (or both) causes Aaron's 1-year pure adjusted to be much more favorable at +8, but from the raw +/- I assume it is mostly or all offense.

Different APMs, different stories, or different parts of the overall story.
Back to top
_______________________________________________________
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
Crow
Posts: 10565
Joined: Thu Apr 14, 2011 11:10 pm

Re: 2008-2009 Adjusted Plus-Minus Ratings (Low-Noise Version)

Post by Crow »

Thanks for the recovery DSMok1.

I had gotten the one Steve started the day before this one,
http://sonicscentral.com/apbrmetrics/vi ... p?f=2&t=24
which used equal year weights, and didn't notice that there was another with a similar title but variable time weights.
Post Reply