Re: Will Success Come From Better Data?
Posted: Fri Sep 16, 2011 5:19 pm
by Mike G
Schtevie, did you see my previous post? It may have come in while yours was in the pipeline.
In the meantime, I just downloaded 2011 Celtics' lineup data from bv.com and sorted by number of starters in the lineup; then totaled possessions and point differential to get unadjusted +/- under each condition.
I considered just 4 Celts to be starters, since the next best available body appears to be interchangeable in plus-minus terms: O'Neal, Davis, Green, Krstic, Perkins, or Erden.
St   Min  Poss  Sco  UnAdj
4   1524  2914  415   14.2
3    851  1599  147    9.2
2    584  1084   -1   -0.1
1    592  1073  -61   -5.7
0    404   768  -60   -7.8
T   3956  7436  440    5.9
The biggest dropoff this year was from 3 of the big 4, to just 2. From 4 to 3, not so much.
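For anyone who wants to check the UnAdj column, here is a minimal Python sketch. It assumes UnAdj is simply point differential per 100 possessions, which matches the table to one decimal. The `rows` dict and the `per_100` helper are illustrative names only; nothing here comes from bv.com itself.

```python
# Possessions and point differential by number of Big-4 starters on the floor,
# copied from the table above: {starters: (possessions, point differential)}.
rows = {
    4: (2914, 415),
    3: (1599, 147),
    2: (1084, -1),
    1: (1073, -61),
    0: (768, -60),
}

def per_100(points, possessions):
    """Unadjusted plus-minus expressed per 100 possessions."""
    return 100.0 * points / possessions

# Per-100 rating for each starter count, rounded like the table.
unadj = {n: round(per_100(sco, poss), 1) for n, (poss, sco) in rows.items()}

# Drop in rating when going from n starters to n - 1.
drops = {n: round(unadj[n] - unadj[n - 1], 1) for n in (4, 3, 2, 1)}

for n in sorted(rows, reverse=True):
    print(n, unadj[n])
print("largest drop is going down from", max(drops, key=drops.get), "starters")
```

The `drops` dict makes the "biggest dropoff" comparison explicit instead of eyeballing the column.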
I did wonder whether it matters if the starters (Rondo, Allen, Pierce, KG at either PF or C) are in their normal positions. So I've fine-tuned the ordering, adding 0.1 for each starter who is in his regular spot.
If 3 starters are in, but only 2 are in their normal position, that's a 3.2, for example.
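The ordering key just described can be sketched in a few lines of Python. This is only an illustration: the lineup format (a dict from position to player name) and the names `NORMAL_SPOTS` and `lineup_key` are my own assumptions, and the sample lineup is hypothetical rather than taken from the data.

```python
# Regular positions for the four starters; Garnett counts at either PF or C.
NORMAL_SPOTS = {
    "Rondo": ("PG",),
    "Allen": ("SG",),
    "Pierce": ("SF",),
    "Garnett": ("PF", "C"),
}

def lineup_key(lineup):
    """Return starters + 0.1 * (starters in their regular spot).

    `lineup` maps position -> player name (hypothetical input format).
    """
    starters = 0
    in_spot = 0
    for pos, player in lineup.items():
        if player in NORMAL_SPOTS:
            starters += 1
            if pos in NORMAL_SPOTS[player]:
                in_spot += 1
    return round(starters + 0.1 * in_spot, 1)

# Hypothetical lineup: three starters on the floor, Pierce shifted to PF.
example = {"PG": "Rondo", "SG": "Allen", "SF": "Daniels", "PF": "Pierce", "C": "Davis"}
key = lineup_key(example)
print(key)
```

Sorting lineups by this key in descending order groups them the way the next table does.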
Sta   Min  Poss  Sco  UnAdj
4.4  1496  2846  405   14.2
3.3   752  1402  121    8.6
3.2    88   171   11    6.5
2.2   471   878   22    2.5
2.1   113   205  -22  -10.7
1.1   535   970  -71   -7.3
1.0    57   104   10    9.7
0.0   404   768  -60   -7.8
Tot  3916  7342  416    5.7
Lineups such as 4.3, 3.1, 2.0 all have insignificant minutes.
Some evidence that it's worse when someone is playing out of position?
Garnett totaled 1.4 minutes without another starter (outscored by 3 pts).
Re: Will Success Come From Better Data?
Posted: Fri Sep 16, 2011 7:07 pm
by schtevie
According to a naive calculation based on the 2011 numbers you cite, I should probably revise my views.
Assuming that the "4 starters" are disembodied and homogeneous, I calculate that the Celtics perfectly arbitraged their substitutions! That is, from the "3, 2, and 1" starter line-ups possession total, we could fabricate 1607 possessions of "4" lineups. This would be about 35% of the total "non 4" line-ups listed. And it so happens that 35% of a notional "4" differential of +14.2 exactly offsets the implied 65% of the "0" differential of -7.8.
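A rough Python check of this arithmetic, using the possession counts from Mike G's first table. One loud assumption: the post does not say how the 1607 figure was derived. Dividing the pooled starter-possessions by 5 lineup slots is my guess, since it reproduces the number (dividing by 4 would give about 2010 instead). All variable names are illustrative.

```python
# Possessions by number of starters on the floor (from the first table).
poss = {4: 2914, 3: 1599, 2: 1084, 1: 1073, 0: 768}
# Unadjusted +/- per 100 possessions for the two "pure" lineup types.
unadj = {4: 14.2, 0: -7.8}

# Starter-possessions spent in diluted (3, 2, 1 starter) lineups.
starter_poss = sum(n * poss[n] for n in (3, 2, 1))

# ASSUMPTION: a divisor of 5 reproduces the quoted ~1607; it is not stated in the post.
SLOTS = 5
fabricated_4 = starter_poss / SLOTS

# Share of all non-"4" possessions that could notionally be "4" lineups.
non4_total = sum(poss[n] for n in (3, 2, 1, 0))
share_4 = fabricated_4 / non4_total

# Blended rating if those possessions were split into pure "4" and pure "0" lineups.
blend = share_4 * unadj[4] + (1 - share_4) * unadj[0]

# What the non-"4" lineups actually did, per 100 possessions.
actual_non4 = 100 * (147 - 1 - 61 - 60) / non4_total

print(starter_poss, round(fabricated_4), round(share_4, 3))
print(round(blend, 2), round(actual_non4, 2))
```

Both the blended and the actual figure land within about a point of zero, which is the sense in which the substitutions look "arbitraged".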
That said, I have a bit of trouble with the set-up.
First, the concept of only 4 starters misses the point. And 2011 was a weird year, for the injuries and trades. The issue is whether starting lineup minutes (the specific "5") are able to be profitably extended at the expense of "starter dilution" ("4, 3, 2, 1, and 0") and for this, you need a record of the relevant alternatives. If we could break out Shaq's 37 games, and Perk's 12, and so on, we could tell a more coherent story.
As all we have are season aggregates, better to look at 2008 through 2010, I should think (and probably not even 2009, when KG missed quite a few games).
Re: Will Success Come From Better Data?
Posted: Fri Sep 16, 2011 7:42 pm
by Crow
Celtics had better performing bench guys on individual APM and lineups that were heavy with them in 2009-10 than in 2010-11.
Conversely the lineups with pretty high levels of starters did better in 2010-11. Players pretty much delivered their APM impact linearly and performance went down as expected as starters decreased.
The different performance trends illustrate the need for caution: resist drawing too strong a conclusion from one season or assuming it will carry over to another. More study (of one team and the league in general, and through time) would be needed to strengthen impressions and instincts on these issues.
Two bench guys were the same, a few different.
Traditional APM ratings for Rondo, Allen, Pierce and Garnett swung dramatically from 2009-10 to 2010-11 but not that much on RAPM.
There appears to be more difference between raw +/- and APM estimates for top lineups in 2009-10 than in 2010-11.
If changes of one or a few guys for others can set off major impacts / ripple effects you can either intensify the analysis or simplify & assume (and perhaps put the analysis on the inconclusive shelf) and trust coaching instincts. I'd rather see what conclusions & recommendations each approach (based on skill & experience) offer and see how they compare and then try to sort or balance them out, doing additional research & thinking as warranted & possible.
Re: Will Success Come From Better Data?
Posted: Fri Sep 16, 2011 8:52 pm
by Mike G
schtevie wrote:... the concept of only 4 starters misses the point. And 2011 was a weird year, for the injuries and trades. The issue is whether starting lineup minutes (the specific "5") are able to be profitably extended at the expense of "starter dilution" ("4, 3, 2, 1, and 0") and for this, you need a record of the relevant alternatives. If we could break out Shaq's 37 games, and Perk's 12, and so on, we could tell a more coherent story. ..
Shaq's the nominal 5th starter, but for fewer than half of all games.
Others' minutes X plus-minus wouldn't seem to make much of a dent alongside the Big 4:
5th player          pos  St/G   Min  Poss  Sco  UnAdj   Adj  notes
Krstic, Nenad       C    20/24  286   549   48   10.4   6.1
O'Neal, Shaquille   C    36/37  266   514  101   18.8  18.8
Perkins, Kendrick   C     7/12  170   317   44   13.5  12.6
O'Neal, Jermaine    C    10/24  127   235   12    5.5   6.8
Erden, Semih        C     7/37   79   154    9    7.3   1.0
Davis, Glen         PF   13/78  515   978  170   17.0  16.2  KG to C
Green, Jeff         PF    2/26   53    95   21    9.2   8.6  KG to C
Daniels, Marquis    SF    0/49   12    28   -6    2.3   N/A  + Pierce PF
Robinson, Nate      SG   11/55   13    31   15   -4.3   N/A  + Ray to SF
Three centers started more than half 'their' games, so take your pick.
My conclusion was that only 4 players are indispensable here, and it's a pretty unique situation: 4 of the all-time best and most prototypical at their positions.
Perkins looked like he mighta been the 5th wheel, but he looked pretty washed-up in Okla.
Re: Will Success Come From Better Data?
Posted: Sat Sep 17, 2011 4:57 pm
by Crow
The Center position is problematic for this test with the Thunder data in 2010-11 as it is for a lot of teams.
Following what Mike G did with the Celtics in 2010-11 for simplicity, I looked at subsets of lineups with 4, 3 or 2 or less of Westbrook, Harden, Durant and Collison (at PF or C). I went back to Westbrook over Maynor for the moment despite Maynor's win on individual APM this season.
When all 4 were in, with Westbrook counted as one of them (and when that lineup played 25+ minutes), they did great: an average of +13.9 per 100 possessions, but they only played 670 possessions, or about 300 minutes. With just 3 of the best 4, the performance fell very sharply to just +2.1. But again, similar to the case of the 2009-10 Celtics, when just a few were in (2 or 1; no lineup used 25+ minutes had zero of these top 4 guys), the performance rocketed back up to +9.6 per 100 possessions. Lineups heavy on bench, probably facing a lot of bench, won big, as did lineups heavy on starters, probably facing a lot of starters.
Who dictates on these details game to game, who changes and who stays true to their plan (if there is a coherent / consistent plan, effective on average or not) and who actually wins, especially in playoff series would be an avenue for further analysis.
The Thunder, at least in the 2010-11 season, appear to have had the chance to maximize their total net edge by having all 4 of these top guys (with Westbrook) on the court far more than they did, regardless of the resulting reduction in the amount of time they could have 3 on the court. The team's apparent thinking, that 3 on the court for longer would be as good as or even wiser than a lot of 4 and a lot of fewer than 3, is not supported at the level of this data.
But if you go back and allow Maynor to take his place with the others who had the best individual APM in 2010-11, things change somewhat. Lineups with all 4 of Maynor, Harden, Durant and Collison did fine: +10.5 per 100 possessions on raw +/-. This group was lightly used (under 200 minutes) because Durant usually sat for part of the time these played together, especially when the other 3 were all out there. But with Maynor out there instead of Westbrook, having just 3 of the 4 was no longer a weak performer; in fact it did better than when all 4 were out there, with a +14.5 rating. Fortunately the Thunder used just 3 of these top 4 guys four times as much as all 4. Lineups with 2 or fewer of the 4 dropped off to +1.3, but regrettably that type of lineup was used three times more than 3 of the 4 top guys.
There would appear to be plenty of room to optimize. With Westbrook (when you give him top-4 status), give far more time to lineups with all 4 of the top guys rather than just 3 of them. With Maynor, give maximum time to lineups with 3 or 4 of the top 4, and don't try to stretch with just 2 of the 4 for a lot of the time.
This is another illustration that the choice between maximizing the time all your best players are on the court together and spreading them out by maximizing the time with one fewer than the max on the court depends on data analysis of that specific team in a specific season, and perhaps on one specific player vs. another (at PG or C or another position, or a combination if you want to take it further).
What the optimal configuration for the Thunder was overall within actual player time or feasible player time would take more calculation. It appears easier though, in this season, to get really strong team level performance with Maynor than Westbrook. 3 or 4 of the top guys did this trick with Maynor whereas it took all 4 with Westbrook and that is more "expensive" in terms of using up top player time. On the surface it would appear that the optimal lineup strategy would be to put Maynor out there with 3 of the top 4 the most (avoiding having 2 with him), then Maynor or Westbrook out with 4 top guys the next most (avoiding Westbrook with 3 with him), especially if you think you need more top guys out there for matchup reasons against tough opponents.
Of course there is further analysis and evaluation still to be done before one actually finalizes the complete lineup plan. You'd want to compare the Adjusted +/- lineup data (even with its estimated error) to the raw +/- data, and probably look at the 4 factors (and maybe other discrete stats). You'd also want to look at the performance of individual lineups within the subsets with a certain number of top guys, then maximize the best and minimize or eliminate the worst.
I didn't study the lineups used less than 25 minutes in the same detail as those used over 25 minutes, but they broke slightly better than neutral overall on raw +/-, and that is helpful to the overall optimization effort. It appears that about 20% of the total time available from the top 4 guys (with Westbrook inserted) was used to shore up these dink-minute lineups to a neutral rating, while about 25% of the total time available from the top 4 (with Maynor) was used to do the same.
Maynor and 3 other top guys, with Westbrook allowed to count as one at SG, was rarely tried, and from the early data it looks probably wise not to do that much. But whether they tested it enough or should test it more is a question I'll leave open. Reggie Jackson may affect the actual lineup rotation and / or the optimal lineup rotation next season. Perkins vs other centers will be something to watch further as well.
Re: Will Success Come From Better Data?
Posted: Wed Sep 28, 2011 12:38 pm
by schtevie
OK, some evidence of low-hanging fruit left to rot. Consider the Celtics in their championship year (2007-08) and the counterfactual benefits of "platooning" the starting lineup. Offensive Ratings for:
Overall Team: 11.28
Starting Line-up: 19.47
One to Four Starters: 9.71
All Bench: -1.99
What would the notional benefits have been of playing the Starting line-up "maximum" minutes, assuming that the Starting and All Bench line-up productivities are fixed at the values above? Taking the starter who played the fewest possessions, Kendrick Perkins, and assuming these all were with the starting unit (subtracting those possessions from his contribution to the One to Four Starters category and assigning the difference to All Bench), you get the following modifications.
Overall Team: 13.66
Starting Line-up: 19.47
One to Four Starters: 10.43
All Bench: -1.99
This is an overall improvement in OR of 2.38 - a non-trivial increment, equivalent in a Pythagorean World to 70.8 wins (over the actual 66). Is it reasonable to believe that such opportunities were not just apparent but real? Again, I think it is. But the idea of dedicating the starting center to only play with the starting line-up? Too crazy to happen.
And finally, we can pursue an additional counterfactual. Perk's minutes were primarily (I think - someone correct me if I'm wrong) and significantly limited by "foul trouble". Let's now suppose that Doc somehow had also found a way to ignore the conventional wisdom that it is always a good idea to bench productive assets that have accumulated some notional threshold of early fouls. If so, it is not unreasonable to imagine that Perk could have been on the court 30% more, which would imply only about 32 minutes and 4 fouls per game. Fair?
The results, along the same line as before, are a Team OR of 14.97 - an even less trivial increment of 3.69 on the actual results, corresponding to 72.5 Pythagorean Wins.
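For readers unfamiliar with the "Pythagorean" conversion, here is a hedged Python sketch of the general idea. Everything specific in it is an assumption on my part: the exponent of 14 (one commonly used basketball value), the hypothetical baseline of 107.5 points per 100 possessions, and reading the ratings listed above as point margins per 100 possessions. The post does not say which formula or inputs were used, so this will not necessarily reproduce the 70.8 and 72.5 figures; it only shows the direction and rough size of the effect.

```python
def pythagorean_wins(off_rtg, def_rtg, exponent=14.0, games=82):
    """Expected wins from points scored and allowed per 100 possessions."""
    scored = off_rtg ** exponent
    allowed = def_rtg ** exponent
    return games * scored / (scored + allowed)

def wins_from_margin(margin, baseline=107.5, exponent=14.0):
    """Convert a per-100 margin to expected wins.

    ASSUMPTION: the margin is split evenly around a hypothetical
    baseline efficiency; the original post gives no such inputs.
    """
    return pythagorean_wins(baseline + margin / 2, baseline - margin / 2, exponent)

# Actual and counterfactual margins quoted in the post.
for margin in (11.28, 13.66, 14.97):
    print(margin, round(wins_from_margin(margin), 1))
```

Changing `exponent` or `baseline` shifts the win totals by a game or so, which is worth keeping in mind before reading too much into the decimal places.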
Returning to the larger point, if apparent opportunities for first-order improvements using very conventional data are not taken advantage of, for all the obvious institutional reasons, why believe that benefits from more high-falutin' approaches would accrue? Whither Moneyball?
Re: Will Success Come From Better Data?
Posted: Wed Sep 28, 2011 5:18 pm
by Crow
In 2007-8 Perkins played with each of Allen, Pierce and Rondo for about 95% of his time on the court in the regular season but with Garnett only about 60% of the time. Rondo-Allen-Pierce-Garnett-Perkins all out there together accounted for about 56% of Perkins' time. In the playoffs Doc did increase playing of Garnett-Perkins to about 90% of Perkins' time and Rondo-Allen-Pierce-Garnett-Perkins together to about 70% of Perkins' time. So some more of the available opportunity for playing this way was taken advantage of in the playoffs but additional opportunity was left on the table. Part of the reason may be that it performed at only about half the efficiency (in raw and Adjusted +/- terms) as it did in the regular season. In the playoffs the other 4 starters with PJ Brown or Posey performed better on both measures and with Davis it was almost equal on Adjusted but vastly better on raw +/-.
There can be some reasons or excuses for not pursuing the maximum efficiency in the regular season, but close to none in the playoffs (besides resting stars in a huge blowout win or loss to prevent injury and get them a little bit more rest).
Re: Will Success Come From Better Data?
Posted: Wed Sep 28, 2011 7:16 pm
by Crow
Looking at 8 big minute lineups of contenders for this past season who played a lot in both regular season and playoffs, the average dropoff in performance per 100 possessions from regular season to the playoffs was -6.6 on raw +/- and -8.6 on Adjusted +/- estimates. Every single one of these lineups dropped noticeably, with the Lakers and Spurs dropping the least and the Thunder dropping the most. The performance decline of these top lineups in the playoffs is mostly accounted for by the higher average quality of opponents faced in the playoffs (and increasingly so as you moved from round to round) compared to the regular season.
Only 11 lineups got used 100+ minutes in the playoffs. Arguably every playoff team could possibly have had one if they really pushed for it. 7 of the 16 actually had one. The 4 conference finalists, with of course far more opportunity to rack up minutes, had 8 of them (6 positive on raw +/-, 2 negative) and the other 3 were held by teams losing in the 2nd round (2 positive on raw +/-, 1 negative). Dallas had 2 of the 4 best performers, Miami 1, Boston 1. OKC the worst performer.
Using a fair per game basis it appears only 9 playoff teams had a lineup used over 10 minutes per game. Boston had the highest usage at close to 20 minutes per game, actually higher in minutes per game than the 3 previous playoffs. The Lakers were in the 10-14 minute range the 3 previous playoffs but went up to almost 16 minutes per game. Not enough minutes but mostly not effective enough against Dallas (-17 for this lineup in about 51 minutes, -56 overall).
A lot can depend on one lineup. I discussed previously how important the one super Dallas lineup was to their playoff success. For the entire playoffs Miami's total point differential was 56. One lineup produced 42 points of it, or 75%. The Celtics were only +12 in this playoff run but were +66 with Jermaine O'Neal and the other starters. That lineup was used almost 20 minutes per game, or about 90% of his minutes. Not sure if he could have played more minutes. Unfortunately it was pretty much the only lineup that worked on raw +/- for more than a few minutes, and a lot of the next most used were terrible. Without that 1 good lineup they might have been swept in the first round.
Both Dallas and Miami changed their top lineups from regular season to playoffs a lot (more than most) and both with mixed success, but Dallas found a lineup more than 50% better on raw +/- and more than twice as good on Adjusted +/- than Miami's, and they played it more than 50% more minutes in the same number of games. Dallas got 115 points of edge from that super lineup, while Miami's gave them 42 points of edge. Dallas' super lineup then beat Miami overall by 48 points in the series. They won the Finals by 13, so apart from that one super lineup Dallas lost by 35, or almost 6 points per game. Miami played their best bigger minute regular season lineup only 21 minutes against Dallas, or 3.5 minutes per game. They were -1. Nearly all of the loss came against other lineups.
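The "one lineup vs. everything else" splits quoted here are simple subtractions, sketched in Python below using only the margins given in the post. The helper name `outside_lineup` is illustrative, and the six-game length of the 2011 Finals is the one figure supplied from outside the post.

```python
def outside_lineup(total_margin, lineup_margin):
    """Point margin in all the minutes NOT played by the given lineup."""
    return total_margin - lineup_margin

FINALS_GAMES = 6  # the 2011 Finals went six games

# Dallas in the Finals: +13 overall, +48 with the super lineup.
dallas_rest = outside_lineup(13, 48)
dallas_rest_per_game = dallas_rest / FINALS_GAMES

# Miami over the whole playoffs: +56 overall, +42 from one lineup.
miami_lineup_share = 42 / 56

# Celtics over the whole playoffs: +12 overall, +66 with the starters.
celtics_rest = outside_lineup(12, 66)

print(dallas_rest, round(dallas_rest_per_game, 1))
print(miami_lineup_share, celtics_rest)
```

The same subtraction works for any team once the total margin and one lineup's margin are known.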
Raw +/- liked Chicago's best, bigger minute lineup in the playoffs, but Adjusted +/- was more critical. Hard to be sure what to conclude in this circumstance.
However, neither metric liked OKC's most used lineup. In the first 2 rounds it went -8 total, but against Dallas it went -27 in about 57 minutes. Stop using a horrendously performing starting lineup in a series? Not the Thunder. You can't sustain that level of loss from a lineup and win it all, this season or in the future. The rest of the Thunder lineups combined to go +7 against Dallas. But the insiders know best, right? Does not seem like it to me on that choice.
They did find Westbrook-Harden-Durant-Collison-Perkins in the playoffs after only giving it a 22-minute trial in the regular season. It had an extremely strong playoff run: +30 per 100 possessions on raw +/- and +13 on Adjusted +/-. It got 62 minutes of use overall, though that was still less than 4 minutes per playoff game (up from 1.3 minutes per game in the regular season after Perkins became available). This lineup was actually +12 against Dallas in 21 minutes. It might be hard to know which lineups to "chase" and which to let go at times, but, based on the early games' data, I would have let go of a lot of the starting lineup and chased this alternative a lot more rigorously. It was the only one of the 5 most used lineups against the Mavs that wasn't horrendous; instead it was tremendous. It would be hard for the contrast within this group to have been greater or more obvious. There were other top 10 in minute lineups that were great though... but, what do you know, they all had Collison playing. All lineups used over 10 minutes against Dallas that had Perkins without Collison were horrendous. That trend seems pretty clear cut.
http://basketballvalue.com/teamvsteam.p ... 20playoffs
In a short series you can wonder when the trend developed and whether they had enough information and time to use the lineup more. In this case it appears they used the Collison-Perkins combo about 18 minutes in the first 3 games (6 minutes per game), to great per minute success (their biggest one)... and then used it just 2 minutes in the final 2 games. Taking your best bigger minute lineup down from 6 minutes to 1 minute per game after great success, with not much else working, is a bit hard to fathom. They like to stick to their standard starting lineup and have for years. That worked well enough for them overall, even though they have been better with non-starting lineups than starting lineups for several years. In the end though, I'd score a miss for that judgment (or oversight) in this series.
Will they use Westbrook-Harden-Durant-Collison-Perkins more next season? Something to watch. Based on available data and brief analysis, at this moment I'd lean toward suggesting they "should", at least in the playoffs. Of course they should keep monitoring this lineup and all others, especially the most used, using all available tools for analysis.