Recovered from the files of WiLQ
Eli W
Joined: 01 Feb 2005
Posts: 327
PostPosted: Mon Dec 17, 2007 5:30 pm Post subject: Does Good Pitching Beat Good Hitting in Basketball? Reply with quote
I just put up a long piece on my blog about a technique to measure which aspects of the game are more controlled by the offensive team and which are more controlled by the defensive team. I've been working on this for a while and I think it could be a good starting point to a lot of interesting stuff, so I'd love to hear any feedback that people might have.
http://www.countthebasket.com/blog/2007 ... asketball/
_________________
Eli W. (formerly John Quincy)
CountTheBasket.com
Mountain
Joined: 13 Mar 2007
Posts: 355
PostPosted: Mon Dec 17, 2007 6:28 pm Post subject: Reply with quote
I appreciate this work. It provides data on many questions about the contest for control between offense and defense. I will give it another pass before perhaps offering comments or questions.
The volume of work is high and the clarity of presentation is high. I get the impression the quality of the analysis is solid too, though others may be able to ask good questions or offer comments on the study from a technical standpoint. As an outside observer here, I'd hazard to say that if teams are looking for the next analytical guy to bring inside, articles like these bolster your case.
Brian M
Joined: 25 Nov 2006
Posts: 17
PostPosted: Mon Dec 17, 2007 7:35 pm Post subject: Reply with quote
Interesting idea. A couple of issues to consider:
1) the basic hypothesis is that if the offense controls stat X more than the defense, then the OCR will be higher. Let's abbreviate that
offensive control -> high OCR
Even if this relationship holds, though, it doesn't justify the converse statement,
high OCR -> offensive control
There could be factors other than offensive control contributing to the skew in OCR. It would make the approach tighter if such potential confounds were either explained away (legitimately) or controlled for if they actually exist.
The way one addresses this issue hinges a lot on what one means by "control" though. In particular the word conjures the image of volitional control, teams/players/coaches intentionally affecting the outcome of a particular stat. But for instance, in FT% some of the variance arises from factors a player can control (practice, etc) but some from factors a player can't control (injury, having an off night, etc).
The way the OCR stat is composed, it seems to just be telling us to what extent the variance in a stat can be attributed to the offensive or defensive team. It is agnostic about the extent to which the variance is due to player effort, skill, or strategy, versus other factors. So perhaps a word more neutral than "control" would be better, e.g. VAO (variance attributable to the offense) or something catchier.
2) the technique might need to be adjusted if it were to be applied on the team level instead of being collapsed across teams. For instance, imagine we looked at the Spurs' FT%s for and against within a given season. We might find that the FT% OCR is lower than expected, because the std dev of the Spurs' FT% is affected only by within-team variance, whereas the std dev of their opponents' FT% is affected by both within-team variance and between-team variance.
I think the way you crunched the numbers in your piece controls for this by averaging offensive stats across teams as well, introducing between-team variance into the offensive data. But this would not be the case when looking at the OCR of individual teams and so some sort of adjustment would be needed.
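For concreteness, here is a minimal Python sketch of that pooled, league-level computation, assuming (as Guy spells out further down in the thread) that OCR = SD(offense) / (SD(offense) + SD(defense)) taken across team season totals; the team rows and column names are made up for illustration:
Code:
import pandas as pd

# Hypothetical season-level data: one row per team, with the team's own FT%
# ("ft_pct_for") and its opponents' FT% ("ft_pct_against").
teams = pd.DataFrame({
    "team": ["SAS", "DAL", "PHX", "DET", "CLE"],
    "ft_pct_for": [0.779, 0.796, 0.771, 0.734, 0.742],
    "ft_pct_against": [0.752, 0.748, 0.757, 0.751, 0.746],
})

def ocr(df, stat):
    # Offensive Control Ratio: the share of the cross-team spread in a stat
    # that shows up on the offensive side, OCR = SDO / (SDO + SDD).
    sd_off = df[stat + "_for"].std()        # spread of teams' own values
    sd_def = df[stat + "_against"].std()    # spread of their opponents' values
    return sd_off / (sd_off + sd_def)

print(round(ocr(teams, "ft_pct"), 2))

Because the SDs are taken across all teams on both the for and against sides, between-team variance enters both columns; in a single team's game-by-game version of the same ratio, the "for" column would carry only within-team variance while the "against" column would also carry between-team variance, which is the asymmetry described above.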
Eli W
Joined: 01 Feb 2005
Posts: 327
PostPosted: Mon Dec 17, 2007 11:17 pm Post subject: Reply with quote
I hear you on 1. "Control" is a strong word, and it could be that some high OCR's are caused by something other than the offense having more control than the defense. I'm not sure what that would be, but it's worth thinking about. You could say that greater variance just means greater variance, but I think that's shortchanging things a little. High OCR might not imply offensive control, but doesn't it imply "good pitching beats good hitting"? That seems like a more substantive conclusion, even if it's not all the way to "control."
You're right about the agnosticism on what "control" is made up of. In part this is because I think it could be very different for different stats. Sometimes it may be related to player skills, and sometimes to team strategies.
As for 2, I'll have to think about that more, especially if I decide to do some follow-up research along those lines.
Thanks for the comments.
_________________
Eli W. (formerly John Quincy)
CountTheBasket.com
Kevin Pelton
Site Admin
Joined: 30 Dec 2004
Posts: 707
Location: Seattle
PostPosted: Tue Dec 18, 2007 12:01 am Post subject: Reply with quote
Great work, Eli. This is one of those topics I've always been interested in but to which I have never devoted a full study.
This statement from my column you referenced looks pretty ignorant now: "Year in and year out, the standard deviations for offense are significantly higher than those for defense." It so happened that the years I studied the issue were some of the most offensively "controlled" in recent memory, though to make the claim of "year in and year out" I certainly should have looked at more than three years' worth of data.
Eli W wrote:
Looking at some 82games stats, we can see that the percentages of a team’s dunks and close shots that are assisted are mostly controlled by the offense (OCR’s of 0.60 and 0.62), but the defense seems to have more control over the percent of jumpers that are assisted (OCR of 0.53). I’m not sure how to interpret this.
I think this can be explained by the fact that where defenses really have control over assist rates is with the quality of their rotations. It seems reasonable that this would show up much more in terms of uncontested looks on the perimeter.
HoopStudies
Joined: 30 Dec 2004
Posts: 578
Location: Near Philadelphia, PA
PostPosted: Tue Dec 18, 2007 10:52 am Post subject: Reply with quote
You looked at a lot of things and considered an important topic -- control. Looking at FT% is a good start, too. At first, you may think that it is purely an offensive control because there is no D on a foul shot. So you'd think that your OCR should be very high. But it isn't so far above other things. I think that suggests 2 things. First, it suggests that D does impact FT% a little. It does. Defenses do try to foul worse foul shooters and some are better at it than others. Second, it suggests a flaw in the method, one that I don't know a priori how to fix. But the concept is good and the results are probably solid overall. Small differences probably don't mean much, but big ones probably do.
You did miss what I think is the big thing about the study -- it suggests what stats to measure or put in a boxscore. If this says that blocks are controlled by the defense much more than the offense, then the BA stat showing up in boxscores isn't all that valuable. If this says that turnovers other than offensive fouls and steals are more offensively controlled, whereas offensive fouls and steals are D-controlled, that suggests a breakdown of individual turnovers to better track those. That's a better use of our boxscore.
Making that kind of difference would be very nice and your work is good enough to tell that story. So I would suggest that you do a follow-up/edit to tell that story of how it suggests what should be in the boxscore.
Nice stuff...
_________________
Dean Oliver
Author, Basketball on Paper
http://www.basketballonpaper.com
Mountain
Joined: 13 Mar 2007
Posts: 355
PostPosted: Tue Dec 18, 2007 2:12 pm Post subject: Reply with quote
Taking the approach to team level would be very helpful. Comparing team performance to league average or other top teams would give better context on team performance.
With team performance we can see the results. The team knows its points of strategic emphasis and can judge those direct results and by-products.
Game to game, when the variance attributable to offense or defense rises above some level, say .65 or .70, for a particular stat, what is the correlation to winning? Which correlations are highest? Are those the things you should focus on "controlling"?
When one stat is "controlled" on average, how do the rest of the stats look? What did you also gain or lose?
There are lots of important ways to use this approach.
And on the side it could be good reference material for Ben F. and his XOHoops simulation.
Eli W
Joined: 01 Feb 2005
Posts: 327
PostPosted: Tue Dec 18, 2007 2:37 pm Post subject: Reply with quote
HoopStudies wrote:
You did miss what I think is the big thing about the study -- it suggests what stats to measure or put in a boxscore. If this says that blocks are controlled by the defense much more than the offense, then the BA stat showing up in boxscores isn't all that valuable. If this says that turnovers other than offensive fouls and steals are more offensively controlled, whereas offensive fouls and steals are D-controlled, that suggests a breakdown of individual turnovers to better track those. That's a better use of our boxscore.
Yeah, I thought about putting something like that in there. The idea of offensive or defensive control is already recognized implicitly in what stats we choose to record. I'll try to think more about that and do a follow-up post. One thing I'd want to look at first, since we're mainly talking about player boxscore stats, is variance on the player level as opposed to the team level. I think in many cases that will mirror what I found on the team level, but there may be some differences.
_________________
Eli W. (formerly John Quincy)
CountTheBasket.com
Guy
Joined: 02 May 2007
Posts: 43
PostPosted: Tue Dec 18, 2007 4:46 pm Post subject: Reply with quote
This is very interesting work, Eli. Team variance can tell us a lot about how much of a skill is really being measured by a statistic.
I would suggest one change that I think will clarify the story these data tell. You should remove the random variance from each of these measured SDs, so that you are measuring the real underlying variance created by the team. Because the sample size, and thus the random error, varies for different statistics, this can change things quite a bit. The formula is SD(true) = SQRT(SD(observed)^2 - SD(error)^2), where SD(error) = SQRT(mean * (1 - mean) / N).
Take FT% as an example. I'm guessing your OCR=.69 value means something like SD(o)=.03 and SD(d)=.0135. With an average N of around 2144, the random error SD = sqrt(.75*.25/2144) = .0093. So, TrueSD(o) = sqrt (.03^2 - .0093^2) = .0285. TrueSD(d) would be .0097. That is, if teams could play an infinite number of games, these are the variances we would observe. So your new OCR is .0285/(.0285+.0097) = .75. (I'm not sure if SDO/(SDO+SDD) is the best possible measure of control, but seems reasonable to me.)
Where N is smaller, this approach changes your results more substantially. For example, on 3P% I'm guessing your OCR=.59 means something like SDO=.0177 and SDD=.0123. However, the random error SD for N=1330 would be about .013, about the same as the SDD. (Because team 3PAs vary a lot it's a little more complicated than that, but the result should be similar.) In that case (assuming my numbers are about right), teams have NO real ability to affect the 3P% of their opponents -- the variance we see is just noise. So the real OCR is close to 1.
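To make that arithmetic easy to replay, here is a small Python sketch of the noise-removal step using the guessed FT% numbers above (the SDs and N are illustrative guesses, not Eli's actual figures):
Code:
from math import sqrt

def true_sd(observed_sd, mean, n):
    # SD(true) = sqrt(SD(obs)^2 - SD(error)^2), with binomial sampling error
    # SD(error) = sqrt(mean * (1 - mean) / n).
    error_sd = sqrt(mean * (1 - mean) / n)
    return sqrt(max(observed_sd ** 2 - error_sd ** 2, 0.0))

# FT% example: guessed SDs of .03 (offense) and .0135 (defense),
# league mean around .75, roughly 2144 attempts per team.
sd_o = true_sd(0.0300, 0.75, 2144)      # about 0.0285
sd_d = true_sd(0.0135, 0.75, 2144)      # about 0.0097
print(round(sd_o / (sd_o + sd_d), 2))   # adjusted OCR, about 0.75

Plugging in the actual season SDs instead of these guesses should reproduce the revised OCRs Eli reports below.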
Quote:
One thing I'd want to look at first, since we're mainly talking about player boxscore stats, is variance on the player level as opposed to the team level. I think in many cases that will mirror what I found on the team level, but there may be some differences.
Actually, the story will look very different in some cases. Comparing the player and team variances can shed light on how much an individual player stat translates into actual gains at the team level.
HoopStudies wrote:
You did miss what I think is the big thing about the study -- it suggests what stats to measure or put in a boxscore.
Good point. And I'd go even further: team variance can help settle -- or at least narrow -- some of the debates about player valuation metrics. The amount of team variance tells us how big an impact a factor has on actual wins and losses. Rebounding is a good example: the variance at the team level is fairly small, so we know that rebounding is not what distinguishes good teams from bad teams. That means the actual talent difference among players cannot be nearly as large as players' individual REB totals seem to suggest. So any player metric that is driven mainly by rebounds cannot possibly be explaining actual wins and losses in the NBA.
Eli W
Joined: 01 Feb 2005
Posts: 327
PostPosted: Tue Dec 18, 2007 6:12 pm Post subject: Reply with quote
I had thought about looking at SD(true) rather than SD(obs), but I wasn't sure if I could use the same method for teams that I would use to calculate the SD(true) for players. Your post suggests that the same method is appropriate (this MGL post made me unsure).
I will go back and recalculate the OCR's with SD(true)'s.
For your example stats, using just the 2006-07 season, the OCR(obs) of FT% was 0.72 and the OCR(obs) of 3P% was 0.55. The OCR(true) of FT% rose to 0.80, and the OCR(true) of 3P% rose to 0.78 (not 1, as SDD(obs) was slightly greater than SD(rand), and SDO(obs) wasn't much higher than SDD(obs)).
_________________
Eli W. (formerly John Quincy)
CountTheBasket.com
Harold Almonte
Joined: 04 Aug 2006
Posts: 372
PostPosted: Tue Dec 18, 2007 8:41 pm Post subject: Reply with quote
How to interpret this? It was suspected that FG% depends a little more on the shooter than on the defender, but that balance also depends on distance from the rim, and of course most FGAs are taken close; it could also mean that close FGAs tend to be more heavily contested, given the tighter space. Deflections might not be worth as much if they don't prevent a possible assist. The act of rebounding may not have more influence on gaining possession than floor-positioning advantage does, and probably doesn't deserve all the weight relative to the missed FG, regardless of whether the inside game is biased toward the defense.
Can we take this study as a kind of test?
Guy
Joined: 02 May 2007
Posts: 43
PostPosted: Wed Dec 19, 2007 10:53 am Post subject: Reply with quote
Eli: When you revise your OCRs, any chance of including the observed SDs in your spreadsheet? Would be a nice resource.
I think another use of the team SDs is that we can infer something about the actual spread of talent at the player level. Using rebounds as an example, the SD is around 1.4 REB/game. With 5 players, and PSD = player SD, 1.4^2 = PSD^2 + PSD^2 + PSD^2 + PSD^2 + PSD^2. That would give us a player SD of about 0.63 for REB48. In other words, the difference between a good (+1 SD) and bad (-1 SD) rebounder is about 1.2 rebounds per game. In comparison, I believe the SD for REB48 is around 3.8, or 6 times as large as the real spread in talent. That suggests rebound totals are much more a function of role and opportunities than true rebounding talent.
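A quick Python version of that back-of-the-envelope calculation, under the stated assumption that a team's total is the sum of five roughly independent player contributions:
Code:
from math import sqrt

team_sd = 1.4                      # team SD in REB/game
player_sd = team_sd / sqrt(5)      # teamSD^2 = 5 * playerSD^2  ->  about 0.63
talent_spread = 2 * player_sd      # +1 SD vs -1 SD rebounder: about 1.25 REB
observed_sd = 3.8                  # quoted player-level SD for REB48
print(round(player_sd, 2), round(talent_spread, 2),
      round(observed_sd / player_sd, 1))

The last number, roughly 6, is the ratio of the observed player spread to the spread implied by the team-level variance, which is the sense in which rebound totals look more like role and opportunity than talent.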
However, people on this board with better statistical chops than mine should weigh in if they think there are incorrect assumptions here.
Last edited by Guy on Fri Dec 21, 2007 8:56 pm; edited 1 time in total
John Hollinger
Joined: 14 Feb 2005
Posts: 95
PostPosted: Fri Dec 21, 2007 6:02 pm Post subject: Reply with quote
This is some fascinating stuff. On the assists thing, I presume the defense controls it by their decision to double-team the post and/or trap the S/R.
On the stats in general, as you mentioned, another way to determine who "controls" is to look further at deviations among individual players. No player has his shot blocked as infrequently as Luke Ridnour blocks them, nor as frequently as Josh Smith does -- thus we can reasonably assume the defense has much more control over this event.