NBA to use six cameras to collect statistical data
Posted: Fri Apr 15, 2011 3:41 am
mtamada
Joined: 28 Jan 2005
Posts: 376
Posted: Fri Jun 12, 2009 6:07 pm Post subject: NBA to use six cameras to collect statistical data
A sportswriter had mentioned this in his NBA column yesterday, but I forgot who/where he was. But this article has now gone out over AP today.
http://tinyurl.com/mbtzw2
or
http://www.chicagotribune.com/business/ ... 8197.story
Major League Baseball's Pitch F/X technology has revolutionized sabermetrics. The NBA is about to attempt the same for basketball, using six high-def cameras, not for the purpose of creating video per se, but for generating high-granularity data: players' locations on the floor, shot locations, shot trajectories (e.g. one non-statistical purpose is to see if a shot was goaltended or a legit block), evaluating players' defensive positioning, etc.
The possibilities are endless and breathtaking.
Unfortunately, I suspect that one will have to be an NBA insider to get one's hands on these data. STATS LLC is operating the system; they're not as bad as Elias in terms of monopolizing data, but I'm not expecting to see them post the data on a public website (the articles don't say).
Can our members who are on the inside of the NBA make any comments about this -- prototype studies, possible uses, etc.?
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 708
Location: Raleigh, NC
Posted: Fri Jun 12, 2009 6:15 pm
Quote:
The league hopes the technology will be ready for use by next season's playoffs, where it could be an asset to the teams involved and the broadcasts.
I don't suspect we'll be using it to do any major collection anytime soon...
Getting software that can collect data from video of NBA games will clearly push forward the understanding with respect to player locations and the like.
This is the first step in collecting some new data on a large scale. Nuances of the game will have to be figured out (like can we tell if a player sets a screen...?), but this would be a good move in that direction.
_________________
I am a basketball geek.
mtamada
Joined: 28 Jan 2005
Posts: 376
Posted: Fri Jun 12, 2009 6:29 pm
I guess the other thing that I'm wondering is where the push for this is coming from ... the technology is ripe for utilization, but I'm sure this is still an expensive proposition. I'd be surprised if the NBA's statistical analysts had enough pull to convince the league to spend what I presume is millions of dollars just so they can get more stats. But maybe coaching staffs, who are already spending many man-hours and dollars on video, were pushing for this (I think there's at least one third-party company which already does business with NBA teams basically by watching video for them and tabulating the results)? Or Mark Cuban, perhaps with a few other analytically minded owners or GMs? Or will the networks be given access to the video and stats, to use in their broadcasts? (Doesn't sound like it, from the article.)
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 708
Location: Raleigh, NC
Posted: Fri Jun 12, 2009 7:41 pm
I think this sort of thing is a natural evolution. It could be coming from the needs of more than one source, although our interest is clearly in data collection.
Unless we hear something publicly about it, I think it's safe to assume this sort of technology can be utilized by a variety of groups in the NBA. Thus it's the sort of endeavor taken on to fill multiple needs.
_________________
I am a basketball geek.
John Hollinger
Joined: 14 Feb 2005
Posts: 175
Posted: Sat Jun 13, 2009 5:16 pm
I got a demo of this thing before Game 4, and it is pretty darn cool. They're still working out the kinks and obviously they need to develop all the back end stuff to track all the information that this thing is capable of collecting, but in the next 5-10 years I think this might be a total game-changer.
flyerfanatic
Joined: 14 May 2009
Posts: 5
Posted: Sat Jun 13, 2009 6:16 pm
this sounds cool. are they going to allow people to see what data it collects?
royce.toyfu
Joined: 11 Jul 2006
Posts: 19
Posted: Mon Jun 15, 2009 7:08 am
John Hollinger wrote:
I got a demo of this thing before Game 4, and it is pretty darn cool. They're still working out the kinks and obviously they need to develop all the back end stuff to track all the information that this thing is capable of collecting, but in the next 5-10 years I think this might be a total game-changer.
John, do you think you'll be able to get access to this footage through ESPN? Could there be a sort of Edge NFL Matchup equivalent done for the NBA?
Mountain
Joined: 13 Mar 2007
Posts: 1527
Posted: Mon Jun 15, 2009 5:09 pm
(I want to say something fairly brief on this so I will.)
I have thought and speculated about the value of this kind of data before (and talked about a calculus-based global model of basketball at different levels of analysis).
It would help move even further from looking at discrete players (or say particles) and beyond just lineups (particle sets with some unity / some independence) to lineups running plays or systems and interacting with the opponent's. Mass / energies in motion with characteristics that guide / reveal "what they want to do" and determine how they interact with others.
Which NBA team hires the first physicist to analyze the data?
Economists, environmental and mechanical engineers, clinical psychologists, poker geniuses and others surely have models, tools, and awareness that would be useful.
But I'd see if a physicist (probably at the particle level) could use his tool set to model and analyze this information. Seems apt to try to me.
The GMs probably won't. Well maybe one guy might, if he is reading and still has budget authority left and isn't satiated yet.
If any of the data becomes publicly available or if public clones eventually reveal it somebody should mess around with it in this way. Maybe Free Darko could find somebody and then move the analysis of the data into terms that the insiders or the fans can relate to. Or at least a few.
This could revolutionize basketball analytics but I assume David Stern will eventually think whether he wants to go further down this road. There are so many fans that react negatively to this direction. Could this not only revolutionize basketball analytics but also change "the game" too far? Which is viewed to a large extent by the league and owners as an entertainment product instead of as a "pure" or epic sport (the view of say a Jerry West) or physical/intellectual challenge / puzzle (the view of many here?) It might. Or maybe it can be accommodated, as the world in so many areas gets more high-tech, high-thought but mostly behind the scenes and little known or understood by most.
Or maybe this ends up just being or mostly about building a better video game.
Last edited by Mountain on Tue Jun 16, 2009 1:05 pm; edited 2 times in total
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
Posted: Mon Jun 15, 2009 8:32 pm
Mountain wrote:
I have thought and speculated about the value of this kind of data before (and talked about a calculus-based global model of basketball at different levels of analysis).
Can you explain this further? It sounds interesting, but I'm not really sure what "a calculus-based global model" really means.
Mountain
Joined: 13 Mar 2007
Posts: 1527
Posted: Mon Jun 15, 2009 10:46 pm
Alright Gabe, since you asked for more.
(Though I've said much of it before.)
I've talked about possibly taking adjusted +/- down from player level or offensive / defensive splits to the 4 Factors of each a few times over the last couple years and even about splitting them into the direct part (the adjusted input credit for just where the player had the lead role in the play action) and the apparent team level adjusted effects (for all the other plays when he isn't). Splitting adjusted into 16 parts and using that to represent the full range of a player's impacts.
I've mentioned that you could try to write equations that go from attributes (age/experience, standing reach, wingspan, weight, hops, pure speed, pure agility, dribble speed and agility, basketball intelligence, quality of previous instruction, shot chart preferences, etc.) to factor performance (factor performance as perhaps the calculus derivative of 16 unique formulas based on player attributes for each partial Factor- found by some form of regression or using some other method of hierarchical / multilevel modeling or aided by other ways to get at covariances) though that would probably be difficult and may end up with limited success as a global model of all players but you might find some useful tidbits along the way.
Recently I've been on the lineups as the unit of production train of thinking. But I'd modify that and try to fill the gap between what the sum of all the 16 parts of the player adjusted data suggests and what the adjusted lineup data suggests by proposing that lineup performance is the sum of the individual adjusted factor data points + any interactive effects of pairs (I think there are 10 pairs on the court at any time) or perhaps production subunits- perimeters / interiors (that would be simpler and maybe more meaningful) or maybe use both. Maybe the interactive effects could be modeled or derived with the aid of calculus too.
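As a rough illustration of the bookkeeping described above (a sketch only, with made-up labels and values, not anyone's actual model), the pairs in a five-man lineup can be enumerated and the individual values summed together with any pair interaction terms. The names `lineup_estimate`, `player_values`, and `pair_effects` are hypothetical and introduced just for this example.

```python
from itertools import combinations

# Hypothetical five-man lineup; the labels are placeholders, not real data.
lineup = ["PG", "SG", "SF", "PF", "C"]

# Every unordered pair of teammates on the floor together.
pairs = list(combinations(lineup, 2))


def lineup_estimate(player_values, pair_effects, lineup):
    """Sum individual values plus any known pair interaction terms.

    Pairs missing from pair_effects are treated as having no interaction.
    """
    base = sum(player_values[p] for p in lineup)
    interaction = sum(
        pair_effects.get(pair, 0.0) for pair in combinations(lineup, 2)
    )
    return base + interaction


# Made-up inputs: every player worth 1.0, one pair with a small bonus.
values = {p: 1.0 for p in lineup}
effects = {("PG", "C"): 0.5}
estimate = lineup_estimate(values, effects, lineup)
```

Five players give 10 unordered pairs, so a model with one interaction term per pair adds 10 parameters per lineup on top of the individual terms. Grouping into perimeter / interior subunits instead would cut that number down.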
Move from player capabilities to expressed production (derivation) and then put them together along with the interactive elements (integration).
But you probably need to add coach and system on their own (on average). And the coach's choices in giving out roles on plays and calling plays.
And use splits by opponent type as well. And maybe add the ref effects.
And you could go from average adjusted player factors to marginal player factors under various conditions. Again I think calculus could play a role if you had a function for what was contributing to what was going on that was accurate.
And then review the coach's actual pattern vs. optimal use of all this information too in giving out roles and calling plays. The coach's marginal impacts. Apparently we are going to be hearing soon that coaching doesn't matter much? I think the Finals and a lot of the rest of the data I see suggest it matters pretty heavily.
Some inside and on blogs seem to largely think of it as a sum of discrete individuals but the variance of lineup data appears to suggest otherwise on adjusted +/- (though I haven't done enough to prove this and the mostly small samples make it near impossible to do to any meaningful degree of confidence).
Even at the player pair level there are many things that could be real vs noise- players who use different zones and open other areas to others, players who prefer the exact same zones, players with ability to score from certain zones but not given a role to do so in the play or the system, players who can and will make a certain type pass to a team vs not, where players rebound best, the impact of a guy who over-pressures on the perimeter alongside an interior guy who is quick to respond vs not, etc. But with the compiled video and the data from it you can get an indication from one tool and then try to map to the other.
And you could look at the data by position, types, type sequences and more combinations of elements and levels of study. I shared some of these type research variations / extensions with a certain GM but whether it was read or read closely with an open mind (perhaps in spite of writing style) or read and rejected or read and considered already done and surpassed I probably won't find out. Oh well, his choice from a position where he can make that call better than I can second-guess it, probably.
Here are some of the earlier posts along these lines:
http://tinyurl.com/nro2t9
http://tinyurl.com/mauwnt
And I'll mention this fairly recent one again since it gets at the quality of the adjusted data and ways to re-run it and maybe learn more.
http://tinyurl.com/m2bdoa
In case anyone else wants to take a stab (or another one) at understanding it. Admittedly it is not that well written and I guess hard to follow. But I still think the underlying ideas are worth pursuing, just to see what it finds then evaluate the findings.
And by the way if anybody on the inside is really paying attention you could use this method to get at least a first level estimate of adjusted on plays for a specific team (using a method similar to Eli W's first lineup adjusted post) if you mapped the data to the video or at least the start & end conditions of plays (using that or perhaps a method similar to what Ryan Parker did).
Shot charts should play a pretty large role in trying to advance the understanding of why some lineups work better than others. Ryan Parker already has pretty exact floor locations of shots, an aptitude for working with databases and interest in the connection between start and end events and has already started looking at how players being on the court affects team level shot charts. Maybe it would be worthwhile to look at whether a made inside shot (or say 3) has a recognizable and pretty immediate impact on the effectiveness of the outside game or vice versa with a made 3 pointer (or a trio of them). Maybe all the running around before a shot is launched matters a good deal or maybe you can learn enough from just the shot data- at least til you have access to more data to analyze the running around to see what matters, what creates the most good looks in the right places.
On another issue:
Jon Nichols and Eric Weiss are digging further into the college to NBA transition projection field (following John Hollinger and Hoopsanalyst). Maybe someone will check whether PER or WinScore or some other metric at the college level is better at projecting NBA performance or look at the continuum of WS divided by PER (by position) and see whether the folks in the middle are the most stable or if one end outperforms the other or if it varies between interior and perimeter players. With the big differences being how shooting efficiency and defensive rebounding are scored. Maybe PER does better with perimeter players and WinScore with bigs? I'd guess that but it could end up being the reverse and surprising.
Last edited by Mountain on Sat Jun 20, 2009 2:20 pm; edited 16 times in total
mtamada
Joined: 28 Jan 2005
Posts: 376
Posted: Tue Jun 16, 2009 12:16 am
Quote:
I don't suspect we'll be using it to do any major collection anytime soon...
Quote:
they need to develop all the back end stuff to track all the information that this thing is capable of collecting, but in the next 5-10 years I think this might be a total game-changer.
Hmm, so the NBA may be putting the data-collection Cart before the analytical Horse. I.e. setting up the technological infrastructure to collect the data before, apparently, having the data-analysis infrastructure to know what to do with this soon-to-be goldmine.
Let the Gold Rush begin. Mountain is right, there should soon be more hiring of economists, engineers, physicists, etc. to start staking claims in these goldfields.
(There's a reverse twist here: with the California Gold Rush, mining stakes were public information, and the goldminers got to keep their gold personally, i.e. the private property was the gold itself. With these data and NBA teams' proprietary analysts, the "gold", i.e. the data, will be semi-public (infinitely share-able in theory with the public or a limited set thereof), and the "mining information", i.e. the analytical results, will be kept private, meaning kept secret by each team. Instead of Public Info and Private Gold, it'll be Public Gold and Private Info.)
Some additional idle wild speculation along Mountain's lines:
Another example of the "calculus-based" rather than discrete analysis that might soon be possible is to basically apply the Cartesian revolution to basketball events (probably using polar coordinates rather than Descartes' x-y coordinates however), i.e. distance and angle from the basket. The shot charts that we see almost universally nowadays in theory are doing the same thing -- indicating the location where the shot was taken from -- but a hand-drawn shot chart compared to a computer-measured-and-stored database of player and shot locations is like comparing geometry to analytic geometry.
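To make the distance-and-angle idea concrete, here is a minimal Python sketch that converts an (x, y) court location into polar coordinates centered on the basket. The coordinate convention (x along the baseline, y out toward midcourt, both in feet, angle 0 meaning straight out in front of the rim) and the function name `to_polar` are assumptions made for this example; whatever system the league ends up using may define things differently.

```python
import math


def to_polar(x, y, basket=(0.0, 0.0)):
    """Return (distance, angle_in_degrees) of a court location from the basket.

    Assumed convention: x runs along the baseline, y runs out toward
    midcourt, and the angle is measured from the straight-on direction,
    so 0 is directly in front of the rim and +/-90 is along the baseline.
    """
    dx = x - basket[0]
    dy = y - basket[1]
    distance = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dx, dy))
    return distance, angle


# Made-up shot locations: a straight-on 15-footer and a corner shot.
straight_on = to_polar(0.0, 15.0)
corner = to_polar(22.0, 0.0)
```

Once every shot is stored as (distance, angle), zone-based summaries become just one possible binning of the data rather than the starting point.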
This is already occurring in sabermetrics, where there are at least a couple of defensive rating systems which, instead of looking at zones of the baseball field, measure distance and angle from home plate. (Kenny Shirley presented one such system, SAFE, at the last NESSIS; BTW the deadline for sending in a proposal for the next one was today.)
OTOH, basketball unlike baseball does have some zones which are inherent to the game: the key and the 3-point line in particular, plus the midcourt line and the no-charge area. (And of course the sidelines and endlines, but even baseball has foul lines.)
Some further speculation along the lines that I think Mountain is getting at: instead of looking at players as the discrete units, each of whom takes the court with various attributes (rebounding prowess, shooting ability, etc.), I think he is proposing to look at the attributes themselves -- i.e. Rebounding, Shooting, etc. -- and measure how various combinations of players lead to lineups with various rebounding, shooting, etc. abilities. Somewhat analogous to going from fixed effects models to random effects models in econometrics, or from dummy variables to parametric measures.
I think there have been three revolutions in hoopstats. The first one, which evolved over decades, was the analytical quantitative approach that we see on this website (a nod to DeanO and others, but he says that Dean Smith had been using some of these measures already). The second one was the widespread availability of play-by-play data, and analysis thereof (a nod to RolandB here). I think these hi-def videos and their associated data will be the third one (not sure who to give the nod to; one to the NBA for doing this, and the second one to the goldminers who find the first lodes).
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
Posted: Tue Jun 16, 2009 11:52 am
Mountain wrote:
I've talked about possibly taking adjusted +/- down from player level or offensive / defensive splits to the 4 Factors of each a few times over the last couple years and even about splitting them into the direct part (where the player had the lead role in the play action) and the apparent team level adjusted effects (when he isn't). Splitting adjusted into 16 parts and using that to represent the full range of a player's impacts.
I've mentioned that you could try to write equations that go from attributes (age/experience, standing reach, wingspan, weight, hops, pure speed, pure agility, dribble speed and agility, basketball intelligence, quality of previous instruction, shot chart preferences, etc.) to factor performance (factor performance as perhaps the calculus derivative of 16 unique formulas based on player attributes for each partial Factor- found by some form of regression or using some other method of hierarchical / multilevel modeling or aided by other ways to get at covariances) though that would probably be difficult and may end up with limited success as a global model of all players but you might find some useful tidbits along the way.
So instead of just an overall +/-, it would give you like a team-wise eFG% +/-, a team-wise OReb% +/-, etc, etc. That's what you mean?
Mountain
Joined: 13 Mar 2007
Posts: 1527
Posted: Wed Jun 17, 2009 2:25 am
I meant what I said but I'll try to explain one more time in a bit expanded fashion.
I would ideally take adjusted +/- down to the 4 factors. To do so you'd run the regression multiple times (instead of just one pass based on points) using shots made, rebounds, turnovers as the input, and the adjusted factor for FT/FG could be found by taking the complete adjusted and subtracting the values found for the other parts.
But to separate the player's direct impact from his team level influence for any factor you'd run the regression twice. For example giving credit one time just when he scored a hoop directly. Then you'd do it again from when anyone else did on the team. And I'd do that for the other factors, offense and defense. That would give you 16 partial Factors.
This would allow you to see estimates of a player's direct Factor impacts and as you say team level impacts (on other teammates) for all the Factors. It is both, not just one.
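The "run it once per factor, then get the last piece by subtraction" step can be sketched in Python with NumPy. Everything here is an assumption for illustration: the stint matrix and outcomes are random numbers, the fit is plain unweighted least squares, and a real adjusted +/- run would use actual stint data and its own weighting or regularization choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n_players, n_stints = 6, 40

# Toy design matrix: +1 if the player is on the floor for one side,
# -1 for the other side, 0 if he is on the bench (random, not real lineups).
X = rng.choice([-1.0, 0.0, 1.0], size=(n_stints, n_players))

# Made-up per-stint margins for three measured components
# (think shooting, rebounding, turnovers).
components = rng.normal(size=(n_stints, 3))

# The total margin also contains a fourth, unmeasured part.
total = components.sum(axis=1) + rng.normal(size=n_stints)

# One least-squares fit per outcome: lstsq accepts several columns at once.
coef_parts, *_ = np.linalg.lstsq(X, components, rcond=None)
coef_total, *_ = np.linalg.lstsq(X, total, rcond=None)

# Fourth partial by subtraction: complete adjusted minus the other parts.
coef_remainder = coef_total - coef_parts.sum(axis=1)
```

Because ordinary least squares is linear in the outcome, the subtracted coefficients match what a direct regression on the leftover margin would give. That equivalence is not guaranteed once the individual runs use different weights or penalties.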
But this is just theoretical at this time. It is doable and might be worth doing if you were an insider with time and the required zeal to know as much as possible.
If you had this level of information and adjusted 4 factors for lineups and player pairs and the video information referenced in the original post I think you could figure a lot of what is actually happening positive and negative and how to further optimize and be way better off than just receiving the composite adjusted +/- or even the offensive / defensive splits and wondering what exactly pushed it there.
Was it his shooting, or his passing to other shooters or his direct turnovers or his decision-making that led to turnovers elsewhere or his trips to the line or his ability to hit teammates and get them to the line or his rebounding directly or maybe impact on rebounding by boxing out or occupying multiple defenders? Splitting adjusted into 8 partials for offense and 8 for defense would give you some leads.
I am not sure if the errors would stay the same as for the full adjusted or get worse. But that is an issue for the future.
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
Posted: Wed Jun 17, 2009 3:22 pm
Mountain wrote:
I would ideally take adjusted +/- down to the 4 factors. To do so you'd run the regression multiple times (instead of just one pass based on points) using shots made, rebounds, turnovers as the input, and the adjusted factor for FT/FG could be found by taking the complete adjusted and subtracting the values found for the other parts.
Right, that's essentially what I said.
Mountain wrote:
But to separate the player's direct impact from his team level influence for any factor you'd run the regression twice. For example giving credit one time just when he scored a hoop directly. Then you'd do it again from when anyone else did on the team. And I'd do that for the other factors, offense and defense. That would give you 16 partial Factors.
Are you familiar with the issues surrounding multiplicity, alpha spending functions, and controlling Type I Error? Performing the procedure you describe would open up a swath of issues regarding the veracity of any results obtained. Perhaps this is something to consider.
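For a sense of scale, here is a small Python calculation of the multiplicity problem for 16 tests. It assumes the tests are independent and that every null hypothesis is true; the 16 partial regressions would share the same data, so independence is only a simplifying assumption here, and the 0.05 level is just a conventional choice.

```python
alpha = 0.05   # conventional per-test significance level (an assumption)
n_tests = 16   # one test per partial Factor

# Chance of at least one false positive across all tests,
# if the tests are independent and every null hypothesis is true.
familywise_error = 1 - (1 - alpha) ** n_tests

# Bonferroni correction: the per-test threshold that keeps the
# familywise error rate at or below alpha.
bonferroni_alpha = alpha / n_tests

print(f"familywise error rate: {familywise_error:.3f}")
print(f"Bonferroni per-test alpha: {bonferroni_alpha}")
```

Under those assumptions the chance of at least one spurious "significant" result is about 0.56, which is why uncorrected results from 16 separate runs need to be read with caution.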
Mountain wrote:
I am not sure if the errors would stay the same as for the full adjusted or get worse. But that is an issue for the future.
My guess is they would get worse, since you're teasing apart the data even further.
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 708
Location: Raleigh, NC
Posted: Wed Jun 17, 2009 3:42 pm
So multiplicity seems interesting to me. Here's something to read later: http://www.jstor.org/pss/2531814
This is the best my library could do. Point me to this.
Laughing
Mountain
Joined: 13 Mar 2007
Posts: 1527
Posted: Thu Jun 18, 2009 1:56 am
Thanks, when I've pitched this sketch several times I was hoping for some feedback on feasibility or challenges or ways to accomplish it.
When time permits I'll check further into the other topics you raise.
I can see that the significance level might need to shift. As with all this adjusted data, at best you will end up fairly confident about most of the worst and best and not that confident of the level of the rest. Still that has some value and then you can check the tape or memory and decide how far to believe or adjust in specific cases. Believe rather than "know" for sure.
Until the partial Factor level adjusted data is derived you can look at the factor and partial Factor level raw data and the adjusted +/- or offensive and defensive splits and other data and make some guesses about what the most significant adjusted partial Factors might be and get a sense of their sign and in some cases magnitude. But there could be multiple sets of Factor or partial Factor solutions to the composite level adjusted scores for players of roughly similar power.
Last edited by Mountain on Fri Jun 19, 2009 8:04 am; edited 1 time in total
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
Posted: Thu Jun 18, 2009 1:25 pm
Ryan J. Parker wrote:
So multiplicity seems interesting to me. Here's something to read later: http://www.jstor.org/pss/2531814
This is the best my library could do. Point me to this. Laughing
That Gelman Bayesian book looks interesting, but the only mention of multiplicity is somewhere in the references, as far as I can tell.
The article you linked to seems to be on the right track.
The classic reference in my day job is this, but I'm not sure where you can find a copy of it since it's fairly old.
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC
Posted: Thu Jun 18, 2009 1:52 pm
Yeah it was just a reference. Sad
Oh, and nothing is too old for JSTOR!
http://www.jstor.org/pss/2530245
_________________
I am a basketball geek.
mtamada
Joined: 28 Jan 2005
Posts: 377
Posted: Thu Jun 18, 2009 4:25 pm
By coincidence I attended a seminar a couple of weeks ago where Brad Efron presented a paper on using Empirical Bayesian techniques to reduce one aspect of the multiplicity problem: selecting variables to use in a multivariate regression. He has even made available a program, written in R, which does the estimation (and in a typical joke, calls his program "EBay"). The paper and program are on his webpage, under the entries for 2008.
One side note that he mentioned during the seminar, which I wasn't familiar with, is that Bayesian estimation techniques are immune (perhaps under certain conditions?) to the multiplicity problems of classical/frequentist statistics: i.e. no need to shrink or regress estimates to the mean. At least I think that's what he said, it was an offhand comment, and I do not have a good knowledge of Bayesian statistics.
But now this paper says that Empirical Bayesian estimates may often differ significantly from Bayesian estimates, suggesting perhaps that Efron's EBay solution may not be adequate.
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC
Posted: Thu Jun 18, 2009 4:29 pm
Interesting stuff mtamada. I'm still learning, so when someone says "empirical bayes" I'm not exactly sure what they're referring to. I know the general idea is that you're using data to create priors in which you then use those priors with the data, in a sense using the data twice. Gelman doesn't prefer this terminology, calling the empirical part redundant. Should be some good reading there, though.
_________________
I am a basketball geek.
mtamada
Joined: 28 Jan 2005
Posts: 377
Posted: Thu Jun 18, 2009 4:48 pm
Ryan J. Parker wrote:
Interesting stuff mtamada. I'm still learning, so when someone says "empirical bayes" I'm not exactly sure what they're referring to. I know the general idea is that you're using data to create priors in which you then use those priors with the data, in a sense using the data twice. Gelman doesn't prefer this terminology, calling the empirical part redundant. Should be some good reading there, though.
Yeah, I'm no expert, here's a nice short summary of some views about Empirical Bayesian techniques, including Gelman's viewpoint.
From Efron's talk, I gather that one of the problems with Empirical Bayesian techniques is that the estimates have larger standard errors (greater uncertainty) than calculated, and maybe bias as well -- presumably because the estimates are based, not on true priors, but on parameters estimated from the data. But you don't know the standard errors of those estimated parameters ... or maybe it's hard to calculate how that uncertainty leads to additional uncertainty in the final Empirical Bayes estimates.
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
Posted: Fri Jun 19, 2009 7:42 am
Ryan J. Parker wrote:
Yeah it was just a reference. Sad
Oh, and nothing is too old for JSTOR!
http://www.jstor.org/pss/2530245
Yup, that's it. I can't tell if you can get the full article or not, but if you can it's definitely worth reading.
tpryan
Joined: 11 Feb 2005
Posts: 100
Posted: Sun Jun 21, 2009 4:16 am
Of course what Gabe was saying is that if many tests are performed, a "significant" result or two could be obtained due to chance alone. There is not a simple solution to that problem, in general, and adjusting alpha levels for individual tests can be too conservative.
I am not a Bayesian, nor an expert on it, but I tend to agree with Gelman regarding terminology. One starts with a reasonable prior, maybe even a noninformative prior, then combines that with data to obtain the posterior. Systems change over time, as George Box has emphasized, preferring to think of rapid change, so the posterior becomes the next prior, then posterior_2 is produced from more data, etc.
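The "posterior becomes the next prior" loop can be sketched in Python with the simplest conjugate case, a Beta prior on a make probability updated with batches of makes and misses. The flat Beta(1, 1) starting prior, the batch numbers, and the function name `update_beta` are all made up for illustration. This sketch also assumes the underlying rate stays fixed, so it does not capture the point about systems changing over time.

```python
def update_beta(prior_a, prior_b, makes, misses):
    """Conjugate Beta-Binomial update: return the posterior (a, b)."""
    return prior_a + makes, prior_b + misses


# Start from a flat Beta(1, 1) prior (a noninformative choice).
a, b = 1, 1

# Made-up batches of (makes, misses) arriving over time.
batches = [(7, 3), (4, 6), (9, 1)]

for makes, misses in batches:
    # The posterior from the previous batch is the prior for this one.
    a, b = update_beta(a, b, makes, misses)

posterior_mean = a / (a + b)
```

In this fixed-rate setting, updating batch by batch lands on the same posterior as one update with all the data pooled.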
"EBay". Clever. Very Happy
mtamada
Joined: 28 Jan 2005
Posts: 377
Posted: Mon Jul 20, 2009 3:47 pm
Back to the video-based data capture that started this thread: Sportsvision (the same company that brings you those yellow virtual first-down lines on TV football broadcasts, as well as baseball's Pitch F/X data) recently unveiled the next generation beyond Pitch F/X: tracking and timing of balls and players.
The prototype system is in place in San Francisco (I refuse to even attempt to keep up with the commercially-based name changes of ballparks and arenas; it was originally called Pac Bell Park). They recently had an all-day mini-conference in San Francisco to talk about the logistics and ins and outs of this technology. There was even a presentation about creating "heat maps" of Pitch F/X data (rather than scatterplots), which sounds similar to the colorful shot charts recently discussed here.
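The step from a scatterplot to a heat map is essentially binning locations into a grid and counting. Here is a minimal Python sketch with NumPy, using random points in place of real shot or pitch locations. The 50 by 47 foot half-court extent and the 10 by 10 grid are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up shot locations in feet on an assumed 50 x 47 half court,
# with x centered on the basket and y running out toward midcourt.
x = rng.uniform(-25, 25, size=500)
y = rng.uniform(0, 47, size=500)

# Bin the scatter of points into a 10 x 10 grid of counts.
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=[10, 10], range=[[-25, 25], [0, 47]]
)

# Each cell of `counts` is the number of shots taken from that region;
# coloring the cells by count gives the heat map.
busiest_cell = np.unravel_index(np.argmax(counts), counts.shape)
```

The same grid could hold makes divided by attempts instead of raw counts, which is closer to what the colorful shot charts show.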
Although the nature and flow of basketball games are very different from those in baseball, I hope that the NBA and MLB and their contractors are communicating and cooperating; this is all new stuff and rather than independently re-inventing the wheel, I think all of the sports and technologists could probably learn a lot about new techniques and best practices from each other. The NBA's upcoming system has been described as being provided by STATS LLC, but I don't know if they're literally doing the hardware, technology, etc., I think of them as being a data company rather than a technology company. Maybe STATS is already partnering with Sportsivision? Sportsvision's website says that they are the source of Hoops F/X data, evidently used by TV broadcasters. Did any NBA reps attend the Pitch F/X mini-conference (which evidently was open to anybody, all it lacked was publicity)?
Additional hopes for the future: whatever the NBA and STATS end up calling their 6-HD-camera setup ("Hoopsvision"?), I hope they make the data publicly available and organize conferences (or participate in existing ones such as Sloan, NESSIS, or NCSSORS). At NCSSORS, someone mentioned the reams of data that the NFL has -- but doesn't share. I think that's a mistake on the NFL's part, an exampe of 20th century thinking. Yes it cost them probably millions of dollar to create and collect those data, but by only sharing it within the NFL (or licensing the data for a very high price) they limit the amount of research that can utilize the data. 21st century thinking would tell them to make the data freely available; there are literally hundreds if not thousands of fans and would-be analysts who would love nothing more than to jump on those data and start doing analysis -- all for free. If the NBA and STATS make their Hoopsvision data freely available, somewhere out there is the next Dean Oliver who'll make some revolutionary findings with the data. (Or come to think of it, the original DeanO is still around too!)
HoopStudies
Joined: 30 Dec 2004
Posts: 705
Location: Near Philadelphia, PA
PostPosted: Mon Jul 20, 2009 4:45 pm Post subject: Reply with quote
mtamada wrote:
...If the NBA and STATS make their Hoopsvision data freely available, somewhere out there is the next Dean Oliver who'll make some revolutionary findings with the data. (Or come to think of it, the original DeanO is still around too!)
And, yes, I know what to do with the data. Definitely a good challenge, bringing every ounce of PhD training I got.
_________________
Dean Oliver
Author, Basketball on Paper
The postings are my own & don't necess represent positions, strategies or opinions of employers.
Crow
Joined: 20 Jan 2009
Posts: 815
PostPosted: Mon Jul 20, 2009 8:39 pm Post subject: Reply with quote
Detailed video translated into a multi-factor database would get at the situational FG%s of mid-range shots: open or degree contested, along with time on the shot clock and perhaps catch-and-shoot versus off the dribble. That would aid the management / reduction of mid-range shots.
Ideally you could use such a database to look at play sequences and try to find optimized sequences for your team vs different team types / lineup mixes and defensive schemes (based perhaps largely on where you get a shot and expected payoff instead of actual?), using the mid-range as a part of overall strategy, to the extent that you normally have to and not beyond that. In chess the masters often think in, what, 10- or 20-move sequences? Do the best NBA coaches?
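As a rough idea of what "situational FG%" from such a database might look like, here is a small sketch. The record layout, the field names, the 7-second cutoff for a "late" shot clock, and the six sample shots are all invented for illustration; nothing here reflects the actual format of the camera data.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical mid-range shot records; every field name is an assumption.
SAMPLE_SHOTS: List[dict] = [
    {"contested": True,  "clock": 4,  "type": "off_dribble",     "made": False},
    {"contested": True,  "clock": 3,  "type": "off_dribble",     "made": True},
    {"contested": False, "clock": 15, "type": "catch_and_shoot", "made": True},
    {"contested": False, "clock": 12, "type": "catch_and_shoot", "made": True},
    {"contested": False, "clock": 18, "type": "catch_and_shoot", "made": False},
    {"contested": True,  "clock": 10, "type": "catch_and_shoot", "made": False},
]

Situation = Tuple[bool, str, str]  # (contested, clock bucket, shot type)


def clock_bucket(seconds_left: float) -> str:
    """Split the shot clock into two buckets; the 7-second cutoff is arbitrary."""
    return "late" if seconds_left <= 7 else "early"


def situational_fg(shots: List[dict]) -> Dict[Situation, float]:
    """FG% for each (contested, clock bucket, shot type) combination."""
    tally: Dict[Situation, List[int]] = defaultdict(lambda: [0, 0])
    for shot in shots:
        key = (shot["contested"], clock_bucket(shot["clock"]), shot["type"])
        tally[key][0] += int(shot["made"])   # makes
        tally[key][1] += 1                   # attempts
    return {key: makes / attempts for key, (makes, attempts) in tally.items()}


if __name__ == "__main__":
    for situation, pct in sorted(situational_fg(SAMPLE_SHOTS).items()):
        print(situation, round(pct, 3))
```

With real data the cells would thin out quickly as more factors are added, so the sample size in each cell would matter as much as the percentage itself.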
And going beyond sequences you could usefully examine plays and how the swirl of motion and player attributes in that motion with their potentialities lead to more or less open and good shots. And then try to repeat the most successful plays and the critical pieces of plays precisely. If the cameras are fixed you could compare a successful, pretty play to other real game versions of it down to inches or practice it until it sufficiently fits the pattern.
mtamada
Joined: 28 Jan 2005
Posts: 376
PostPosted: Fri Jun 12, 2009 6:29 pm Post subject: Reply with quote
I guess the other thing that I'm wondering is where the push for this is coming from ... the technology is ripe for utilization, but I'm sure this is still an expensive proposition. I'd be surprised if the NBA's statistical analysts had enough pull to convince the league to spend what I presume is millions of dollars just so they can get more stats. But maybe coaching staffs, who are already spending many man-hours and dollars on video were pushing for this (I think there's at least one third-party company which already does business with NBA teams basically by watching video for them and tabulating the results)? Or Mark Cuban, perhaps with a few other analytically minded owners or GMs? Or will the networks be given access to the video and stats, to use in their broadcasts? (Doesn't sound like it, from the article.)
Back to top
View user's profile Send private message
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 708
Location: Raleigh, NC
PostPosted: Fri Jun 12, 2009 7:41 pm Post subject: Reply with quote
I think this sort of thing is a natural evolution. It could be coming from the needs of more than one source, although our interest is clearly in data collection.
Unless we here something publicly about it, I think it's safe to assume this sort of technology can be utilized by a variety of groups in the NBA. Thus it's the sort of an endeavor taken on to fill multiple needs.
_________________
I am a basketball geek.
Back to top
View user's profile Send private message Visit poster's website
John Hollinger
Joined: 14 Feb 2005
Posts: 175
PostPosted: Sat Jun 13, 2009 5:16 pm Post subject: Reply with quote
I got a demo of this thing before Game 4, and it is pretty darn cool. They're still working out the kinks and obviously they need to develop all the back end stuff to track all the information that this thing is capable of collecting, but in the next 5-10 years I think this might be a total game-changer.
Back to top
View user's profile Send private message AIM Address
flyerfanatic
Joined: 14 May 2009
Posts: 5
PostPosted: Sat Jun 13, 2009 6:16 pm Post subject: Reply with quote
this sounds cool. are they going to allow people to see what data it collects?
Back to top
View user's profile Send private message
royce.toyfu
Joined: 11 Jul 2006
Posts: 19
PostPosted: Mon Jun 15, 2009 7:08 am Post subject: Reply with quote
John Hollinger wrote:
I got a demo of this thing before Game 4, and it is pretty darn cool. They're still working out the kinks and obviously they need to develop all the back end stuff to track all the information that this thing is capable of collecting, but in the next 5-10 years I think this might be a total game-changer.
John, do you think you'll be able to get access to this footage through ESPN? Could there be a sort of Edge NFL Matchup equivalent done for the NBA?
Back to top
View user's profile Send private message
Mountain
Joined: 13 Mar 2007
Posts: 1527
PostPosted: Mon Jun 15, 2009 5:09 pm Post subject: Reply with quote
(I want to say something fairly brief on this so I will.)
I have thought and speculated about the value of this kind of data before (and talked about a calculus-based global model of basketball at different levels of analysis).
It would help move even further from looking at discrete players (or say particles) and beyond just lineups (particle sets with some unity / some independence) to lineups running plays or systems and interacting with the opponent's. Mass / energies in motion with characteristics that guide / reveal "what they want to do" and determine how they interact with others.
Which NBA team hires the first physicist to analyze the data?
Economists, environmental and mechanical engineers and clinical psychologists and poker geniuses and others surely have models. tools, awarenesses that would be useful.
But I'd see if a physicist (probably at the particle level) could use his tool set to model and analyze this information. Seems apt to try to me.
The GMs probably won't. Well maybe one guy might, if he is reading and still has budget authority left and isn't satiated yet.
If any of the data becomes publicly available or if public clones eventually reveal it somebody should mess around with it in this way. Maybe Free Darko could find somebody and then move the analysis of the data into terms that the insiders or the fans can relate to. Or at least a few.
This could revolutionize basketball analytics but I assume David Stern will eventually think whether he wants to go further down this road. There are so many fans that react negatively to this direction. Could this not only revolutionize basketball analytics but also change "the game" too far? Which is viewed to a large extent by the league and owners as an entertainment product instead of as a "pure" or epic sport (the view of say a Jerry West) or physical/intellectual challenge / puzzle (the view of many here?) It might. Or maybe it can be accommodated, as the world in so many areas gets more high-tech, high-thought but mostly behind the scenes and little known or understood by most.
Or maybe this ends up just being or mostly about building a better video game.
Last edited by Mountain on Tue Jun 16, 2009 1:05 pm; edited 2 times in total
Back to top
View user's profile Send private message
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
PostPosted: Mon Jun 15, 2009 8:32 pm Post subject: Reply with quote
Mountain wrote:
I have thought and speculated about the value of this kind of data before (and talked about a calculus-based global model of basketball at different levels of analysis).
Can you explain this further? It sounds interesting, but I'm not really sure what "a calculus-based global model" really means.
Back to top
View user's profile Send private message Send e-mail AIM Address
Mountain
Joined: 13 Mar 2007
Posts: 1527
PostPosted: Mon Jun 15, 2009 10:46 pm Post subject: Reply with quote
Alright Gabe, since you asked for more.
(Though I've said much of it before.)
I've talked about possibly taking adjusted +/- down from player level or offensive / defensive splits to the 4 Factors of each a few times over the last couple years and even about splitting them into the direct part (the adjusted input credit for just where the player had the lead role in the play action) and the apparent team level adjusted affects (for all the other plays when he isn't). Splitting adjusted into 16 parts and using that to represent the full range of a player's impacts.
I've mentioned that you could try to write equations that go from attributes (age/experience, standing reach, wingspan, weight, hops, pure speed, pure agility, dribble speed and agility, basketball intelligence, quality of previous instruction, shot chart preferences, etc.) to factor performance (factor performance as perhaps the calculus derivative of 16 unique formulas based on player attributes for each partial Factor- found by some form of regression or using some other method of hierarchical / multilevel modeling or aided by other ways to get at covariances) though that would probably be difficult and may end up with limited success as a global model of all players but you might find some useful tidbits along the way.
Recently I've been on the lineups as the unit of production train of thinking. But I'd modify that and try to fill the gap between what the sum of all the 16 parts of the player adjusted data suggests and what the adjusted lineup data suggests by proposing that lineup performance is the sum of the individual adjusted factor data points + any interactive effects of pairs (I think there are 10 pairs on the court at any time) or perhaps production subunits- perimeters / interiors (that would be simpler and maybe more meaningful) or maybe use both. Maybe the interactive effects could be modeled or derived with the aid of calculus too.
Move from player capabilities to expressed production (derivation) and then put them together along with the interactive elements (integration).
But you probably need to add coach and system on their own (on average). And the coach's choices in giving out roles on plays and calling plays.
And use splits by opponent type as well. And maybe add the ref affects.
And you could go from average adjusted player factors to marginal player factors under various conditions. Again I think calculus could play a role if you had a function for what was contributing to what was going on that was accurate.
And then review the coach's actual pattern vs. optimal use of all this information too in giving out roles and calling plays. The coach's marginal impacts. Apparently we are going to hearing soon that coaching doesn't matter much? I think the Finals and a lot of the rest of the data I see suggests it matters pretty heavily.
Some inside and on blogs seem to largely think of it as a sum of discrete individuals but the variance of lineup data appears to suggest otherwise on adjusted +/- (though I haven't done enough to prove this and the mostly small samples make it near impossible to do to any meaningful degree of confidence).
Even at the player pair level there are many things that could be real vs noise- players who use different zones and open other areas to others, players who prefer the exact same zones, players with ability to score from certain zones but not given a role to do so in the play or the system, players who can and will make a certain type pass to a team vs not, where players rebound best, the impact of a guy who over-pressures on the perimeter alongside an interior guy who is quick to respond vs not, etc. But with the compiled video and the data from it you can get an indication from one tool and then try to map to the another.
And you could look at the data by position, types, type sequences and more combinations of elements and levels of study. I shared some of these type research variations / extensions with a certain GM but whether it was read or read closely with an open (perhaps in spite of writing style) or read and rejected or read and considered already done and surpassed I probably won't find out. Oh well, his choice from a position where he can make that call better than I can second-guess it, probably.
Here are some of the earlier posts along these lines:
http://tinyurl.com/nro2t9
http://tinyurl.com/mauwnt
And I'll mentioned this fairly recent one again since it gets at the quality of the adjusted data and ways to re-run it and maybe learn more.
http://tinyurl.com/m2bdoa
In case anyone else wants to take a stab (or another one) at understanding it. Admittedly it is not that well written and I guess hard to follow. But I still think the underlying ideas are worth pursuing, just to see what it finds then evaluate the findings.
And by the way if anybody on the inside is really paying attention you could use this method to get at least a first level estimate of adjusted on plays for a specific team (using a method similar to Eli W/'s first lineup adjusted post) if you mapped the data to the video or at least the start & end conditions of plays (using that or perhaps a method similar to what Ryan Parker did)
Shot charts should play a pretty large role in trying to advance the understanding of why some lineups work better than others. Ryan Parker already has pretty exact floor locations of shots, an aptitude for working with databases and interest in the connection between start and end events and has already started looking at how players being on the court affects team level shot charts. Maybe it would be worthwhile to look at whether a made inside shot (or say 3) has a recognizable and pretty immediate impact on the effectiveness of the outside game or vice versa with a made 3 pointer (or a trio of them). Maybe all the running around before a shot is launched matters a good deal or maybe you can learn enough from just the shot data- at least til you have access to more data to analyze the running around to see what matters, what creates the most good looks in the right places.
On another issue:
Jon Nichols and Eric Weiss are digging further into the college to NBA transition projection field (following John Hollinger and Hoopsanalyst). Maybe someone will check whether PER or WinScore or some other metric at the college level is better at projecting NBA performance or look at the continuum of WS divided by PER (by position) and see whether the folks in the middle are the most stable or if one end outperforms the other or if it varies between interior and perimeter players. With the big differences being how shooting efficiency and defensive rebounding are scored. Maybe PER does better with perimeter players and WinScore with bigs? I'd guess that but it could end up being the reverse and surprising.
Last edited by Mountain on Sat Jun 20, 2009 2:20 pm; edited 16 times in total
Back to top
View user's profile Send private message
mtamada
Joined: 28 Jan 2005
Posts: 376
PostPosted: Tue Jun 16, 2009 12:16 am Post subject: Reply with quote
Quote:
I don't suspect we'll be using it to do any major collection anytime soon...
Quote:
they need to develop all the back end stuff to track all the information that this thing is capable of collecting, but in the next 5-10 years I think this might be a total game-changer.
Hmm, so the NBA may be putting the data-collection Cart before the analytical Horse. I.e. setting up the technological infrastructure to collect the data before, apparently, having the data-analysis infrastructure to know what to do with this soon-to-be goldmine.
Let the Gold Rush begin. Mountain is right, there should soon be more hiring of economists, engineers, physicists, etc. to start staking claims in these goldfields.
(There's a reverse twist here: with the California Gold Rush, mining stakes were public information, and the goldminers got to keep their gold personally, i.e. the private property was the gold itself. With these data and NBA teams' proprietary analysts, the "gold", i.e. the data, will be semi-public (infinitely share-able in theory with the public or a limited set thereof), and the "mining information", i.e. the analytical results, will be kept private, meaning kept secret by each team. Instead of Public Info and Private Gold, it'll be Public Gold and Private Info.)
Some additional idle wild speculation along Mountain's lines:
Another example of the "calculus-based" rather than discrete analysis that might soon be possible is to basically apply the Cartesian revolution to basketball events (probably using polar coordinates rather than Descartes' x-y coordinates however), i.e. distance and angle from the basket. The shot charts that we see almost universally nowadays in theory are doing the same thing -- indicating the location where the shot was taken from -- but a hand-drawn shot chart compared to a computer-measured-and-stored database of player and shot locations is like comparing geometry to analytic geometry.
This is already occuring in sabrmetrics, where there are at least a couple of defensive rating systems which, instead of looking at zones of the baseball field, measure distance and angle from homeplate. (Kenny Shirley presented one such system, SAFE, at the last NESSIS, BTW the deadline for sending in a proposal for the next one was today).
OTOH, basketball unlike baseball does have some zones which are inherent to the game: the key and the 3-point line in particular, plus the midcourt line and the no-charge area. (And of course the sidelines and endlines, but even baseball has foul lines.)
Some further speculation along the lines that I think Mountain is getting at: instead of looking at players as the discrete units, each of whom takes the court with various attributes (rebounding prowess, shooting ability, etc.), I think he is proposing to look at the attributes themselves -- i.e. Rebounding, Shooting, etc. -- and measure how various combinations of players lead to lineups with various rebounding, shooting, etc. abilities. Somewhat analagous to going from fixed effects models to random effects models in econometrics, or from dummy variables to parametric measures.
I think there have been three revolutions in hoopstats. The first one, which evolved over decades, was the analytical quantitative approach that we see on this website (a nod to DeanO and others, but he says that Dean Smith had been using some of these measures already). The second one was the widespread availability of play-by-play data, and analysis thereof (a nod to RolandB here). I think these hi-def videos and their associated data will be the third one (not sure who to give the nod to; one to the NBA for doing this, and the second one to the goldminers who find the first lodes).
Back to top
View user's profile Send private message
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
PostPosted: Tue Jun 16, 2009 11:52 am Post subject: Reply with quote
Mountain wrote:
I've talked about possibly taking adjusted +/- down from player level or offensive / defensive splits to the 4 Factors of each a few times over the last couple years and even about splitting them into the direct part (where the player had the lead role in the play action) and the apparent team level adjusted affects (when he isn't). Splitting adjusted into 16 parts and using that to represent the full range of a player's impacts.
I've mentioned that you could try to write equations that go from attributes (age/experience, standing reach, wingspan, weight, hops, pure speed, pure agility, dribble speed and agility, basketball intelligence, quality of previous instruction, shot chart preferences, etc.) to factor performance (factor performance as perhaps the calculus derivative of 16 unique formulas based on player attributes for each partial Factor- found by some form of regression or using some other method of hierarchical / multilevel modeling or aided by other ways to get at covariances) though that would probably be difficult and may end up with limited success as a global model of all players but you might find some useful tidbits along the way.
So instead of just an overall +/-, it would give you like a team-wise eFG% +/-, a team-wise OReb% +/-, etc, etc. That's what you mean?
Back to top
View user's profile Send private message Send e-mail AIM Address
Mountain
Joined: 13 Mar 2007
Posts: 1527
PostPosted: Wed Jun 17, 2009 2:25 am Post subject: Reply with quote
I meant what I said but I'll try to explain one more time in a bit expanded fashion.
I would ideally taken adjusted +/- down to the 4 factors. To do so you'd run the regression multiple times (instead of just one pass based on points) using shots made, rebounds, turnovers as the input and the adjusted factor for FT/FG could be found by taking the compete adjusted and subtracting the values found for the other parts.
But to separate the player's direct impact from his team level influence for any factor you'd run the regression twice. For example giving credit one time just when he scored a hoop directly. Then you'd do it again from when anyone else did on the team. And I'd do that for the other factors, offense and defense. That would give you 16 partial Factors.
This would allow to see estimates of a player's direct Factor impacts and as you say team level impacts (on other teammates) for all the Factors. It is both, not just one.
But this is just theoretical at this time. It is doable and might be worth doing if you were an insider with time and the required zeal to know as much as possible.
If you had this level of information and adjusted 4 factors for lineups and player pairs and the video information reference in the original post I think you could figure a lot of what is actually happening positive and negative and how to further optimize and be way better off than just receiving the composite adjusted +/- or even the offensive / defensive splits and wondering went exactly pushed it there.
Was it his shooting, or his passing to other shooters or his direct turnovers or his decision-making that lead to turnovers elsewhere or his trips to the line or his ability to hit teammates and get them to the line or his rebounding directly or maybe impact on rebounding by boxing out or occupying multiple defenders? Splitting adjusted into 8 partials for offense and 8 for defense would given you some leads.
I am not sure if the errors would stay the same as for the full adjusted or get worse. But that is an issue for the future.
Back to top
View user's profile Send private message
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
PostPosted: Wed Jun 17, 2009 3:22 pm Post subject: Reply with quote
Mountain wrote:
I would ideally taken adjusted +/- down to the 4 factors. To do so you'd run the regression multiple times (instead of just one pass based on points) using shots made, rebounds, turnovers as the input and the adjusted factor for FT/FG could be found by taking the compete adjusted and subtracting the values found for the other parts.
Right, that's essentially what I said.
Mountain wrote:
But to separate the player's direct impact from his team level influence for any factor you'd run the regression twice. For example giving credit one time just when he scored a hoop directly. Then you'd do it again from when anyone else did on the team. And I'd do that for the other factors, offense and defense. That would give you 16 partial Factors.
Are you familiar with the issues surrounding multiplicity, alpha spending functions, and controlling Type I Error? Performing the procedure you describe would open up a swath of issues regarding the veracity of any results obtained. Perhaps this is something to consider.
Mountain wrote:
I am not sure if the errors would stay the same as for the full adjusted or get worse. But that is an issue for the future.
My guess is they would get worse, since you're teasing apart the data even further.
Back to top
View user's profile Send private message Send e-mail AIM Address
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 708
Location: Raleigh, NC
PostPosted: Wed Jun 17, 2009 3:42 pm Post subject: Reply with quote
So multiplicity seems interesting to me. Here's something to read later: http://www.jstor.org/pss/2531814
This is the best my library could do. Point me to this.
Laughing
page 2 of 2
Author Message
Mountain
Joined: 13 Mar 2007
Posts: 1527
PostPosted: Thu Jun 18, 2009 1:56 am Post subject: Reply with quote
Thanks, when I've pitched this sketch several times I was hoping for some feedback on feasibility or challenges or ways to accomplish it.
when time permits I'll check further into the other topics you raise.
I can see that the significance level might need to shift. As with all this adjusted data at best you will end up fairly confident about the most of the worst and best and not that confident of the level of the rest. Still that has some value and then you can check the tape or memory and decide how far to believe or adjust in specific cases. Believe rather than "know" for sure.
Until the partial Factor level adjusted data is derived you can look at the factor and partial Factor level raw data and the adjusted +/- or offensive and defensive splits and other data and make some guesses about what the most significant adjusted partial Factors might be and get sense of their sign and in some cases magnitude. But there could be multiple sets of Factor or partial Factor solutions to the composite level adjusted scores for players of roughly similar power.
Last edited by Mountain on Fri Jun 19, 2009 8:04 am; edited 1 time in total
Back to top
View user's profile Send private message
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
PostPosted: Thu Jun 18, 2009 1:25 pm Post subject: Reply with quote
Ryan J. Parker wrote:
So multiplicity seems interesting to me. Here's something to read later: http://www.jstor.org/pss/2531814
This is the best my library could do. Point me to this. Laughing
That Gelman Bayesian book looks interesting, but the only mention of multiplicity is somewhere in the references, as far as I can tell.
The article you linked to seems to be on the right track.
The classic reference in my day job is this, but I'm not sure where you can find a copy of it since it's fairly old.
Back to top
View user's profile Send private message Send e-mail AIM Address
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC
PostPosted: Thu Jun 18, 2009 1:52 pm Post subject: Reply with quote
Yeah it was just a reference. Sad
Oh, and nothing is too old for JSTOR!
http://www.jstor.org/pss/2530245
_________________
I am a basketball geek.
Back to top
View user's profile Send private message Visit poster's website
mtamada
Joined: 28 Jan 2005
Posts: 377
PostPosted: Thu Jun 18, 2009 4:25 pm Post subject: Reply with quote
By coincidence I attended a seminar a couple of weeks ago where Brad Efron presented a paper on using Emprical Bayesian techniques to reduce one aspect of the multiplicity problem: selecting variables to use in a multivariate regression. He has even made available a program, written in R, which does the estimation (and in a typical joke, calls his program "EBay"). The paper and program are on his webpage, under the entries for 2008.
One side note that he mentioned during the seminar, which I wasn't familiar with, is that Bayesian estimation techniques are immune (perhaps under certain conditions?) to the multiplicity problems of classical/frequentist statistics: i.e. no need to shrink or regress estimates to the mean. At least I think that's what he said, it was an offhand comment, and I do not have a good knowledge of Bayesian statistics.
But now this paper says that Empirical Bayesian estimates may often differ significantly from Bayesian estimates, suggesting perhaps that Efron's EBay solution may not be adequate.
Ryan J. Parker
Joined: 23 Mar 2007
Posts: 711
Location: Raleigh, NC
PostPosted: Thu Jun 18, 2009 4:29 pm Post subject: Reply with quote
Interesting stuff mtamada. I'm still learning, so when someone says "empirical bayes" I'm not exactly sure what they're referring to. I know the general idea is that you're using data to create priors in which you then use those priors with the data, in a sense using the data twice. Gelman doesn't prefer this terminology, calling the empirical part redundant. Should be some good reading there, though.
_________________
I am a basketball geek.
mtamada
Joined: 28 Jan 2005
Posts: 377
PostPosted: Thu Jun 18, 2009 4:48 pm Post subject: Reply with quote
Ryan J. Parker wrote:
Interesting stuff mtamada. I'm still learning, so when someone says "empirical bayes" I'm not exactly sure what they're referring to. I know the general idea is that you're using data to create priors in which you then use those priors with the data, in a sense using the data twice. Gelman doesn't prefer this terminology, calling the empirical part redundant. Should be some good reading there, though.
Yeah, I'm no expert, here's a nice short summary of some views about Empirical Bayesian techniques, including Gelman's viewpoint.
From Efron's talk, I gather that one of the problems with Empirical Bayesian techniques is that the estimates have larger standard errors (greater uncertainty) than calculated, and maybe bias as well -- presumably because the estimates are based, not on true priors, but on parameters estimated from the data. But you don't know the standard errors of those estimated parameters ... or maybe it's hard to calculate how that uncertainty leads to additional uncertainty in the final Empirical Bayes estimates.
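For anyone who wants to see the "priors estimated from the data" idea in concrete form, here is a minimal sketch. It is not Efron's EBay program; it assumes a simple Beta-Binomial model for shooting percentages, fits the Beta prior by method of moments on the raw rates, and uses made-up function names (fit_beta_prior, eb_estimates). Note that the last step treats the fitted prior as if it were known exactly, which is one way to see why the calculated uncertainty comes out too small.

```python
from statistics import mean, pvariance


def fit_beta_prior(rates):
    """Method-of-moments Beta(a, b) fit to observed rates (the 'empirical' step)."""
    m = mean(rates)
    v = pvariance(rates)
    # Moment matching only yields a valid Beta when 0 < v < m * (1 - m).
    if not 0 < v < m * (1 - m):
        raise ValueError("rates are too spread out (or identical) for a Beta fit")
    k = m * (1 - m) / v - 1
    return m * k, (1 - m) * k


def eb_estimates(made, attempts):
    """Shrink each player's raw rate toward the group mean.

    The prior (a, b) is estimated from the same data it is then combined
    with, and is treated as fixed when forming the posterior mean.
    """
    rates = [x / n for x, n in zip(made, attempts)]
    a, b = fit_beta_prior(rates)
    return [(a + x) / (a + b + n) for x, n in zip(made, attempts)]


if __name__ == "__main__":
    # Hypothetical makes/attempts for four players.
    made = [2, 45, 30, 80]
    attempts = [3, 100, 60, 200]
    for x, n, est in zip(made, attempts, eb_estimates(made, attempts)):
        print(f"{x}/{n}: raw {x / n:.3f}, shrunk {est:.3f}")
```

The player with only 3 attempts gets pulled much harder toward the group mean than the player with 200, which is the usual shrinkage behavior.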
gabefarkas
Joined: 31 Dec 2004
Posts: 1313
Location: Durham, NC
PostPosted: Fri Jun 19, 2009 7:42 am Post subject: Reply with quote
Ryan J. Parker wrote:
Yeah it was just a reference. Sad
Oh, and nothing is too old for JSTOR!
http://www.jstor.org/pss/2530245
Yup, that's it. I can't tell if you can get the full article or not, but if you can it's definitely worth reading.
tpryan
Joined: 11 Feb 2005
Posts: 100
PostPosted: Sun Jun 21, 2009 4:16 am Post subject: Reply with quote
Of course what Gabe was saying is that if many tests are performed, a "significant" result or two could be obtained due to chance alone. There is not a simple solution to that problem, in general, and adjusting alpha levels for individual tests can be too conservative.
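A quick sketch of the arithmetic behind that, assuming the tests are independent (real tests on the same data usually are not) and using hypothetical function names. It shows the chance of at least one false "significant" result across m tests, and the Bonferroni-style per-test alpha, which is the adjustment that can be too conservative.

```python
def familywise_error_rate(alpha, m):
    """P(at least one false positive) across m independent tests at level alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni_alpha(alpha, m):
    """Per-test level that keeps the familywise rate at or below alpha."""
    return alpha / m


if __name__ == "__main__":
    alpha, m = 0.05, 20
    print(f"unadjusted familywise rate: {familywise_error_rate(alpha, m):.3f}")
    per_test = bonferroni_alpha(alpha, m)
    print(f"Bonferroni per-test alpha: {per_test:.4f}")
    print(f"adjusted familywise rate: {familywise_error_rate(per_test, m):.3f}")
```

With 20 tests at the 0.05 level, a spurious hit somewhere is more likely than not; the price of the adjustment is that each individual test has to clear a much stricter bar.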
I am not a Bayesian, nor an expert on it, but I tend to agree with Gelman regarding terminology. One starts with a reasonable prior, maybe even a noninformative prior, then combines that with data to obtain the posterior. Systems change over time, as George Box has emphasized, preferring to think of rapid change, so the posterior becomes the next prior, then posterior_2 is produced from more data, etc.
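The "posterior becomes the next prior" cycle can be sketched in a few lines, assuming a Beta prior on a make probability with binomial data (my choice of model for illustration, with a hypothetical update function). Beta(1, 1) plays the role of the noninformative prior. This version assumes the underlying rate is stable; it does nothing to handle the kind of change over time that Box emphasizes.

```python
def update(prior, made, missed):
    """Beta(a, b) prior plus binomial data gives a Beta(a + made, b + missed) posterior."""
    a, b = prior
    return (a + made, b + missed)


def posterior_mean(params):
    """Mean of a Beta(a, b) distribution."""
    a, b = params
    return a / (a + b)


if __name__ == "__main__":
    prior = (1, 1)                        # uniform, i.e. noninformative
    posterior_1 = update(prior, 3, 2)     # first batch of data
    posterior_2 = update(posterior_1, 4, 6)  # posterior_1 now acts as the prior
    print(posterior_1, posterior_2, round(posterior_mean(posterior_2), 3))
```

Under this model, updating batch by batch gives the same answer as updating once with all the data pooled.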
"EBay". Clever. Very Happy
mtamada
Joined: 28 Jan 2005
Posts: 377
PostPosted: Mon Jul 20, 2009 3:47 pm Post subject: Reply with quote
Back to the video-based data capture that started this thread: Sportsvision (the same company that brings you those yellow virtual first-down lines on TV football broadcasts, as well as baseball's Pitch F/X data) recently unveiled the next generation beyond Pitch F/X: tracking and timing of balls and players.
The prototype system is in place in San Francisco (I refuse to even attempt to keep up with the commercially-based name changes of ballparks and arenas, it was originally called Pac Bell Park). They recently had an all-day mini-conference in San Francisco to talk about the logistics and ins and outs of this technology. There was even a presentation about creating "heat maps" of Pitch F/X data (rather than scatterplots), which sounds similar to the colorful shot charts recently discussed here.
Although the nature and flow of basketball games are very different from those in baseball, I hope that the NBA and MLB and their contractors are communicating and cooperating; this is all new stuff and rather than independently re-inventing the wheel, I think all of the sports and technologists could probably learn a lot about new techniques and best practices from each other. The NBA's upcoming system has been described as being provided by STATS LLC, but I don't know if they're literally doing the hardware, technology, etc., I think of them as being a data company rather than a technology company. Maybe STATS is already partnering with Sportsvision? Sportsvision's website says that they are the source of Hoops F/X data, evidently used by TV broadcasters. Did any NBA reps attend the Pitch F/X mini-conference (which evidently was open to anybody, all it lacked was publicity)?
Additional hopes for the future: whatever the NBA and STATS end up calling their 6-HD-camera setup ("Hoopsvision"?), I hope they make the data publicly available and organize conferences (or participate in existing ones such as Sloan, NESSIS, or NCSSORS). At NCSSORS, someone mentioned the reams of data that the NFL has -- but doesn't share. I think that's a mistake on the NFL's part, an example of 20th century thinking. Yes it cost them probably millions of dollars to create and collect those data, but by only sharing it within the NFL (or licensing the data for a very high price) they limit the amount of research that can utilize the data. 21st century thinking would tell them to make the data freely available; there are literally hundreds if not thousands of fans and would-be analysts who would love nothing more than to jump on those data and start doing analysis -- all for free. If the NBA and STATS make their Hoopsvision data freely available, somewhere out there is the next Dean Oliver who'll make some revolutionary findings with the data. (Or come to think of it, the original DeanO is still around too!)
HoopStudies
Joined: 30 Dec 2004
Posts: 705
Location: Near Philadelphia, PA
PostPosted: Mon Jul 20, 2009 4:45 pm Post subject: Reply with quote
mtamada wrote:
...If the NBA and STATS make their Hoopsvision data freely available, somewhere out there is the next Dean Oliver who'll make some revolutionary findings with the data. (Or come to think of it, the original DeanO is still around too!)
And, yes, I know what to do with the data. Definitely a good challenge, bringing every ounce of PhD training I got.
_________________
Dean Oliver
Author, Basketball on Paper
The postings are my own & don't necess represent positions, strategies or opinions of employers.
Crow
Joined: 20 Jan 2009
Posts: 815
PostPosted: Mon Jul 20, 2009 8:39 pm Post subject: Reply with quote
Detailed video translated into a multi-factor database would get at the situational FG%s of mid-range shots: open or degree of contest, along with time on the shot clock and perhaps catch-and-shoot versus off the dribble. That would aid the management / reduction of mid-range shots.
Ideally you could use such a database to look at play sequences and try to find optimized sequences for your team vs different team types / lineup mixes and defensive schemes (based perhaps largely on where you get a shot and its expected payoff instead of the actual result?), using the mid-range as a part of overall strategy, to the extent that you normally have to and not beyond that. In chess, masters often think in what, 10- or 20-move sequences? Do the best NBA coaches?
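As a rough sketch of what a situational FG% query might look like, assuming each shot ends up as a record with fields like "contested" and "shot_type" (field names and sample rows are invented here, since nobody outside knows what the real data will look like):

```python
from collections import defaultdict


def situational_fg(shots, keys):
    """FG% for every combination of the given situational fields."""
    totals = defaultdict(lambda: [0, 0])  # bucket -> [makes, attempts]
    for shot in shots:
        bucket = tuple(shot[k] for k in keys)
        totals[bucket][0] += shot["made"]
        totals[bucket][1] += 1
    return {bucket: made / att for bucket, (made, att) in totals.items()}


# Hypothetical mid-range shot records; "made" is 1 for a make, 0 for a miss.
sample_shots = [
    {"contested": False, "shot_type": "catch_and_shoot", "made": 1},
    {"contested": False, "shot_type": "catch_and_shoot", "made": 0},
    {"contested": True, "shot_type": "off_dribble", "made": 0},
    {"contested": True, "shot_type": "off_dribble", "made": 0},
    {"contested": True, "shot_type": "off_dribble", "made": 1},
    {"contested": False, "shot_type": "off_dribble", "made": 1},
]

if __name__ == "__main__":
    for bucket, pct in situational_fg(sample_shots, ("contested", "shot_type")).items():
        print(bucket, round(pct, 3))
```

Adding a shot-clock field would just mean one more key in the tuple, though the buckets get thin quickly as factors are added.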
And going beyond sequences you could usefully examine plays and how the swirl of motion and player attributes in that motion with their potentialities lead to more or less open and good shots. And then try to repeat the most successful plays and the critical pieces of plays precisely. If the cameras are fixed you could compare a successful, pretty play to other real game versions of it down to inches or practice it until it sufficiently fits the pattern.