Big data space / movement analysis

Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Big data space / movement analysis

Post by Crow »

I know the team with Kirk Goldsberry did an analysis of the expected overall value of a possession, touch by touch. But has anyone done or contemplated a comprehensive summary and analysis of the player / teammate / opponent spacing and movement details between the start of the last pass or dribble and the play's final action?

On average, when the ball is in a certain zone of the floor (start with 15ish zones?) before that final action, what are, say, the 5-20 best cluster-average representations of how the players are arrayed and moving (direction, speed, foot / torso and head orientation if the data provides it)? Then, with these 75-300 sets of before and final pictures, calculate measurements of all the spatial and movement relationships in each and in the change between them, including realistic opponent pass and help defense intercept distances and times for the set of 30 possible pass or drive options (15 zones to end up in * 2 ways, or at least the five apparently most desirable and "feasible" based on coaching and player instincts). Then, by detailed study of the data, identify the actual most desirable choices and the average productivity of each side's choices (overall and vs. every available and / or likely opponent alternative).

How far along are the best teams on this? When do they get to a digestible set of recommendations for these 75-300 before-the-final-action scenarios (with 750-9,000 decision options)? How different would the recommendations be from the average decision splits? How saleable are the recommended changes?

If you took this to the extreme you'd have 75-300 scenarios for ball in every zone before final action for each of PGs, wings, maybe swings and bigs, or for every player individually. But it might be best to simplify some, then build up to the full set. The next stage of analysis would then be how to get to the desired zones with the more desirable spatial cluster appearances and the best average action choice / matchup results.
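To make the "cluster average representations" idea concrete, here is a minimal sketch in Python, not anyone's actual pipeline. It assumes each snapshot is just the five offensive players' (x, y) positions flattened into one vector; defenders, velocities and orientations could be appended the same way. The two formations, the jitter and the plain k-means with farthest-point start are all illustrative assumptions.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Plain k-means. Returns (centroids, labels)."""
    # Farthest-point initialisation: deterministic and adequate for a sketch.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every snapshot to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of its assigned snapshots.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical half-court formations (feet), five players as x1, y1, ..., x5, y5.
FIVE_OUT = [25, 30, 3, 5, 47, 5, 8, 24, 42, 24]
HORNS = [25, 32, 17, 19, 33, 19, 3, 5, 47, 5]

rng = random.Random(0)

def jitter(template):
    """Synthetic snapshot: a template with up to 1.5 ft of noise per coordinate."""
    return [v + rng.uniform(-1.5, 1.5) for v in template]

snapshots = [jitter(FIVE_OUT) for _ in range(20)] + [jitter(HORNS) for _ in range(20)]
centroids, labels = kmeans(snapshots, k=2)

for c, centroid in enumerate(centroids):
    print(c, labels.count(c), [round(v, 1) for v in centroid])
```

Each centroid is the "average picture" for its cluster. With real tracking data you would run this per ball zone and pick k (5-20 in the post above) by inspection or a fit criterion.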
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Big data space / movement analysis

Post by Crow »

Putting it a bit differently, one end goal would be to be able to tell a player: if you are in a certain zone with the ball and the player configuration looks like this (or that, that, that, etc.), consider doing the best looking of x, y or z BASED ON Large Scale HISTORIC DATA. And not just based on imperfect recent memory or still imperfect lifetime based instinct. Another goal would be to be able to say that to get from ball here to good shot there (openness, distance and style of shot), the historical data says the best way is to drive this kind of way (direction, speed, changes in each) or make this kind of cut and pass. This is what players and coaches should want, demand. Context based play advice. Overall stats and splits are not sufficient. Take it to a place, moment and context on the court.
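A rough sketch of what that "context based play advice" lookup could look like, assuming the historical possessions have already been labelled with a ball zone, a configuration cluster, the action chosen and the points that followed. The record layout, the zone and action names and the `min_samples` cutoff are all hypothetical.

```python
from collections import defaultdict

def build_table(records, min_samples=2):
    """Average points per (zone, cluster, action), dropping thin samples."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [points, count]
    for zone, cluster, action, points in records:
        entry = totals[(zone, cluster, action)]
        entry[0] += points
        entry[1] += 1
    table = defaultdict(dict)
    for (zone, cluster, action), (pts, n) in totals.items():
        if n >= min_samples:
            table[(zone, cluster)][action] = pts / n
    return table

def recommend(table, zone, cluster):
    """Best historical action for this context, or None if there is no data."""
    options = table.get((zone, cluster))
    if not options:
        return None
    return max(options, key=options.get)

# Made-up records: (ball zone, configuration cluster, action, points scored).
records = [
    ("left_wing", 3, "drive_baseline", 2),
    ("left_wing", 3, "drive_baseline", 0),
    ("left_wing", 3, "drive_baseline", 2),
    ("left_wing", 3, "skip_pass_corner", 3),
    ("left_wing", 3, "skip_pass_corner", 0),
    ("left_wing", 3, "pull_up", 2),  # single sample, filtered out
]

table = build_table(records)
print(recommend(table, "left_wing", 3))
```

The sample-size filter matters: with 75-300 scenarios and up to 30 options each, many cells will be thin, and a recommendation from a handful of possessions is mostly noise.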
Nate
Posts: 132
Joined: Tue Feb 24, 2015 2:35 pm

Re: Big data space / movement analysis

Post by Nate »

Yeah, the Harvard XY group did that years ago with the SportVU data and presented at Sloan.

http://www.sloansportsconference.com/wp ... l-Time.pdf
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Big data space / movement analysis

Post by Crow »

They did what I said in the first sentence. They have the data set to do the rest. Maybe they have done more as consultants. But they have not to my knowledge presented anything like what I outlined.
steveshea
Posts: 91
Joined: Wed Oct 23, 2013 8:17 pm

Re: Big data space / movement analysis

Post by steveshea »

Chris Baker and I have been looking at this. Here are some quick thoughts based on what we've encountered.

1. I like the suggestion in this thread to study the action/movements/formation of everyone based on the eventual success of the possession. Basketball is not a 1-step Markov process in the sense that we can judge each decision (pass, drive, etc) independent of what has happened before and more importantly, what will happen after. For example, we might see a guard on one wing swing the ball to a big man at the top of the key. That pass doesn't look great by itself. The big man isn't a threat to knock down the 3, and he's not in a good rebounding position if a shot goes up. However, if the play is for that big man to then hit Steph Curry coming off a screen on the other wing and to have a running start for the ORB (or to be in good transition D position), then the pass to the big man at the top of the key was a fine one.

2. Computing power is a major obstacle to this type of study.

3. The personnel involved can't be ignored. The major players on the best teams (Curry, LeBron, Durant, AD etc) can have extreme impacts on the shape of a defense. It might be that each of these players needs to be in a cluster of his own.
Nate
Posts: 132
Joined: Tue Feb 24, 2015 2:35 pm

Re: Big data space / movement analysis

Post by Nate »

steveshea wrote:
2. Computing power is a major obstacle to this type of study.
I'd think data availability is a bigger issue. Consider, for example, the SportVU data set: estimating 100 bytes per sample, 25 samples per second, 60 seconds per minute, 48 minutes per game, 41 home games per team per season, and 30 NBA teams works out to roughly 9 gigabytes for the regular season. Objectively, that's a lot of data, but it's small enough to fit on many people's phones or keychains these days.
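For anyone who wants to check the back-of-envelope figure, here it is as a few lines of Python. The 100 bytes per sample is the estimate from the post, the calculation counts regulation game clock only (no overtime or stoppages), and it uses the league's 30 teams.

```python
BYTES_PER_SAMPLE = 100        # rough estimate from the post
SAMPLES_PER_SECOND = 25       # SportVU sampling rate quoted in the post
SECONDS_PER_GAME = 60 * 48    # regulation game clock only
HOME_GAMES_PER_TEAM = 41
TEAMS = 30

# Counting only home games counts each regular-season game exactly once.
games = HOME_GAMES_PER_TEAM * TEAMS
bytes_per_game = BYTES_PER_SAMPLE * SAMPLES_PER_SECOND * SECONDS_PER_GAME
total_bytes = bytes_per_game * games

print(f"{games} games, {bytes_per_game / 1e6:.1f} MB per game, "
      f"{total_bytes / 1e9:.2f} GB per season")
```

That comes to a bit under 9 GB, so the "roughly 9 gigabytes" estimate holds.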
bchaikin
Posts: 307
Joined: Thu May 12, 2011 2:09 am

Re: Big data space / movement analysis

Post by bchaikin »

Basketball is not a... in the sense that we can judge each decision (pass, drive, etc) independent of what has happened before and more importantly, what will happen after.

we can't? why not?...

For example, we might see a guard on one wing swing the ball to a big man at the top of the key. That pass doesn't look great by itself. The big man isn't a threat to knock down the 3, and he's not in a good rebounding position if a shot goes up. However, if the play is for that big man to then hit Steph Curry coming off a screen on the other wing and to have a running start for the ORB (or to be in good transition D position), then the pass to the big man at the top of the key was a fine one.

teams run the same plays and make the same or similar passes hundreds of times a season. sometimes the results are the same but many times they are not - at any time a player receiving a pass can break off a designated play and instead of passing to the designed next pass recipient instead shoot, or dribble, or pass elsewhere, all predicated on who is on the floor, both on offense and defense, who is covering who on defense, etc., or even on how bored the ball handler is of running the same play who just wants to "mix things up". this happens all the time...

consequently you can model these as independent actions...
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Big data space / movement analysis

Post by Crow »

The action captured by the final -1 touch and final action cluster pictures might be considered the last movement of generic plays, with everything before it prologue. Prologue that can vary but still needs to lead to the same basic ending. If 20 per zone is not enough, do more.
steveshea
Posts: 91
Joined: Wed Oct 23, 2013 8:17 pm

Re: Big data space / movement analysis

Post by steveshea »

bchaikin wrote:Basketball is not a... in the sense that we can judge each decision (pass, drive, etc) independent of what has happened before and more importantly, what will happen after.

we can't? why not?...

For example, we might see a guard on one wing swing the ball to a big man at the top of the key. That pass doesn't look great by itself. The big man isn't a threat to knock down the 3, and he's not in a good rebounding position if a shot goes up. However, if the play is for that big man to then hit Steph Curry coming off a screen on the other wing and to have a running start for the ORB (or to be in good transition D position), then the pass to the big man at the top of the key was a fine one.

teams run the same plays and make the same or similar passes hundreds of times a season. sometimes the results are the same but many times they are not - at any time a player receiving a pass can break off a designated play and instead of passing to the designed next pass recipient instead shoot, or dribble, or pass elsewhere, all predicated on who is on the floor, both on offense and defense, who is covering who on defense, etc., or even on how bored the ball handler is of running the same play who just wants to "mix things up". this happens all the time...

consequently you can model these as independent actions...
I tend to see basketball possessions as a more complicated series of actions (screens, passes, movement, drives, etc). Let me try to give another example. Steph Curry brings up the ball. He could pull up for a contested 3. For Curry, that's still a reasonable shot. Instead, he chooses to pass to Livingston on the wing. Season numbers suggest that the GSW offense scores a lower per possession rate when Livingston gets the ball there than when Curry pulls up for a contested 3. If we judge Curry's pass independently, we'd say Curry hurt his team with that pass. However, the team has drawn up a new play for Curry to then set a screen, receive a screen and pop out (or they've noticed something in the defense that suggested an old play would be more effective this time). Livingston passes to Curry for an open catch and shoot 3 in the corner. Season numbers suggest that Livingston's pass significantly improved the team's chances of scoring. If we judge each action independently, Curry's pass had a negative impact, and Livingston's pass had a very positive impact. I'd prefer to see the rest of the play before I say Curry hurt his team by passing to Livingston.

Yes, sometimes players make their own independent decision. If Bogut decides to throw up a 26 footer with 18 seconds on the shot clock, we're not going to fault the player that passed to him. Sometimes, decisions are independent. Other times, passes/screens/drives are made with an understanding that the team is involved in a collective effort to score, that a successful possession might depend on several players successfully completing a series of well-timed actions.

I see your logic that if a team's offense is incredibly consistent, we can pull the interdependence of actions into the information of the current state and model it as Markovian. (Technically speaking, your version would be a finitary image of my model.) However, with so many variables (lineup on the court, player locations, defensive formation and ability, score and time in game, etc) sample sizes on truly comparable past situations can be small, leaving future predictions unreliable. Furthermore, sometimes teams just change (in the most obvious way from a major injury or trade).

In summary, I'd prefer to use a finitarily Markovian model rather than a Markovian model.
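The Curry / Livingston example can be put in numbers with a small Python sketch. The expected-points values are invented purely for illustration (they are not season figures), and the state names are hypothetical.

```python
# Invented expected points per possession for each state (illustration only).
EXPECTED_POINTS = {
    "curry_contested_pullup_3": 1.05,
    "livingston_ball_on_wing": 0.95,
    "curry_open_corner_3": 1.35,
}

def stepwise_credit(states):
    """Judge each action on its own: change in expected points per step."""
    return [round(EXPECTED_POINTS[b] - EXPECTED_POINTS[a], 2)
            for a, b in zip(states, states[1:])]

def sequence_credit(states):
    """Judge the designed sequence as one unit: end state minus start state."""
    return round(EXPECTED_POINTS[states[-1]] - EXPECTED_POINTS[states[0]], 2)

play = [
    "curry_contested_pullup_3",  # Curry's option before passing
    "livingston_ball_on_wing",   # after Curry's pass
    "curry_open_corner_3",       # after Livingston's pass back
]

print(stepwise_credit(play))   # per-action view
print(sequence_credit(play))   # whole-sequence view
```

Under these made-up values Curry's pass scores -0.1 by itself and Livingston's scores +0.4, while the sequence as a whole is worth +0.3. The totals agree; what changes is who gets the credit, which is the point about not grading Curry's pass before seeing the rest of the play.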
bchaikin
Posts: 307
Joined: Thu May 12, 2011 2:09 am

Re: Big data space / movement analysis

Post by bchaikin »

However, with so many variables (lineup on the court, player locations, defensive formation and ability, score and time in game, etc) sample sizes on truly comparable past situations can be small, leaving future predictions unreliable.

you say this based on what? have you modeled this and found the predictions are in fact unreliable, and are thus basing this statement on experimental/modeling results? or are you simply guessing because you currently cannot model it?...

if you want a model that perfectly reproduces everything (every player movement), that will happen relatively soon based on SportVU data...

but if you want a model that gives reliable predictions to player and team stats, team W-L records and game results, you don't have to model every single player's actions at every single moment...

in baseball what does the R fielder do when a ball is hit to the L fielder? the vast majority of the time nothing, he just stands there. does that have to be modeled to accurately model the game of baseball in terms of reproducing batting, pitching, fielding, team stats and record? no...

now if the R fielder is not there, and the team has just 2 outfielders, then that's a different story, because the batter may alter his decision on how to hit the ball...

but the fact is that most times (not all but most) a ball is hit to the L fielder, you can leave the R fielder's actions out of the equation. it's the same in basketball - at some point all 10 players' movements on the floor will be modeled, but just because we cannot yet does not mean that a statistically accurate model based on player and team stats giving reliable predictions cannot be designed, one that takes into account offensive and defensive lineups on the court, defensive ability, score and time in a game, etc...
steveshea
Posts: 91
Joined: Wed Oct 23, 2013 8:17 pm

Re: Big data space / movement analysis

Post by steveshea »

Trying to model the value of players' micro moves using SportVU coordinates is exactly what Chris and I have been doing (and what I thought this thread was about). We are very much interested in, for example, whether stationing a shooter in the weakside corner 3 keeps a help defender out of the lane and improves player X's driving efficiency.
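As a toy version of the corner-shooter question, the Python sketch below assumes each drive has already been tagged from the coordinates with whether the weakside corner was occupied and whether a help defender was in the lane. The field names and the eight sample drives are made up, and this is not the method Chris and I use, just the simplest possible split.

```python
def drive_summary(drives):
    """Help rate and points per drive, split by weakside corner occupancy."""
    summary = {}
    for occupied in (True, False):
        subset = [d for d in drives if d["corner_shooter"] == occupied]
        n = len(subset)
        summary[occupied] = {
            "drives": n,
            "help_rate": sum(d["help_in_lane"] for d in subset) / n if n else None,
            "points_per_drive": sum(d["points"] for d in subset) / n if n else None,
        }
    return summary

# Made-up tagged drives for one hypothetical player.
drives = [
    {"corner_shooter": True, "help_in_lane": False, "points": 2},
    {"corner_shooter": True, "help_in_lane": False, "points": 2},
    {"corner_shooter": True, "help_in_lane": True, "points": 0},
    {"corner_shooter": True, "help_in_lane": False, "points": 2},
    {"corner_shooter": False, "help_in_lane": True, "points": 0},
    {"corner_shooter": False, "help_in_lane": True, "points": 2},
    {"corner_shooter": False, "help_in_lane": True, "points": 0},
    {"corner_shooter": False, "help_in_lane": False, "points": 0},
]

summary = drive_summary(drives)
for occupied, stats in summary.items():
    print("corner occupied" if occupied else "corner empty", stats)
```

A raw split like this ignores who is driving, who is in the corner and who is defending, so a real study would need to control for personnel before reading anything into the gap.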
Last edited by steveshea on Sun Nov 22, 2015 1:47 am, edited 1 time in total.
steveshea
Posts: 91
Joined: Wed Oct 23, 2013 8:17 pm

Re: Big data space / movement analysis

Post by steveshea »

Nate wrote:
2. Computing power is a major obstacle to this type of study.
I'd think data availability is a bigger issue. Consider, for example, the SportVU data set: estimating 100 bytes per sample, 25 samples per second, 60 seconds per minute, 48 minutes per game, 41 home games per team per season, and 30 NBA teams works out to roughly 9 gigabytes for the regular season. Objectively, that's a lot of data, but it's small enough to fit on many people's phones or keychains these days.

Data storage isn't the issue. It's the queries on that data.
Nate
Posts: 132
Joined: Tue Feb 24, 2015 2:35 pm

Re: Big data space / movement analysis

Post by Nate »

steveshea wrote:...

Data storage isn't the issue. It's the queries on that data.
Right, you want to build some kind of model (effectively this is a lossy compression of the data) and then query the model instead of going against the full data set.
...
Basketball is not a 1-step Markov process in the sense that we can judge each decision (pass, drive, etc) independent of what has happened before and more importantly, what will happen after.
...
It seems very much like you're letting the perfect become the enemy of the good. (In practice, 'explain everything' approaches tend to overfit the data anyway but that's a more technical issue.)
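To illustrate the "lossy compression" remark, here is a deliberately tiny Python sketch: raw ball positions are collapsed into counts per floor cell, and later questions are answered from the counts rather than from the frames. The 10-foot cell size, the helper names and the five sample frames are assumptions for illustration only.

```python
from collections import Counter

def zone_of(x, y, cell=10.0):
    """Map a court coordinate (feet) to a coarse grid cell."""
    return (int(x // cell), int(y // cell))

def compress(frames, cell=10.0):
    """Lossy summary: how many frames the ball spent in each cell."""
    return Counter(zone_of(x, y, cell) for x, y in frames)

def time_share(summary, zone):
    """Query the summary instead of rescanning the raw frames."""
    total = sum(summary.values())
    return summary[zone] / total if total else 0.0

# Made-up ball positions; a real season would have over a billion frames.
frames = [(24.0, 5.0), (26.5, 8.0), (3.0, 2.0), (25.0, 27.0), (24.0, 6.5)]

summary = compress(frames)
print(len(frames), "frames ->", len(summary), "cells")
print(time_share(summary, (2, 0)))
```

The exact coordinates cannot be recovered from the summary, which is the lossy part. The trade is that zone-level queries become lookups, and how coarse the cells can be depends on the question being asked.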
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Big data space / movement analysis

Post by Crow »

If you did such a study of player array before last touch to last action it might be interesting and potentially valuable to track player left or right hand dominance in the data set. How much does that affect efficiency of last action or the threat levels before the last touch? By each player and the specific combinations of handedness and locations?

How much does left or right hand dominance affect vision, in general and by court position? Front, left and right peripheral and depth. Any? A lot? Is this topic well developed in sports and specifically in the NBA? From medical studies and on court studies. What is seen, what is processed accurately and how fast are decisions made? On offense and defense. Do players have significant foot dominance? Related to handedness usually or pretty independent? How much do coaches think about this and customize / optimize plays for hand, eye, foot dominance? From looking at thousands of similar game situations, what combinations of physical dominant traits and player movements yield better results? Are soccer / football and baseball way ahead on this? Boxing?