Basketball Play-By-Play Analysis Using EventFlow

Home for all your discussion of basketball statistical analysis.
Madey Jay
Posts: 8
Joined: Sat May 25, 2013 4:36 pm

Basketball Play-By-Play Analysis Using EventFlow

Post by Madey Jay »

Hey All,

We're trying to get some expert feedback on the potential of using event sequence visualization techniques to analyze play-by-play stats. Basically, I've taken a software tool that was designed to analyze medical datasets, and rewired it to do the same analytics on play-by-play stats. This project is coming out of a computer science department, so suffice to say, there aren't many basketball experts to bounce this off of. The work I've done is summed up in a 10-minute video that you can see here:

https://vimeo.com/66965934

Ideas? Feedback? This has potential? This is stupid?

I'll take whatever you've got.

Cheers,
Megan
AcrossTheCourt
Posts: 237
Joined: Sat Feb 16, 2013 11:56 am

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by AcrossTheCourt »

I think this is a really interesting look at play-by-play data.

The best use is probably looking at seasonal aggregated data for a different and more detailed approach to team analysis. For instance, looking at the Pacers' data you could see how much of their offense stems from offensive rebounding, and you can specifically see what happens after an offensive rebound and how often.

Another point of attack is looking at what happens in the transition from rebound to offense. Specifically, one area of contention is offensive rebounding versus transition defense. This would be a nice visual way of looking at that problem.
EvanZ
Posts: 912
Joined: Thu Apr 14, 2011 10:41 pm
Location: The City

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by EvanZ »

Megan, this is really cool stuff.

I have only parsed NBA data, but one issue I always run into is that inferring shot clock from game clock is fairly "buggy". In the NBA, for example, on a made shot, the game clock starts winding down before the shot clock, so shot clock times can be off by a few seconds. Not sure about the college play-by-play, as I haven't looked at it, and actually don't know the rules. Anyway, thought I would mention it.
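A rough Python sketch of the naive inference described above. It assumes the shot clock starts at the same instant as the possession, which is exactly the assumption that breaks after a made basket. The function name and parameters are hypothetical; the 24-second default is the NBA shot clock.

```python
def estimate_shot_clock(possession_start_clock, event_clock, shot_clock_length=24.0):
    """Estimate the shot clock (seconds left) at an event from game-clock times.

    Both clock arguments are seconds remaining in the period. This is a naive
    estimate: it treats the shot clock as starting exactly when the possession
    starts, so it can be off by a few seconds after a made shot.
    """
    elapsed = possession_start_clock - event_clock
    remaining = shot_clock_length - elapsed
    # Clamp to the valid range so bad timestamps do not produce nonsense values.
    return max(0.0, min(shot_clock_length, remaining))


# Possession starts with 10:00 (600 s) left, shot goes up at 9:50 (590 s).
print(estimate_shot_clock(600, 590))
```

Any event where the clamped value hits 0 or the full 24 is worth flagging for a manual look, since that usually means the possession start time was wrong.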

I hope you try to present this at the Sloan conference at MIT next year.
Mike G
Posts: 6144
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by Mike G »

Megan, congratulations on being forced by circumstances into appreciating the greatest game on Earth.
I watched your awesome video, and I was thinking when you said you cannot get these insights (Md vs NC game) from merely looking at the box score, that it's clear Maryland had an offensive rebounding advantage. This is a box-score stat.

A team plays to its strengths, and if you have a quick, open-court advantage, you go for the first available good shot, often a fast-break, and you may not have offensive rebounders in position.
If you aren't so quick, you set up, work for the best available shot, and crash the boards.

Do you know that at the end you named Dean Oliver's "Four Factors"?
- Effective FG% (count 3-pointers as 1.5 made FG)
- Offensive Rebounds (as a % of available rebounds; from which defensive rebound % may be imputed)
- Turnovers (as a % of all possessions)
- Free Throws -- normally expressed as FT/FGA

http://www.basketball-reference.com/lea ... _2013.html
See the Miscellaneous Stats
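
The four factors listed above can be computed from box-score totals. Here is a minimal Python sketch, assuming the definitions exactly as written in the list. The function and argument names are illustrative, the possession count is passed in rather than estimated, and the numbers in the usage line are made up.

```python
def four_factors(fgm, fg3m, fga, ft, orb, opp_drb, tov, possessions):
    """Return the four factors for one team as a dict of rates."""
    return {
        # Each made 3-pointer counts as 1.5 made field goals.
        "efg_pct": (fgm + 0.5 * fg3m) / fga,
        # Offensive rebounds as a share of the rebounds available on that end.
        "orb_pct": orb / (orb + opp_drb),
        # Turnovers as a share of all possessions.
        "tov_pct": tov / possessions,
        # Free throws expressed as FT/FGA (made free throws here).
        "ft_rate": ft / fga,
    }


# Toy totals, not real game data.
print(four_factors(fgm=40, fg3m=10, fga=80, ft=15, orb=12, opp_drb=28,
                   tov=14, possessions=100))
```

The opponent's defensive rebound rate on that end is simply one minus `orb_pct`.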
kpascual
Posts: 50
Joined: Thu Mar 01, 2012 7:02 pm

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by kpascual »

Hi Megan,

That's a nice tool you've created. There are a few sites that have attempted to do play-by-play visualization, like NBAGraphs and Popcorn Machine. I've tried myself. I think the challenge with this kind of visualization is representing the game events in an easy-to-understand manner. The color legend functionally works, but IMO doesn't feel like the optimal representation because there are more game event types than there are distinct, differentiable colors in the color spectrum.

What's the primary goal of the visualization? It seems much like a decision tree in how it sorts and aggregates the data... it's pretty nifty. It feels more like a sorting mechanism and less a predictive one, but I'm curious about your perspective.
Madey Jay
Posts: 8
Joined: Sat May 25, 2013 4:36 pm

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by Madey Jay »

Thanks everyone, for your responses. This has all been incredibly helpful!

AcrossTheCourt - I did a little experiment to look at how much of the Pacers' offense stems from offensive rebounds. I posted it here if you want to take a look:

https://vimeo.com/67086795

I'll take a gander at the other point you mentioned later this week. Do teams strategically crash the boards during certain periods of the game and lay off during others? Or do they tend to either crash or not crash? Obviously there's no direct indication in the play-by-play stats as to whether the team is crashing the boards.

EvanZ - Thanks for the tip. I'll take a closer look at some actual game tape to get a better idea of what the difference is. We're definitely gearing up for a Sloan submission though.

Mike G - What I thought was interesting about this game was that, when Maryland transitioned onto offense from a UNC missed shot (and a Maryland rebound), the visualization of the Maryland offense looked almost identical to UNC's. It was only the made shot transitions where there was a noticeable difference. But yes, you're absolutely right, the offensive rebound stat is definitely skewed towards Maryland in the box score.

I read Dean Oliver's book when I first started working on these datasets. It seemed like the best way to start to learn what to look for.

kpascual - The goal of the visualization, at least in the medical domain, is to generate new questions and hypotheses about the dataset. When you have a specific question that will have a specific answer, it's very effective to apply statistics directly to the data. When your questions are more open-ended though, visualizations allow you to see both expected and surprising patterns. It also allows you to see the data as you manipulate it. Particularly with querying, this can make a huge difference in whether queries actually end up doing what users think they're doing. Finally, it serves as a good tool to communicate particular findings. People process visual information much faster than text if it's presented cleanly. All in all, the application is designed to make this type of data more accessible to non-statistical/non-computer-science users. I guess in this case that would be the coaches.
wilq
Posts: 80
Joined: Fri Apr 15, 2011 4:05 pm
Location: Poland

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by wilq »

Interesting tool Megan - and I loved how you immediately viewed basketball through your unique point of view - but as far as I understand EventFlow from your videos there are better uses than visualization [*] - specifically sequencing itself and your ability to quickly filter through a large data set!

[*] - to be fair, you probably see more there than others do, and I'm also biased toward a certain point of view...

Based on the example with Indiana, it took you seconds to find which events follow one another for one team, so it will take you a couple of minutes to do the same for all teams and, let's say, the last 5-10 years, right?

Well, based on that you would have multiple IMHO interesting articles/posts/papers starting with comparing offensive efficiency after each starting event [with steals or OR probably the best and missed free throws probably the worst], offense after timeouts[!] and what it tells us about the coaches, fast breaks after threes, getting back on D vs crashing the offensive boards, how coaches deal with foul trouble, more realistic blocks' value, end of the game management etc.
Some of those topics were covered but I think you could take it to the next level at least for the general public.
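
A minimal Python sketch of the first idea above, comparing offensive efficiency by how a possession started. It assumes possessions have already been reduced to (starting event, points scored) pairs; that input format, the function name, and the sample values are all hypothetical.

```python
from collections import defaultdict


def points_per_possession_by_start(possessions):
    """Average points per possession, grouped by the event that started it.

    `possessions` is an iterable of (start_event, points) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # start_event -> [points, count]
    for start_event, points in possessions:
        totals[start_event][0] += points
        totals[start_event][1] += 1
    return {event: pts / n for event, (pts, n) in totals.items()}


# Toy data, only to show the shape of the input and output.
sample = [
    ("steal", 2), ("steal", 0),
    ("defensive_rebound", 3), ("defensive_rebound", 0), ("defensive_rebound", 0),
    ("made_shot", 2),
]
print(points_per_possession_by_start(sample))
```

The same grouping works for the other topics (after timeouts, after threes) by changing what counts as the starting event.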
Madey Jay wrote: This project is coming out of a computer science department, so suffice to say, there aren't many basketball experts to bounce this off of
Wait a second, don't most people interested in the basketball data have some kind of computer science background? Because if you already spend days with computers/data you may as well apply it to your hobby...
Mike G wrote: Megan, congratulations on being forced by circumstances into appreciating the greatest game on Earth.
+1.
bibzzzz
Posts: 2
Joined: Mon May 06, 2013 12:10 am

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by bibzzzz »

Hi Megan,

(and everyone else - i'm new here)

That was a fascinating talk. I really enjoyed some of the points you made, and the tool looks fantastic. Did you have to partition the possessions yourself before putting the data through EventFlow? I've been trying to break down games by possessions lately, and I feel as though there is a lot of interesting analysis that can arise by analysing the game at a possession level.

Anyways, great work and I think a lot of people will be expecting to hear some good news.
Madey Jay
Posts: 8
Joined: Sat May 25, 2013 4:36 pm

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by Madey Jay »

hey all - sorry for the delayed response - i was out of town for the weekend.

wiLQ - the visualization we use definitely takes some time to get used to. however, once you do get used to it, it is extremely effective for helping users notice patterns and isolate subgroups. some of our current users can now see these things faster than i can.

i'd definitely like to look at this play-by-play data from a number of different angles. the problem is that i really don't have the basketball expertise yet to know what questions to ask. one of the reasons i posted this was to get suggestions for potential analyses - what's been done before vs. what would really be helpful. what we're really hoping for is to find a coach (probably a college coach) who would be willing to ping questions off of us throughout the season - questions that could result in actionable strategies - and then just let us nerd out with EventFlow to find them the answers.

bibzzzz - i did partition the possessions myself, with a separate script that parses the actual web pages. it's actually pretty effective. the main sources of errors are entry errors on the ESPN side, and weird differences between the ways that college and professional games are recorded.
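
for anyone curious what possession partitioning can look like, here is a simplified Python sketch. it is not the actual script: it assumes the play-by-play has already been parsed into (team, event_type) pairs, and the event names and rules are hypothetical simplifications (free throws, fouls, and end-of-period handling are ignored).

```python
# Events that end the current possession once they are recorded.
ENDS_POSSESSION = {"made_shot", "turnover"}


def partition_possessions(events):
    """Split an ordered list of (team, event_type) pairs into possessions."""
    possessions = []
    current = []
    for team, kind in events:
        # A defensive rebound hands the ball over: close the possession in
        # progress and let the rebound open the next one.
        if kind == "defensive_rebound" and current:
            possessions.append(current)
            current = []
        current.append((team, kind))
        if kind in ENDS_POSSESSION:
            possessions.append(current)
            current = []
    if current:  # keep a trailing, unfinished possession
        possessions.append(current)
    return possessions


# Toy sequence, not real game data.
events = [
    ("MD", "missed_shot"), ("MD", "offensive_rebound"), ("MD", "made_shot"),
    ("UNC", "missed_shot"),
    ("MD", "defensive_rebound"), ("MD", "turnover"),
    ("UNC", "made_shot"),
]
print(len(partition_possessions(events)))
```

note that an offensive rebound keeps the same possession alive, which is what makes second-chance sequences show up as one longer possession.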

i should also mention that EventFlow is not "officially" publicly available, but it is available by request. if anyone would like to try this out themselves, feel free to e-mail me at madeyjay@umd.edu. i'd be happy to whip up a custom dataset for you to test out as well, so long as it didn't require too much extra coding.
wilq
Posts: 80
Joined: Fri Apr 15, 2011 4:05 pm
Location: Poland

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by wilq »

Madey Jay wrote: i'd definitely like to look at this play-by-play data from a number of different angles. the problem is that i really don't have the basketball expertise yet to know what questions to ask. one of the reasons i posted this was to get suggestions for potential analyses - what's been done before vs. what would really be helpful.[...]
i'd be happy to whip up a custom dataset for you to test out as well, so long as it didn't require too much extra coding.
Does "5-10 years of play-by-play data" count as too much extra coding? ;-)
Because if you are not interested in following up on the topics I mentioned above, I'd love to do it myself.
Madey Jay
Posts: 8
Joined: Sat May 25, 2013 4:36 pm

Re: Basketball Play-By-Play Analysis Using EventFlow

Post by Madey Jay »

haha - i KNEW you were going to say that! i'll take a look at that tomorrow. it actually might be pretty simple. i'll have to strip down some of the attributes so that memory won't be an issue, but i actually think it'll be a good scalability challenge. drop me an e-mail, and i'll see what i can do to hook you up.