vzografos wrote: ↑Mon Dec 28, 2020 7:28 pm
rainmantrail wrote: ↑Sun Dec 27, 2020 3:04 am
I'm working on some of my NBA models and decided to revisit the garbage time weighting component for each possession. I had previously run a logistic regression model to estimate a team's chances of winning the game based on time remaining, point differential, and the number of starters on the floor (and some interaction terms) as my way of assigning numeric values to each play. As you can see from the ROC curve below, the model performs well, but there's a key problem with this approach. The iid assumptions that the logistic regression model makes are violated in practice. When teams are ahead, they don't continue to press their advantage by running at full strength, thus violating the iid assumption. This renders the model's win percentage predictions invalid. It also just doesn't align with how coaches make decisions about who is on the floor at any given time. I decided to rework this analysis and use something that aligns more with actual coaching decisions. I took my PBP data from the past 7 seasons and plotted the number of starters on the floor against the point differential for each quarter. The results were pretty interesting, so I thought I would share them in case anyone here finds this interesting or useful. Here are the results below, as well as the ROC plot of my previous logistic regression model's performance.
I didn't understand some parts of your description (as is often the case when trying to explain something on a discussion board).
Questions: How did you evaluate the ROC exactly?
My problem with live prediction models (i.e., models based on time remaining) is always how to compare two models that give different predictions. Say at time t, one model gives a 78% win chance and the other a 67% win chance. I would really like to know about this.
In your box plots, what is on the y-axis? The number of starters on the floor? What does this mean exactly? How many players that started the game are still on the floor at any given time? I don't think I understand this, because you are quoting more than 5 players. Can you explain?
Apologies for the not-so-well-explained post. Hopefully this clarifies what I'm working on a bit better.
My goal is to weight possessions in my play-by-play database by how important they are. At the end of games, we often see both teams running with their backups on the court while the starters sit on the bench because the game is essentially already over (e.g., up by 27 with 2:30 remaining in the 4th quarter). I wanted a way to determine how "important" each possession is. Initially, my approach was to build a logistic regression model that would yield the probability of a team winning the game based on the current state of the game (e.g., up by 17 with 7:00 left in the 3rd quarter and each team has 3 of their starters on the court = 86% chance of winning, or whatever the number is). I was able to build this model pretty easily, and the outputs made sense from a probability standpoint. I trained the model on a subset of the data and tested the results on a held-out testing dataset. I created the ROC curve by evaluating the model's predicted winners against the actual outcomes of those games. The input variables to that model were 'Minutes_Remaining', 'Score_Differential', and 'Starters_on_Court', with 'Starters_on_Court' being the total number of starters from both teams who were on the court for a given possession (minimum 0, maximum 10).
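For anyone who wants to reproduce that kind of evaluation, here is a minimal sketch using only the Python standard library. It assumes a logistic model with main effects only (the interaction terms are left out), and the coefficients in `COEF` and the rows in `held_out` are made-up placeholders, not fitted values or real games. The AUC is computed as the fraction of (win, loss) pairs in which the win received the higher predicted probability, which is the area under the ROC curve.

```python
import math

# Placeholder coefficients for illustration only; NOT fitted values.
COEF = {"intercept": 0.0, "minutes": 0.0, "diff": 0.15, "starters": 0.0}

def win_probability(minutes_remaining, score_diff, starters_on_court, coef=COEF):
    """Logistic model: P(win) = 1 / (1 + exp(-z)), main effects only."""
    z = (coef["intercept"]
         + coef["minutes"] * minutes_remaining
         + coef["diff"] * score_diff
         + coef["starters"] * starters_on_court)
    return 1.0 / (1.0 + math.exp(-z))

def roc_auc(y_true, y_score):
    """AUC = P(random win is scored above random loss); ties count half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one win and one loss")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Hypothetical held-out rows:
# (Minutes_Remaining, Score_Differential, Starters_on_Court, team_won)
held_out = [
    (30.0, 17, 6, 1),
    (2.5, 27, 0, 1),
    (12.0, -8, 7, 0),
    (5.0, -10, 10, 1),
    (20.0, -15, 8, 0),
]
scores = [win_probability(m, d, s) for m, d, s, _ in held_out]
labels = [won for _, _, _, won in held_out]
auc = roc_auc(labels, scores)
```

Because AUC only looks at how the predictions are ranked, two models that output 78% and 67% for the same game state can still have the same AUC. It measures ordering, not calibration.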
This approach "worked" somewhat, but there is a fundamental flaw with it. Logistic regression models assume that each observation in the dataset (each possession in this case) is independent and identically distributed (iid). The independence assumption is clearly violated, since possessions within the same game overlap heavily, but this violation isn't all that concerning, as there is enough variation across a full season that it mostly works itself out. However, the assumption that the possessions are identically distributed is a huge problem, because teams make coaching decisions based on the state of the game and adjust accordingly. In other words, when the logistic regression model projects an 80% probability of winning, what it is actually saying is "if the lineups don't change, then team X has an 80% probability of winning". But as soon as substitutions are made and a team puts its entire starting lineup back on the floor, the model's earlier win probability is no longer valid. The same is true in the opposite direction: if a team had its starters on the floor and then decided to sit them all, the estimate would change. The other thing I noticed is that coaching decisions don't even remotely follow these win probabilities at any given point in the game. In the 2nd and 3rd quarters particularly, teams will often run at full strength even when the point differential is very large (for example, just this week, Dallas was still running Luka Doncic even though they were up by 50 points in the 2nd quarter).
What I really want to accomplish is to be able to flag "garbage time" possessions and to weight those less than possessions where each team is playing to win, and doing so at full strength. I'm defining "garbage time" as the possessions in a game where the game is effectively over (say 3 minutes remaining and up by 32 points), and teams are running with their weakest lineups in an attempt to get rookies and scrubs some experience. I wanted my weights to follow how coaches actually make these lineup decisions rather than using the probability of winning estimates that are output by my logistic regression model. That's why I decided to look at how many starters were on the court in different scenarios. I also looked at the number of starters on the court with respect to time remaining and by point differentials, but surprisingly, the time remaining component didn't really affect these decisions nearly as much as I thought it would. Coaches generally don't "throw in the towel" until well into the 4th quarter, even if they're down by 30+ points, and the same is true with teams who have the lead.
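One way to turn "how many starters do coaches typically have on the court in this situation" into a number is sketched below. Everything specific here is an assumption for illustration: the 5-point differential buckets, the choice of the median, dividing by 10 to get a weight, the `floor` of 0.1, and the sample rows, which are invented rather than taken from real play-by-play data.

```python
from collections import defaultdict
from statistics import median

def diff_bucket(score_diff, width=5):
    """Bucket the absolute score differential, e.g. 0-4 -> 0, 5-9 -> 5."""
    return (abs(score_diff) // width) * width

def median_starters_table(possessions):
    """Map (quarter, diff bucket) -> median combined starters on court (0-10)."""
    groups = defaultdict(list)
    for quarter, score_diff, starters in possessions:
        groups[(quarter, diff_bucket(score_diff))].append(starters)
    return {key: median(values) for key, values in groups.items()}

def possession_weight(quarter, score_diff, table, floor=0.1):
    """Weight in [floor, 1]: typical starters for this game state, divided by 10."""
    typical = table.get((quarter, diff_bucket(score_diff)))
    if typical is None:
        return 1.0  # unseen game state: leave the possession at full weight
    return max(floor, typical / 10)

# Invented rows: (quarter, score_differential, combined_starters_on_court)
sample = [
    (2, 3, 8), (2, 4, 7), (2, 1, 9),     # close 2nd quarter
    (4, 27, 0), (4, -29, 1), (4, 26, 2), # 4th-quarter blowout
    (4, 6, 9), (4, -7, 10),              # competitive 4th quarter
]
table = median_starters_table(sample)
close_q2 = possession_weight(2, 2, table)
blowout_q4 = possession_weight(4, 28, table)
```

With these invented rows, a close 2nd-quarter possession keeps most of its weight while a 4th-quarter blowout possession drops to the floor. Whether a median, a mean, or some smoothed curve is the right summary is a judgment call.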
The box plots I posted show how many combined starters from both teams (max of 10) are on the court vs. how large the score differential is, for each quarter. This gives me a pretty good idea of how coaches make decisions about their lineup strength when they're up by 5 points, 15 points, or 25 points in the 4th quarter. I can use this information to reweight each possession in my play-by-play database when I'm doing RAPM (and similar) calculations. I can also give additional weight to possessions where the game is really close late in the 4th quarter or in OT if I want. This should help the RAPM calculations overlook, or at least significantly devalue, possessions where Karl-Anthony Towns is tearing it up against the Lakers' practice squad while LeBron and AD are sitting on the bench because they're up by 36 points in the 4th quarter. It helps keep players' RAPM values more "honest" and makes them a better reflection of the players' true abilities.
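To show where such weights would enter, here is a toy sketch of a weighted ridge regression, which is the usual form of a RAPM fit: minimise sum_i w_i (y_i - x_i . beta)^2 + lam * ||beta||^2. It is written in plain Python so it runs without extra libraries; a real RAPM run would use a sparse solver over thousands of players. The two-player design matrix, the margins, the weights, and `lam` are all invented for illustration.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def weighted_ridge(X, y, w, lam):
    """Solve (X^T W X + lam * I) beta = X^T W y for beta."""
    p = len(X[0])
    xtwx = [[sum(wi * xi[j] * xi[k] for xi, wi in zip(X, w))
             + (lam if j == k else 0.0)
             for k in range(p)] for j in range(p)]
    xtwy = [sum(wi * xi[j] * yi for xi, yi, wi in zip(X, y, w))
            for j in range(p)]
    return solve(xtwx, xtwy)

# Toy data: columns are players A and B (+1 = on offense, -1 = on defense).
X = [[1, -1], [1, -1], [-1, 1], [1, -1]]
y = [1.0, 0.0, 1.0, 3.0]           # points scored on each possession
full = [1.0, 1.0, 1.0, 1.0]        # every possession counts equally
down = [1.0, 1.0, 1.0, 0.1]        # last possession treated as garbage time

beta_full = weighted_ridge(X, y, full, lam=1.0)
beta_down = weighted_ridge(X, y, down, lam=1.0)
```

In this toy case, player A's big garbage-time possession inflates the unweighted estimate, and down-weighting it pulls A's coefficient back toward zero.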