Re: Garbage Time
Posted: Thu Dec 31, 2020 7:46 pm
Analysis of basketball through objective evidence
http://www.apbr.org/metrics/
Yes... and no.
I would be more interested in evaluating the performance of model A vs model B. This is very easy to evaluate: just run an ANOVA between them. There are other ways to evaluate performance as well, but it's a fairly straightforward process. It is not an ill-formed problem to solve, either. We can predict at time t=37 what the outcome of the game will be at time t=48, and the outcome is well-defined (win or loss). Competing models that predict these outcomes will have varying degrees of performance. A model that is more performant than another (i.e. yields more accurate predictions) will have more area under the ROC curve than the competing model. It is in this sense that I would state with confidence that one model's predictions are "better" than the other's. An outcome at time t=37 is not necessary; we are only interested in the outcome at time t=48.
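A minimal sketch of the AUC comparison described above, computed straight from the pairwise-ranking definition of ROC AUC. The outcomes and both models' t=37 predictions are invented purely for illustration:

```python
def roc_auc(outcomes, preds):
    """AUC = fraction of (won, lost) game pairs where the model gave
    the winning game the higher win probability (ties count as 0.5)."""
    pos = [p for y, p in zip(outcomes, preds) if y == 1]
    neg = [p for y, p in zip(outcomes, preds) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# hypothetical data: outcome at t=48 (1 = home team won), predictions at t=37
outcomes = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
model_a  = [0.81, 0.35, 0.66, 0.90, 0.42, 0.58, 0.30, 0.49, 0.77, 0.63]
model_b  = [0.55, 0.60, 0.52, 0.58, 0.50, 0.51, 0.47, 0.52, 0.46, 0.54]

print(roc_auc(outcomes, model_a))  # model A ranks every winner above every loser -> 1.0
print(roc_auc(outcomes, model_b))  # model B barely beats a coin flip
```

With this toy data model A's AUC is higher than model B's, which is exactly the sense in which its predictions would be called "better".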
rainmantrail wrote: ↑Fri Jan 01, 2021 3:12 am
I would be more interested in evaluating the performance of model A vs model B. This is very easy to evaluate: just run an ANOVA between them. There are other ways to evaluate performance as well, but it's a fairly straightforward process. It is not an ill-formed problem to solve, either. We can predict at time t=37 what the outcome of the game will be at time t=48, and the outcome is well-defined (win or loss). Competing models that predict these outcomes will have varying degrees of performance. A model that is more performant than another (i.e. yields more accurate predictions) will have more area under the ROC curve than the competing model. It is in this sense that I would state with confidence that one model's predictions are "better" than the other's. An outcome at time t=37 is not necessary; we are only interested in the outcome at time t=48.
It seems to me as though you are interested in solving a different problem than the one I'm stating. The question that I'm aiming to answer with my models is "what is team A's probability of winning the game, given the current state?". By definition, I am interested in the outcome of the game.
vzografos wrote: ↑Fri Jan 01, 2021 3:44 am
I really disagree with this way of thinking (i.e. evaluating temporal performance only w.r.t. the final outcome). You can always construct counterexamples of models that, according to your evaluation logic, have a high prediction accuracy at t=48. Imagine a trivial model that predicts 0.5 from t=0 until t=46 and at t=47 predicts the outcome purely from the point differential in the last second.
Or take any game whose outcome flips in the last 2 seconds (high volatility of the prediction probabilities in those last few seconds). How is your model's prediction up until that point even relevant?
Define that, please.
I just mean that I want my model's output to be somewhat close to what has happened historically in other games with the same number of minutes remaining and the same point spread. So, if in tonight's game between HOU and SAC the model sees HOU ahead by 3 points with 1:30 remaining and predicts that HOU is 86% to win, then I'd want to know that if I looked up all previous games in my database where a team had a 3-point lead with 1:30 remaining, approximately 86% of them (+/- some small margin of error) indeed won the game. If it turned out that only 55% ended up winning, I'd consider my model a failure. If it turned out that 83.7% ended up winning, I'd say it's directionally accurate, and good enough for my use case. If I were betting on games with it, though, then I'd probably want something better than that 83.7% performance. But that isn't the goal of this model.
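A minimal sketch of that calibration check: bucket historical games by (lead, time remaining) and compare the empirical win rate against the model's stated probability. The records below are invented; a real check would query the game database mentioned in the post:

```python
# hypothetical history: (lead_of_leading_team, seconds_left, leader_won)
history = [
    (3, 90, True), (3, 90, True), (3, 90, False), (3, 90, True),
    (3, 90, True), (3, 90, True), (3, 90, False), (3, 90, True),
    (3, 90, True), (3, 90, True), (5, 60, True),  (1, 30, False),
]

def empirical_win_rate(history, lead, seconds_left):
    """Fraction of historical games in this (lead, time) bucket the leader won."""
    matches = [won for l, s, won in history if l == lead and s == seconds_left]
    return sum(matches) / len(matches)

predicted = 0.86   # model's output for HOU up 3 with 1:30 left
observed = empirical_win_rate(history, lead=3, seconds_left=90)
print(f"predicted {predicted:.0%}, observed {observed:.0%}")
# "directionally accurate" if the two are within some small margin of error
```

With this toy history the bucket's observed rate is 80%, close enough to 86% to count as directionally accurate by the standard described above, though not by a betting standard.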
OK, sounds good. We can discuss this topic via PM so that we don't force everyone else to read through the congestion.
vzografos wrote: ↑Fri Jan 01, 2021 4:40 am
No, I am also interested in a part of that problem, maybe a different aspect of it. But I have second thoughts about the determination of accuracy at a specific point in time, given the absence of ground-truth data (i.e. imagine, if you like, a ground-truth temporal curve, which we don't have, to compare against at every time t).
In any case, to avoid this thread dragging on I'll stop here, but I will think about this problem and maybe come back to it in the future.
OK, you meant calibrated. Understood.
rainmantrail wrote: ↑Fri Jan 01, 2021 5:35 am
I just mean that I want my model's output to be somewhat close to what has happened historically in other games with the same number of minutes remaining and the same point spread. So, if in tonight's game between HOU and SAC the model sees HOU ahead by 3 points with 1:30 remaining and predicts that HOU is 86% to win, then I'd want to know that if I looked up all previous games in my database where a team had a 3-point lead with 1:30 remaining, approximately 86% of them (+/- some small margin of error) indeed won the game. If it turned out that only 55% ended up winning, I'd consider my model a failure. If it turned out that 83.7% ended up winning, I'd say it's directionally accurate, and good enough for my use case. If I were betting on games with it, though, then I'd probably want something better than that 83.7% performance. But that isn't the goal of this model.