APBRmetrics

Posted: **Mon Dec 14, 2020 11:49 pm**

rainmantrail wrote: ↑Mon Dec 14, 2020 10:12 pm Another design option would be to have each player going up against the average defender in these additional rows, and have them defending against the average offensive player for their defensive priors.

Interesting. If your priors come from a specific season (say, last season), then I would suggest using the average defender/offensive player of THAT particular season.

Also interesting
https://www.sloansportsconference.com/p ... hardt-lane
I didn't know about him. I guess he is a trader now.
https://www.3redpartners.com/leadership/
I suppose a lot of sports betting market inefficiencies are now being exploited by quants.

Posted: **Tue Dec 15, 2020 3:19 am**

vzografos wrote: ↑Mon Dec 14, 2020 11:49 pm Also interesting
https://www.sloansportsconference.com/p ... hardt-lane
I didn't know about him. I guess he is a trader now.

Apparently he's also an associate professor at NYU now. A friend of mine is enrolled in the data science MS program there, and he's teaching a course on predictive modeling in sports next semester. He was also working for some MLB team, last I heard.

Posted: **Tue Dec 15, 2020 11:10 am**

And that's why you should stay away from sportsbooks. Only try open exchanges (if they exist in the US, which I doubt).

Posted: **Tue Dec 15, 2020 12:26 pm**

If you use the "additional rows" method, the statistical shrinkage may be at cross-purposes with the prior. For instance, the low-minute players will be dominated by the prior and the shrinkage, with the prior pulling toward something low (maybe around -3.5 per 100) and the shrinkage pulling toward 0.

How would you determine how many "observations" of each prior is the best? There are two unknowns, then--what is the prior, and how many minutes do you use for it? Or do you use the same minutes for all players?
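To make the tension concrete, here is a minimal sketch of the "additional rows" idea with a plain ridge penalty on top. Everything in it is an assumption for illustration: the function name `ridge_with_prior_rows`, the one-hot prior rows (a stand-in for "this player surrounded by average players"), and the toy numbers. It is not anyone's actual RAPM pipeline.

```python
import numpy as np

def ridge_with_prior_rows(X, y, prior, prior_weight, lam):
    """Ridge fit with one extra pseudo-observation row per player.

    Each extra row is a one-hot indicator for a single player whose target
    is that player's prior; prior_weight plays the role of the number of
    'prior possessions'. lam is the usual ridge penalty pulling toward 0.
    """
    n_players = X.shape[1]
    X_aug = np.vstack([X, np.eye(n_players)])
    y_aug = np.concatenate([y, prior])
    # Real rows get weight 1, prior rows get prior_weight
    w = np.concatenate([np.ones(len(y)), np.full(n_players, prior_weight)])
    A = X_aug.T @ (w[:, None] * X_aug) + lam * np.eye(n_players)
    b = X_aug.T @ (w * y_aug)
    return np.linalg.solve(A, b)

# Toy data: player 0 has 900 observations at +2.0, player 1 has none
X = np.zeros((900, 2))
X[:, 0] = 1.0
y = np.full(900, 2.0)
prior = np.array([-3.5, -3.5])

est = ridge_with_prior_rows(X, y, prior, prior_weight=100.0, lam=100.0)
print(est)
```

In this toy setup a player with no real observations lands at `prior_weight * prior / (prior_weight + lam)`, so with equal weights the estimate sits halfway between the -3.5 prior and the 0 that the ridge penalty pulls toward. That is the cross-purposes effect, and it also shows why the prior value and its number of "observations" cannot be tuned independently of the penalty.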

Posted: **Mon Dec 21, 2020 4:43 am**

DSMok1 wrote: ↑Tue Dec 15, 2020 12:26 pm If you use the "additional rows" method, the statistical shrinkage may be at cross-purposes with the prior. For instance, the low-minute players will be dominated by the prior and the shrinkage, with the prior pulling toward something low (maybe around -3.5 per 100) and the shrinkage pulling toward 0.

How would you determine how many "observations" of each prior is the best? There are two unknowns, then--what is the prior, and how many minutes do you use for it? Or do you use the same minutes for all players?

I get what you're saying. Functionally, the regularization shrinkage acts as a Bayesian prior with a Gaussian distribution centered on zero. But we can minimize this effect with a sufficiently large sample for each player. The balance of how many possessions to award the priors in this framework might be difficult to optimize in any way other than brute force. I would lean toward awarding each player the same number of 'prior possessions' as their weight coming into a new season. If we were less confident about someone's prior, though, we might want to lessen its weight. Hmm. But then we might effectively end up with a multi-year RAPM model. We could still tinker with the weights to improve its performance, as well as account for other elements.

Ultimately, I plan to just try different values and see which ones are more predictive. Same with measuring them against the other approach, where you subtract the value of the priors from the target variable, then add them back in after the regularization. The best way to know which is better is to try both and see which is more predictive. The problem with the subtract-then-add-back approach is that you can't weight the priors.
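For comparison, the subtract-then-add-back approach can be sketched in a few lines. This is a hedged illustration only: `ridge_toward_prior` is a hypothetical name, the closed-form solve stands in for whatever ridge routine is actually used, and the data is a toy.

```python
import numpy as np

def ridge_toward_prior(X, y, prior, lam):
    """Ridge regression that shrinks toward `prior` instead of toward zero."""
    n_players = X.shape[1]
    residual = y - X @ prior                      # remove what the priors already explain
    A = X.T @ X + lam * np.eye(n_players)
    delta = np.linalg.solve(A, X.T @ residual)    # ordinary ridge on the residual
    return prior + delta                          # add the priors back in

# Toy data: player 0 has 4 observations at +2.0, player 1 has none
X = np.array([[1.0, 0.0]] * 4)
y = np.full(4, 2.0)
prior = np.array([-1.0, -3.0])

est = ridge_toward_prior(X, y, prior, lam=4.0)
print(est)
```

This is the same as minimizing the squared error plus `lam` times the squared distance from `prior`, i.e. a Gaussian prior centered on each player's prior rather than on zero. A player with no data comes back exactly at their prior, and the single `lam` sets how hard every player is pulled, which is the sense in which the priors cannot be weighted individually in this form.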

Posted: **Mon Dec 21, 2020 12:21 pm**

When I run predictions, I have found that I generally get the best results using 2 priors. One is a general prior of around -3 (or a little lower for rookies), and the second is a prior based upon team efficiency + MPG or something like that. That second prior is only used if players have a decent sample on a team.

Those two could be merged for a more general application.

Posted: **Mon Dec 21, 2020 2:53 pm**

all that manual tweaking....

Posted: **Sat Dec 26, 2020 12:40 pm**

DSMok1 wrote: ↑Mon Dec 21, 2020 12:21 pm When I run predictions, I have found that I generally get the best results using 2 priors. One is a general prior of around -3 (or a little lower for rookies), and the second is a prior based upon team efficiency + MPG or something like that. That second prior is only used if players have a decent sample on a team.

Those two could be merged for a more general application.

I don't understand why you would want to use a general prior of -3. This seems like a bad idea to me. Can you share your thought process behind this?

Posted: **Sat Dec 26, 2020 12:46 pm**

Those two priors combined provide shrinkage to an average of approximately 0. Just found by experimentation. In general, remember that we are working with the tail end of a much larger population distribution. What this means is that if we know absolutely nothing about a player, we should generally just regress them downward. If we know that they are playing minutes and on a decent team, then we know more than that.
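The "tail end of a larger population" point can be illustrated with a small simulation. It is a sketch under loud assumptions: talent is drawn from a standard normal in arbitrary units (not points per 100), the league keeps the top 450, and minutes rise linearly with rank. `simulate_league` and all of its parameters are made up for the example.

```python
import numpy as np

def simulate_league(n_population=100_000, n_roster=450, seed=0):
    rng = np.random.default_rng(seed)
    talent = rng.normal(0.0, 1.0, n_population)   # arbitrary talent units
    roster = np.sort(talent)[-n_roster:]          # the league keeps only the top tail
    # Hypothetical rule: minutes rise linearly with rank within the roster
    minutes = np.linspace(100, 3000, n_roster)
    # Center so the minutes-weighted league average is 0, as plus-minus is by construction
    rating = roster - np.average(roster, weights=minutes)
    return rating, minutes

rating, minutes = simulate_league()
print("minutes-weighted mean:", np.average(rating, weights=minutes))
print("unweighted mean:", rating.mean())
print("mean for players under 500 minutes:", rating[minutes < 500].mean())
```

Because better players get more minutes, the minutes-weighted average is 0 while the typical rostered player, and especially the typical low-minute player, sits below 0. Under these assumptions, a player about whom nothing is known is more likely to resemble that low-minute group than the league average.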

Posted: **Sat Dec 26, 2020 1:00 pm**

DSMok1 wrote: ↑Sat Dec 26, 2020 12:46 pm Those two priors combined provide shrinkage to an average of approximately 0. Just found by experimentation. In general, remember that we are working with the tail end of a much larger population distribution. What this means is that if we know absolutely nothing about a player, we should generally just regress them downward. If we know that they are playing minutes and on a decent team, then we know more than that.

I see, that makes more sense to me if the two priors combined average out to ~0.

How does this approach handle strong players on bad teams though? Like KAT for example? Yes, I know, maybe he's not as strong as his box scores indicate, but it seems to me like someone like him might suffer from this approach? Is this a valid concern?

Posted: **Sun Dec 27, 2020 5:52 pm**

In general, the priors only significantly impact players with relatively few minutes. That actually tends to help the good players on the bad teams in RAPM, because the large number of low-minute players on those teams are not all being pulled up toward 0. Bad teams tend to have a lot more low-minute players.

Does that make sense?
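A deliberately extreme toy case shows the mechanism. Assume a star and one teammate who always share the floor, so the data alone cannot separate them, and a lineup that is outscored by 2 per 100 in every stint. The function name, the penalty value, and the numbers are all assumptions for illustration.

```python
import numpy as np

def ridge_toward_prior(X, y, prior, lam):
    """Ridge regression that shrinks toward `prior` instead of toward zero."""
    A = X.T @ X + lam * np.eye(X.shape[1])
    return prior + np.linalg.solve(A, X.T @ (y - X @ prior))

# Column 0 is the star, column 1 is the teammate; they appear in every stint together
X = np.ones((1000, 2))
y = np.full(1000, -2.0)   # lineup margin per 100 possessions in each stint
lam = 100.0               # hypothetical ridge penalty

zero_prior = ridge_toward_prior(X, y, np.array([0.0, 0.0]), lam)
low_prior = ridge_toward_prior(X, y, np.array([0.0, -3.0]), lam)

print("both priors at 0:      ", zero_prior)
print("teammate prior at -3:  ", low_prior)
```

With both priors at 0 the blame is split evenly between the two players. With the teammate's prior at -3, more of the lineup's deficit is attributed to the teammate and the star's estimate rises. Real lineups are far less collinear than this, so the size of the effect here should not be read as realistic.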

Incorporating Prior Information into RAPM
