Re: Incorporating Prior Information into RAPM
Posted: Mon Dec 14, 2020 11:49 pm
by vzografos
rainmantrail wrote: ↑Mon Dec 14, 2020 10:12 pm Another design option would be to have each player going up against the average defender in these additional rows, and have them defending against the average offensive player for their defensive priors.
Interesting. If your priors come from a specific season (say, last season), then I would suggest using the average defender/offensive player of THAT particular season.
Also interesting
https://www.sloansportsconference.com/p ... hardt-lane
I didn't know about him. I guess he is a trader now.
https://www.3redpartners.com/leadership/
I suppose a lot of sports betting market inefficiencies are now being exploited by quants.
Re: Incorporating Prior Information into RAPM
Posted: Tue Dec 15, 2020 3:19 am
by rainmantrail
Apparently he's also an associate professor at NYU now. A friend of mine is enrolled in the data science MS program there, and he's teaching a course on predictive modeling in sports next semester. He was also working for some MLB team, last I heard.
Re: Incorporating Prior Information into RAPM
Posted: Tue Dec 15, 2020 11:10 am
by vzografos
And that's why you should stay away from sportsbooks. Only try open exchanges (if they exist in the US which I doubt)

Re: Incorporating Prior Information into RAPM
Posted: Tue Dec 15, 2020 12:26 pm
by DSMok1
If you use the "additional rows" method, the statistical shrinkage may be at cross-purposes with the prior. For instance, the low-minutes players will be dominated by the prior and the shrinkage: the prior pulling toward something low, maybe around -3.5 per 100, and the shrinkage pulling toward 0.
How would you determine how many "observations" of each prior is the best? There are two unknowns, then--what is the prior, and how many minutes do you use for it? Or do you use the same minutes for all players?
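To make the tug-of-war concrete, here is a minimal single-player sketch. It assumes a toy model in which one coefficient is fit to n_obs real observations, prior_weight pseudo-observations equal to the prior, and a ridge penalty ridge_lambda toward 0. All names and numbers are hypothetical, and real RAPM has ten players on every row, so this only illustrates the direction of the two pulls.

```python
def ridge_with_prior_rows(n_obs, obs_mean, prior, prior_weight, ridge_lambda):
    """Closed-form minimizer of
    sum_i (y_i - b)^2 + prior_weight * (prior - b)^2 + ridge_lambda * b^2
    for a single coefficient b (toy model, not a full RAPM fit)."""
    numerator = n_obs * obs_mean + prior_weight * prior
    denominator = n_obs + prior_weight + ridge_lambda
    return numerator / denominator


# Hypothetical values: prior of -3.5 per 100, observed mean of +2.0 per 100.
low_minutes = ridge_with_prior_rows(
    n_obs=50, obs_mean=2.0, prior=-3.5, prior_weight=500, ridge_lambda=500
)
high_minutes = ridge_with_prior_rows(
    n_obs=20_000, obs_mean=2.0, prior=-3.5, prior_weight=500, ridge_lambda=500
)

print(f"low-minutes estimate:  {low_minutes:.2f}")
print(f"high-minutes estimate: {high_minutes:.2f}")
```

In this toy setup the low-minutes estimate lands between the prior and 0, with the split set by the ratio of prior_weight to ridge_lambda, while the high-minutes estimate stays close to the observed mean.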
Re: Incorporating Prior Information into RAPM
Posted: Mon Dec 21, 2020 4:43 am
by rainmantrail
DSMok1 wrote: ↑Tue Dec 15, 2020 12:26 pm
If you use the "additional rows" method, the statistical shrinkage may be at cross-purposes with the prior. For instance, the low-minutes players will be dominated by the prior and the shrinkage: the prior pulling toward something low, maybe around -3.5 per 100, and the shrinkage pulling toward 0.
How would you determine how many "observations" of each prior is the best? There are two unknowns, then--what is the prior, and how many minutes do you use for it? Or do you use the same minutes for all players?
I get what you're saying. Functionally, the regularization shrinkage acts as a Bayesian prior with a Gaussian distribution centered on zero. But we can minimize this effect with a sufficiently large sample for each player. The balance of how many possessions to award the priors in this framework might be difficult to optimize in any way other than brute force. But I would lean toward awarding the same number of 'prior possessions' as the weights for each player coming into a new season. If we were less confident about someone's priors, though, we might want to lessen the weight of their priors. Hmm. But then we might just effectively end up with a multi-year RAPM model. We could still tinker with the weights to improve its performance, as well as account for other elements. Ultimately, I plan to just try different values and see which ones are more predictive. Same with measuring them against the other approach, where you remove the value of their priors from the target variable instead, then add them back in after the regularization. The best way to know which is better is to try them out and see which is more predictive. The problem with the subtract-then-add-back approach is that you can't weight the priors.
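For anyone who wants to compare the two approaches, here is a rough NumPy sketch. It assumes a simplified design matrix with one column per player, and it models "against the average opponent" as a pseudo-row containing only that player, since an average opponent would have a coefficient of 0. The function names, toy data, and weights are all hypothetical.

```python
import numpy as np


def ridge(X, y, lam, sample_weight=None):
    """Weighted ridge regression solved through the normal equations."""
    if sample_weight is None:
        sample_weight = np.ones(len(y))
    XtW = X.T * sample_weight  # scales each observation (column of X.T) by its weight
    return np.linalg.solve(XtW @ X + lam * np.eye(X.shape[1]), XtW @ y)


def fit_with_prior_rows(X, y, prior, prior_weight, lam):
    """Approach 1: append one pseudo-row per player whose target is that player's prior."""
    k = X.shape[1]
    X_aug = np.vstack([X, np.eye(k)])
    y_aug = np.concatenate([y, prior])
    weights = np.concatenate([np.ones(len(y)), prior_weight])
    return ridge(X_aug, y_aug, lam, weights)


def fit_subtract_add_back(X, y, prior, lam):
    """Approach 2: remove the prior from the target, regularize, then add it back."""
    return prior + ridge(X, y - X @ prior, lam)


# Toy data: 200 stints, 6 players, entries +1/-1/0 for on-court side (made up).
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 0.0, 1.0], size=(200, 6))
true_ratings = np.array([3.0, 1.0, 0.0, -1.0, -2.0, -3.0])
y = X @ true_ratings + rng.normal(0.0, 5.0, size=200)
prior = np.array([2.0, 0.5, 0.0, -0.5, -1.5, -3.5])

lam = 50.0
b_shift = fit_subtract_add_back(X, y, prior, lam)
# Same prior weight for everyone, and no extra shrinkage toward zero.
b_rows = fit_with_prior_rows(X, y, prior, np.full(6, lam), lam=0.0)
print(np.allclose(b_rows, b_shift))

# Per-player confidence: trust the last player's prior much less.
b_custom = fit_with_prior_rows(
    X, y, prior, np.array([50.0, 50.0, 50.0, 50.0, 50.0, 5.0]), lam=0.0
)
```

Under these assumptions, the pseudo-row fit with equal weights and no extra penalty gives the same coefficients as subtract-then-add-back with that weight as lambda. The practical difference is that the pseudo-row version accepts a different weight per player, and any additional ridge penalty on top of it pulls toward 0 as well.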
Re: Incorporating Prior Information into RAPM
Posted: Mon Dec 21, 2020 12:21 pm
by DSMok1
When I run predictions, I have found that I generally get best results using 2 priors. One is a general prior, of around -3 (or a little lower for rookies) and then a second prior based upon Team efficiency + MPG or something like that. That second prior is only used if players have a decent sample on a team.
Those two could be merged for a more general application.
Re: Incorporating Prior Information into RAPM
Posted: Mon Dec 21, 2020 2:53 pm
by vzografos
all that manual tweaking....

Re: Incorporating Prior Information into RAPM
Posted: Sat Dec 26, 2020 12:40 pm
by rainmantrail
DSMok1 wrote: ↑Mon Dec 21, 2020 12:21 pm
When I run predictions, I have found that I generally get best results using 2 priors. One is a general prior, of around -3 (or a little lower for rookies) and then a second prior based upon Team efficiency + MPG or something like that. That second prior is only used if players have a decent sample on a team.
Those two could be merged for a more general application.
I don't understand why you would want to use a general prior of -3. This seems like a bad idea to me. Can you share your thought process behind this?
Re: Incorporating Prior Information into RAPM
Posted: Sat Dec 26, 2020 12:46 pm
by DSMok1
Those two priors combined provide shrinkage to an average of approximately 0. Just found by experimentation. In general, remember that we are working with the tail end of a much larger population distribution. What this means is that if we know absolutely nothing about a player, we should generally just regress them downward. If we know that they are playing minutes and on a decent team, then we know more than that.
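A quick simulation sketch of the tail-of-the-distribution point. Everything here is assumed for illustration: talent is drawn from a normal distribution in arbitrary units (not points per 100), the league is the top 450 of a pool of 100,000, and minutes rise linearly with talent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical talent pool; only the top tail makes the league.
population = rng.normal(0.0, 1.0, size=100_000)
league = np.sort(population)[-450:]  # ascending, so the best player is last

# Assumption: better players get more minutes (linear ramp, made-up range).
minutes = np.linspace(100.0, 3000.0, league.size)

# Center ratings so the minutes-weighted league average is 0.
rating = league - np.average(league, weights=minutes)

unweighted_mean = rating.mean()
low_minute_mean = rating[minutes < 500].mean()

print(f"minutes-weighted mean:        {np.average(rating, weights=minutes):.3f}")
print(f"unweighted mean:              {unweighted_mean:.3f}")
print(f"mean of players under 500 min: {low_minute_mean:.3f}")
```

With minutes increasing in talent, the minutes-weighted average sits above the plain average, so once the league is centered on 0 a player about whom nothing is known is expected to be below 0, and the low-minute group lower still.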
Re: Incorporating Prior Information into RAPM
Posted: Sat Dec 26, 2020 1:00 pm
by rainmantrail
DSMok1 wrote: ↑Sat Dec 26, 2020 12:46 pm
Those two priors combined provide shrinkage to an average of approximately 0. Just found by experimentation. In general, remember that we are working with the tail end of a much larger population distribution. What this means is that if we know absolutely nothing about a player, we should generally just regress them downward. If we know that they are playing minutes and on a decent team, then we know more than that.
I see, that makes more sense to me if the two priors combined average out to ~0.
How does this approach handle strong players on bad teams though? Like KAT for example? Yes, I know, maybe he's not as strong as his box scores indicate, but it seems to me like someone like him might suffer from this approach? Is this a valid concern?
Re: Incorporating Prior Information into RAPM
Posted: Sun Dec 27, 2020 5:52 pm
by DSMok1
In general, the priors only significantly impact players with relatively few minutes. That actually tends to help the good players on the bad teams in RAPM, because those large numbers of low-minute players are not all averaging up around 0. Bad teams tend to have a lot more low-minute players.
Does that make sense?