Hi everyone,
I'm thinking of extending my approach to build an SPM (statistical plus-minus) for the NBA and for the game Basketball GM.
The model I’m working on uses **RAPM from the JE dataset** as a foundation, aiming to combine its stability with box score and potentially tracking data to create a metric that’s both predictive and interpretable. For BBGM, I was able to create a fork of the game that calculates accurate RAPM values for each player; if you’d like to see the fork of the RAPM code, I’m happy to share it. I’d appreciate your thoughts on a few key questions:
---
1. Data Sources
I’m currently using box score data and team ratings but would like to incorporate richer tracking or contextual data. For example:
- Shot location data, defensive contests, off-ball movement
- Screen assists, rim protection metrics, and lineup combinations
- Team offensive/defensive ratings, adjusted for possession context
- Play-by-play data
I’ve explored options like `nba_api`, **PBP Stats**, and **Basketball-Reference**, but I know many of you have far more experience sourcing robust datasets. What are your recommendations for tracking or contextual data sources, especially for defensive and off-ball metrics? If anyone has insights into preprocessing or transforming this data for model inputs, I’d love to hear about that too.
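For concreteness, here’s a minimal sketch of the kind of preprocessing I have in mind: turning stint-level data (however it’s sourced) into a one-hot design matrix for the regression. The lineups, column names, and numbers below are all made up for illustration, not any particular API’s schema:

```python
import pandas as pd

# Hypothetical stint-level rows (in practice these would be built from
# play-by-play data). Column names and values are illustrative only.
stints = pd.DataFrame({
    "off_players": [("A", "B"), ("C", "D")],
    "def_players": [("C", "D"), ("A", "B")],
    "points":      [6, 4],
    "possessions": [5, 5],
})

def stint_rows(df):
    """One-hot encode lineups: +1 for offensive players, -1 for
    defenders; target y = points per 100 possessions."""
    players = sorted({p for col in ("off_players", "def_players")
                      for tup in df[col] for p in tup})
    rows = []
    for _, r in df.iterrows():
        x = dict.fromkeys(players, 0)
        for p in r["off_players"]:
            x[p] = 1
        for p in r["def_players"]:
            x[p] = -1
        x["y"] = 100 * r["points"] / r["possessions"]
        rows.append(x)
    return pd.DataFrame(rows)

X = stint_rows(stints)
```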
---
2. Best Window for RAPM
In terms of RAPM, I’ve been debating how many years to include. My current thinking is either a 3-year or a 5-year RAPM.
What has worked best for you? Is there an optimal balance here for projects like this?
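The way I picture it, the window choice just changes how many seasons of stints get stacked into the design matrix before the ridge fit. A toy version with fully synthetic data (the lambda is untuned and everything is made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stint design matrix: columns = players (+1 offense, -1 defense),
# target = point margin per 100 possessions. All numbers are synthetic;
# a longer RAPM window just means more rows stacked into X.
n_stints, n_players = 500, 20
X = rng.choice([-1, 0, 1], size=(n_stints, n_players))
true_skill = rng.normal(0, 2, n_players)
y = X @ true_skill + rng.normal(0, 10, n_stints)

def rapm(X, y, lam=500.0):
    """Ridge regression closed form: beta = (X'X + lam*I)^-1 X'y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

beta = rapm(X, y)
```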
---
3. Playoff Data
Do you include playoff games in your RAPM calculations, or do you stick strictly to regular-season data? I’m torn because playoff games are high-leverage, but the small sample might overweight individual performances.
---
4. Validation Set and Splits
I’m also trying to establish a solid validation framework. How do you typically construct your validation set?
- Do you use the following season (e.g., RAPM from the next year) as your validation data?
- What split of training, validation, and test data has worked best for you?
- Any advice on balancing in-season predictive accuracy with cross-season generalizability?
This part has been especially tricky for me since I want to ensure the model doesn’t overfit historical RAPM and remains predictive for unseen data.
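My tentative plan is a purely chronological split, so validation never leaks future seasons into training. A sketch (the season ranges are just an example, not a recommendation):

```python
# Chronological split: never validate on seasons that precede training.
seasons = list(range(2014, 2025))  # 2014..2024, illustrative only

def time_split(seasons, n_val=2, n_test=2):
    """Hold out the most recent seasons; train on everything earlier."""
    train = seasons[:-(n_val + n_test)]
    val = seasons[-(n_val + n_test):-n_test]
    test = seasons[-n_test:]
    return train, val, test

train, val, test = time_split(seasons)
```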
---
5. Feature Engineering and Weighting
Finally, I’m exploring how best to engineer features and weigh observations. For instance:
- Are there particular features (e.g., interaction terms) that have consistently proven valuable?
- How do you handle outliers or weigh contributions from players with limited minutes?
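One option I’ve been considering for low-minute players is shrinking their raw values toward a minutes-weighted league mean, with a pseudo-minutes constant controlling the pull. A rough sketch (the constant `k` is a placeholder to tune, not an established value):

```python
import numpy as np

def shrink_to_mean(values, minutes, k=500.0):
    """Empirical-Bayes style shrinkage: low-minute players get pulled
    toward the minutes-weighted league mean; k is the pseudo-minutes
    strength of the prior (a tuning choice, not a standard constant)."""
    values = np.asarray(values, float)
    minutes = np.asarray(minutes, float)
    league_mean = np.average(values, weights=minutes)
    return (minutes * values + k * league_mean) / (minutes + k)

# A 50-minute player moves most of the way toward the mean; a
# 2500-minute player barely moves.
est = shrink_to_mean([5.0, 0.0], [50.0, 2500.0])
```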
---
6. Adding Priors
I’ve been considering adding a prior to the RAPM estimates to stabilize the model for players with limited data (e.g., low minutes or few possessions).
How do you decide on an appropriate prior, and what’s worked best for you?
Should the prior be based on league averages, positional averages, or something else entirely (e.g., aging curves for veteran players)?
If you have experience using priors to improve player evaluation, I’d love to hear how it’s impacted your results.
Also, I’m thinking of using this SPM as a prior in a single season’s RAPM calculation, just as was done in xRAPM and EPM. How did you go about doing this? EPM mentions something called a Bayesian prior; how is that different from a normal prior in the RAPM calculation?
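My tentative understanding (happy to be corrected, since the exact xRAPM/EPM implementations aren’t public) is that the ridge penalty can be read as a Gaussian prior on the coefficients, and using an SPM as the prior mean shrinks each player toward their SPM value instead of toward zero. Algebraically that’s the same as fitting ridge on the residual after subtracting the prior’s prediction, then adding the prior back:

```python
import numpy as np

def rapm_with_prior(X, y, prior, lam=5000.0):
    """Ridge with a nonzero prior mean: shrink coefficient i toward
    prior[i] instead of 0. Equivalent to solving
    beta = (X'X + lam*I)^-1 (X'y + lam*prior)."""
    n = X.shape[1]
    resid = y - X @ prior
    delta = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ resid)
    return prior + delta

# Synthetic demo: a good-but-imperfect SPM prior plus noisy stint data.
rng = np.random.default_rng(1)
X = rng.choice([-1, 0, 1], size=(200, 10))
true = rng.normal(0, 2, 10)
y = X @ true + rng.normal(0, 8, 200)
prior = true + rng.normal(0, 0.5, 10)  # hypothetical SPM estimates
beta = rapm_with_prior(X, y, prior)
```

With a large `lam` the result stays close to the prior; as `lam` shrinks, the stint data takes over.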
---
7. Splitting RAPM into Offensive and Defensive Components
I’m also curious about the best approach to handling offensive and defensive RAPM. Is it better to create a single overall RAPM metric first and then split it into offensive and defensive components, or should offensive and defensive RAPM be modeled separately and then summed to form an overall metric? I’d love to hear about any trade-offs you’ve encountered with these approaches, particularly in terms of interpretability, stability, or validation results.
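If it helps, here’s the design-matrix bookkeeping I have in mind for the separate O/D fit, with toy indices and the sign convention noted in the comment:

```python
import numpy as np

def od_row(off_idx, def_idx, n_players):
    """One offensive-possession row: +1 in the offensive players'
    O-columns (0..n-1) and +1 in the defenders' D-columns (n..2n-1),
    target = points scored per 100 possessions. After a ridge fit,
    player i's overall RAPM is beta[i] - beta[n_players + i]
    (points created on offense minus points allowed on defense)."""
    row = np.zeros(2 * n_players)
    row[list(off_idx)] = 1
    row[[n_players + j for j in def_idx]] = 1
    return row

# Toy example: players 0-1 on offense, players 2-3 defending.
r = od_row([0, 1], [2, 3], n_players=4)
```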
---
Thanks in advance for any advice or insights you can offer. I’m excited to hear how others have approached similar challenges. Looking forward to learning from everyone here!
Seeking Advice on Data, Validation, and Best Practices for Building an SPM
Re: Seeking Advice on Data, Validation, and Best Practices for Building an SPM
Hopefully you'll get fresh, targeted advice. Some useful advice can probably also be found in select past threads if you comb through them.
I'd think 3 year is the best compromise if focused on current or future performance. 5 year if you are predominantly concerned with historical estimates.
Re: Seeking Advice on Data, Validation, and Best Practices for Building an SPM
Given that you're spending a lot of time and thought on this anyway, why not spend some additional time to run your own RAPM?
That'd give you the freedom to play around with, e.g., how many seasons to use, playoffs vs. no playoffs, and validation and test splits, among all kinds of other design decisions
In regards to data sources, I think it's reasonable to just go with nba_api
Re: Seeking Advice on Data, Validation, and Best Practices for Building an SPM
SkyJuke wrote: ↑Mon Nov 25, 2024 7:53 am
Rip my post didn’t send
So I tried making one a few weeks ago and did some of the stuff on here, and I'm pretty confident it's at least on the same level as the current EPM or LEBRON, at least end of season. So I'll try to go over some stuff based on my experience, although I didn't spend that much time on it lol, so it was hardly a perfect process.
That being said, going off the dome, here's what I remember thinking when I read this:
When creating the box prior, you have to be careful with tracking data. IMO it's absolutely a help, but when you use it (or interaction effects) you run the risk of a few names that simply aren't supposed to be there, or really large outliers one way or the other. One way to mitigate that is adding caps to certain data (especially if it's not a full season of data, or for lower-minute guys), or making sure an interaction effect can't produce outliers, if that makes sense; I recall I did something with turnovers and potential assists.
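Something like this for the caps, with made-up percentile cutoffs and data:

```python
import numpy as np

def winsorize(x, lo_pct=1.0, hi_pct=99.0):
    """Cap a feature at chosen percentiles so a handful of extreme
    tracking values can't dominate the prior fit. Cutoffs are
    illustrative, not recommendations."""
    lo, hi = np.percentile(x, [lo_pct, hi_pct])
    return np.clip(x, lo, hi)

x = np.array([0.0, 1.0, 2.0, 3.0, 100.0])   # one huge outlier
capped = winsorize(x, lo_pct=0.0, hi_pct=80.0)
```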
For minutes, per-36 should be fine; make sure to weight observations, though.
IIRC I used 2022, 2023, and 2024 as test sets, and an 80-20 train/validation split over 2014-2021. I used multi-year RAPM instead of splitting it into batches; not sure if that's the norm, but it worked pretty well. I don't 100% remember, but I don't think it was a particularly stressful area. When you throw in too much tracking data you can kind of tell it's overfit once you apply it to single years of data for the SPM, even without the test metrics, honestly.
Are you more focused on raw predictive power within the season, predictive power beyond the current season, or presentability as a public metric? That changes the approach, I think.
Agree with JE on running RAPM yourself. Including playoff data depends on what you're looking for; if it's only for the regular season, I wouldn't.
The box score priors tend to be the big separator in these things. Mine used TDRAPM, but I think the box prior alone tested about as well as EPM and LEBRON. Synergy stuff helped a ton for offense (the offensive prior was peak); the defensive prior was meh, but I overvalued bigs (on purpose), which made the testing a tad worse, I think.
For rim protection I used rim points saved, which was solid. I think I added 0.3 * blocks on top to try to capture rim deterrence too, which probably didn't help the metric's predictive power, but I think it would help for players moving teams?
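So basically this (the weight is illustrative, and the inputs are made-up season totals):

```python
def rim_protection(rim_points_saved, blocks, block_weight=0.3):
    """Blend rim points saved with a small block bonus as a crude
    rim-deterrence proxy; the 0.3 weight is a guess, not tuned here."""
    return rim_points_saved + block_weight * blocks

score = rim_protection(25.0, 40.0)   # hypothetical season totals
```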
For per-36 stats, I feel 5-7 years is a good window for RAPM; 3-year sounds a bit noisy even if you're weighting observations by minutes.
Weighting by “importance” makes sense too, like getting Bron right matters more than getting some random dude who played 50 minutes total, or a random role player, right.
An objective way to do that is a bit harder, but some sort of award weighting worked well when I had to do it for the WNBA.