Best practices for SPM development?

RyanRiot
Posts: 23
Joined: Wed Oct 19, 2016 2:26 am

Best practices for SPM development?

Post by RyanRiot » Mon Mar 25, 2019 9:20 pm

At a basic level, developing an SPM model just involves regressing some player stats against a RAPM sample, but what is the best way to do that? Is it:

1. Regress a multi-year player sample against a multi-year RAPM sample (e.g. Kevin Durant's 2010-2015 box score stats against Durant's 2010-2015 RAPM)

2. Regress individual player seasons against a multi-year RAPM sample (e.g. Kevin Durant's 2010, 2011, 2012, 2013, 2014, and 2015 box score stats, each against Durant's 2010-2015 RAPM)

3. Regress individual player seasons against only that year's RAPM (e.g. Kevin Durant's 2010 box score stats against his 2010 RAPM)

4. Something else
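
To make the three candidate designs concrete, here is a minimal sketch of the training rows each one would produce. The toy numbers, the minutes weighting used to pool seasons in design 1, and the name `build_rows` are all assumptions for illustration, not taken from any actual SPM.

```python
from collections import defaultdict

# Hypothetical toy data: (player, season, minutes, box-score stat per 100).
SEASONS = [
    ("Durant", 2010, 3200, 30.0),
    ("Durant", 2011, 3000, 28.0),
    ("Role Player", 2010, 1000, 10.0),
    ("Role Player", 2011, 2000, 13.0),
]
# Hypothetical RAPM targets.
MULTI_YEAR_RAPM = {"Durant": 5.0, "Role Player": -1.0}
SINGLE_YEAR_RAPM = {
    ("Durant", 2010): 4.0, ("Durant", 2011): 6.5,
    ("Role Player", 2010): -2.5, ("Role Player", 2011): 0.0,
}


def build_rows(design):
    """Return (feature, target, minutes) training rows for design 1, 2, or 3."""
    if design == 1:
        # One row per player: pooled (minutes-weighted) stats vs. multi-year RAPM.
        totals = defaultdict(lambda: [0.0, 0.0])  # player -> [stat * min, min]
        for player, _, minutes, stat in SEASONS:
            totals[player][0] += stat * minutes
            totals[player][1] += minutes
        return [(sm / m, MULTI_YEAR_RAPM[p], m) for p, (sm, m) in totals.items()]
    if design == 2:
        # One row per player-season, all sharing the player's multi-year RAPM.
        return [(stat, MULTI_YEAR_RAPM[p], m) for p, _, m, stat in SEASONS]
    if design == 3:
        # One row per player-season vs. that season's own RAPM.
        return [(stat, SINGLE_YEAR_RAPM[(p, yr)], m) for p, yr, m, stat in SEASONS]
    raise ValueError("design must be 1, 2, or 3")
```

Design 1 yields one row per player, while designs 2 and 3 yield one row per player-season and differ only in the target column.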

Crow
Posts: 6146
Joined: Thu Apr 14, 2011 11:10 pm

Re: Best practices for SPM development?

Post by Crow » Mon Mar 25, 2019 10:15 pm

If you are trying to understand a season, do #3. #1 might be good too in some year-weighted fashion. If you are trying to project, probably some version of #1.
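
One possible reading of "year-weighted" is sketched below: pool a player's seasons with weights that fall off with distance from the season of interest. The exponential `decay` parameter and the function name are assumptions; the post does not specify a weighting scheme.

```python
def year_weighted_average(season_stats, target_year, decay=0.7):
    """Average per-season values, down-weighting seasons far from target_year.

    season_stats: list of (year, minutes, value) tuples.
    decay: hypothetical per-year discount; 1.0 gives plain minutes weighting.
    """
    num = den = 0.0
    for year, minutes, value in season_stats:
        weight = minutes * decay ** abs(target_year - year)
        num += weight * value
        den += weight
    return num / den
```

With `decay=1.0` this reduces to the ordinary minutes-weighted multi-year sample in #1.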



My other opinion / advice would be to make sure defense gets equal weight, and probably to display it separately. I would also include something on shot defense rather than leaving it out.

DSMok1
Posts: 902
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Best practices for SPM development?

Post by DSMok1 » Tue Mar 26, 2019 1:40 pm

For BPM I did a sort of hybrid of 1 and 2: I used the individual-season statistics, but aggregated them iteratively within the regression onto the multi-year RAPM. This was computationally challenging, so I used method 1 for feature selection and then the hybrid approach for fine-tuning the coefficients.
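
A rough sketch of one way such a hybrid could be set up follows. It is not the actual BPM code: the feature set, the minutes weighting, the plain least-squares fit, and every function name are assumptions. Features (including a nonlinear interaction) are computed per season, averaged up to the player level, and only then regressed on multi-year RAPM. Because this toy model is linear in its coefficients, the aggregation can be done once up front; a model with nonlinear parameters would need to re-aggregate on each fitting iteration.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def season_features(stat_a, stat_b):
    # Hypothetical features; the interaction is computed per season, which is
    # why season-level inputs differ from pooled multi-year stats.
    return [stat_a, stat_b, stat_a * stat_b]


def aggregate(seasons):
    """Minutes-weighted average of season-level feature vectors for one player.

    seasons: list of (minutes, stat_a, stat_b) tuples.
    """
    total = sum(minutes for minutes, _, _ in seasons)
    feats = [season_features(a, b) for _, a, b in seasons]
    return [sum(s[0] * f[k] for s, f in zip(seasons, feats)) / total
            for k in range(3)]


def fit(players, multi_year_rapm):
    """Least-squares coefficients (no intercept) from the normal equations."""
    X = [aggregate(players[p]) for p in players]
    y = [multi_year_rapm[p] for p in players]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)
```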

Good multi-year RAPM is the key. Single-year RAPM is quite noisy and thus ends up with a lot of statistical shrinkage to reduce that noise, which leaves not much signal to work with.
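
The shrinkage effect can be illustrated with the simplest possible case: ridge regression on a single 0/1 on-court indicator with no other predictors. Real RAPM involves many collinear players, so this is only a simplified sketch, and the penalty value and possession counts are made up.

```python
def ridge_estimate(raw_mean, n_obs, lam):
    """Ridge estimate for one 0/1 indicator with no other predictors.

    Closed form: sum(x*y) / (sum(x*x) + lam) = n_obs * raw_mean / (n_obs + lam).
    """
    return n_obs * raw_mean / (n_obs + lam)


# Same raw +6.0 per 100 possessions, hypothetical penalty of 2000:
one_season = ridge_estimate(6.0, n_obs=5000, lam=2000)    # about 4.29
six_seasons = ridge_estimate(6.0, n_obs=30000, lam=2000)  # 5.625
```

The smaller sample is pulled much further toward zero, so a single-year target carries less of the player's actual signal.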
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1

sbs
Posts: 14
Joined: Fri Oct 19, 2012 7:25 am

Re: Best practices for SPM development?

Post by sbs » Wed Mar 27, 2019 9:48 pm

I think the right answer is whatever ends up being best out-of-sample.

I do #1. However, to get the right SPM blend, I use cross-validation across the seasons within the data set.

So you'll get the following for a 2000-2018 data set, where "(- YYYY)" means that season is excluded:
Multi-Year RAPM 2000-2018 (- 2000) + Box Score 2000-2018 (- 2000)
Multi-Year RAPM 2000-2018 (- 2001) + Box Score 2000-2018 (- 2001)
Multi-Year RAPM 2000-2018 (- 2002) + Box Score 2000-2018 (- 2002)
...
Multi-Year RAPM 2000-2018 (- 2018) + Box Score 2000-2018 (- 2018)

It's computationally expensive to generate box score data and multi-year RAPM by excluding each year, but it allows you to test OOS against individual seasons.
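
The leave-one-season-out loop described above might be structured like this. It is a sketch only: `build_rapm`, `build_box`, `fit_spm`, and `score` are hypothetical callables the caller would supply, and the choice of out-of-sample metric is left open.

```python
def leave_one_season_out(seasons, build_rapm, build_box, fit_spm, score):
    """Return {held_out_season: out-of-sample score}.

    All four callables are hypothetical placeholders:
      build_rapm(train_years) -> multi-year RAPM fit without the held-out year
      build_box(train_years)  -> box score sample without the held-out year
      fit_spm(box, rapm)      -> fitted SPM model
      score(model, year)      -> error of the model on the held-out year
    """
    results = {}
    for held_out in seasons:
        train_years = [s for s in seasons if s != held_out]
        rapm = build_rapm(train_years)  # e.g. RAPM 2000-2018 (- 2000)
        box = build_box(train_years)    # e.g. box score 2000-2018 (- 2000)
        model = fit_spm(box, rapm)
        results[held_out] = score(model, held_out)
    return results
```

Running this once per candidate SPM blend and comparing the average held-out score is one way to pick the blend; the expensive part is that RAPM has to be refit for every held-out season.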
