Predictive test using lineups (Updated with results)
Using matchup files, I am going to run a test of the predictive value of 6 different stats (BPM, WS/48, PI RAPM, NPI RAPM, xRAPM, and WP). Players with no previous experience get a replacement-level score (-2, or 0.040 for WS and WP).
Here is the spreadsheet with my data. You will probably have to download the file to view it. It's a huge file.
https://drive.google.com/file/d/0ByRxUN ... sp=sharing
What is an effective way for me to run the test so that I can determine the predictive value of these stats? Once I figure that out, I will run a test for 1 year, 2 year, and 3 year predictiveness.
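One way the setup described above could be sketched: rate each lineup by summing its players' ratings from an earlier season, falling back to the replacement-level value for players with no prior rating. The function name, dictionary keys, and sample numbers below are illustrative assumptions, not the contents of the spreadsheet.

```python
# Replacement-level defaults as stated in the post: -2 for the
# plus-minus style stats, 0.040 for WS/48 and WP. The key names
# are illustrative, not taken from the spreadsheet.
REPLACEMENT_LEVEL = {
    "BPM": -2.0, "PI RAPM": -2.0, "NPI RAPM": -2.0, "xRAPM": -2.0,
    "WS/48": 0.040, "WP": 0.040,
}


def lineup_rating(players, prior_ratings, stat):
    """Sum the prior-season ratings of the players in one lineup.

    prior_ratings maps player name -> rating from an earlier season.
    A player missing from it (no previous experience) gets the
    replacement-level value for that stat.
    """
    default = REPLACEMENT_LEVEL[stat]
    return sum(prior_ratings.get(p, default) for p in players)


# Made-up prior-season numbers for four veterans plus one rookie.
prior_2010 = {"A": 1.0, "B": 2.0, "C": -1.0, "D": 0.5}
lineup = ["A", "B", "C", "D", "Rookie"]
predicted = lineup_rating(lineup, prior_2010, "xRAPM")
```

The predicted value for each lineup would then be compared with that lineup's actual net points per 100 possessions in the later season.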
UPDATE:
Here are my results. This is lineups with 500+ possessions played in a season.
Year 1
WS 0.489
xRAPM 0.480
BPM 0.463
RAPM 0.453
WP 0.430
NPI 0.412
Year 2
BPM 0.366
RAPM 0.359
xRAPM 0.352
WP 0.342
WS 0.339
NPI 0.290
Year 3
RAPM 0.472
NPI 0.432
xRAPM 0.407
WS 0.401
BPM 0.384
WP 0.339
Average R value from those 3 years:
Total
RAPM 0.428
xRAPM 0.413
WS 0.410
BPM 0.405
NPI 0.378
WP 0.370
It looks like RAPM and xRAPM fared well. Some of the results do look weird, like year 3 being more predictive than year 2, but that might be a function of the small sample size. The R values also change a bit if I look at net points instead of net points per 100. WP finished last; it did awfully in year N-3.
Last edited by colts18 on Fri Feb 13, 2015 5:17 pm, edited 2 times in total.
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
As others have made clear in previous threads on predictions, the tests should be completely out of sample, which (hopefully) includes the sample on which the metric was calculated. This applies primarily to BPM and any RAPM with a prior, since they'll carry information over from one season to another. If you can get data from the ongoing season, the easiest thing would be to just take what you have and start predicting upcoming games as they happen.
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
xkonk wrote: As others have made clear in previous threads on predictions, the tests should be completely out of sample, which (hopefully) includes the sample on which the metric was calculated. This applies primarily to BPM and any RAPM with a prior, since they'll carry information over from one season to another. If you can get data from the ongoing season, the easiest thing would be to just take what you have and start predicting upcoming games as they happen.

The only stat that would apply to is BPM, because it was tested in sample, but it was a 14 year sample. The test I'm running is completely out of sample. For example, I am testing 2011 data using 2010, 2009, and 2008 RAPM/WS/WP/etc.
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
While you are at it, could you also run some blends? Of BPM and some form(s) of RAPM, with BPM at a 75%, 50%, and perhaps a lower weight. Even an evenly weighted blend of all of them would be interesting to see. The more the better to me in search of the best performer.
Have you seen the work of Alex, the sports skeptic?
The last part of his study https://sportskeptic.wordpress.com/2012 ... ect-blend/ Also see parts 1 and 2.
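A minimal sketch of the kind of blend being asked about, assuming BPM and RAPM are both expressed in points per 100 possessions so they can be mixed directly. The function names and the example ratings are made up for illustration.

```python
def blend(bpm, rapm, w_bpm):
    """Linear blend of two ratings on the same scale.

    w_bpm is the weight on BPM; RAPM gets the remaining 1 - w_bpm.
    """
    if not 0.0 <= w_bpm <= 1.0:
        raise ValueError("w_bpm must be between 0 and 1")
    return w_bpm * bpm + (1.0 - w_bpm) * rapm


def equal_blend(ratings):
    """Evenly weighted average of several ratings for one player."""
    return sum(ratings) / len(ratings)


# Hypothetical player with BPM +4.0 and RAPM +2.0.
player = {"BPM": 4.0, "RAPM": 2.0}
blends = {w: blend(player["BPM"], player["RAPM"], w)
          for w in (0.75, 0.50, 0.25)}
```

WS/48 and WP are not on a points-per-100 scale, so an even blend of all six stats would need them rescaled first; how to do that is a separate choice this sketch does not make.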
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
Here is the N-1 test. This is lineups with 250+ possessions and 500+ possessions (R values).
Stat 250+ 500+
xRAPM 0.346 0.480
RAPM 0.331 0.453
BPM 0.331 0.463
WP 0.311 0.430
WS 0.311 0.426
NPI 0.295 0.412
I'm working on N-2 and N-3.
xRAPM looks the best by this methodology.
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
xRAPM is an integral blend of sorts. I still think further blending with it might beat it.
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
colts18 wrote: The test I'm running is completely out of sample. For example, I am testing 2011 Data using 2010, 2009, 2008 RAPM/WS/WP/etc.

That's not out of sample. BPM was found by regression onto J.E.'s 14 year RAPM. The 14 year RAPM includes 2011. So by definition 2011 is in sample whether you're using 2009 BPM or 1978 BPM or whatever.
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
jbrocato23 wrote: That's not out of sample. BPM was found by regression onto J.E.'s 14 year RAPM. The 14 year RAPM includes 2011. So by definition 2011 is in sample whether you're using 2009 BPM or 1978 BPM or whatever.

I admitted that BPM was in sample. RAPM is definitely not in sample. I'm not using 14 year RAPM. I'm using the yearly Prior Informed RAPM (no boxscore info) from J.E.
Re: Predictive test using lineups (RAPM, WP, WS, BPM, etc.)
I updated the OP with my results but I'm not 100% sure that I did it exactly right.
Re: Predictive test using lineups (Updated with results)
WinShares went from next to last to best in the second run of year 1?
And just to be certain, are these R values or R squared?
Re: Predictive test using lineups (Updated with results)
Crow wrote: WinShares went from next to last to best in the second run of year 1?
And just to be certain, are these R values or R squared?

Those are R values. For some reason, Win Shares was giving me some odd values. I used weighted possessions for my calculations. I correlated net differential to the weighted sum of each stat, then to the weighted average. For every other stat the correlation was the same either way (meaning it didn't matter whether I used the weighted average, because the sum would work fine), but it did not do that for Win Shares for some reason.
I might have to run this again with a different methodology because I'm not sure if my results can fully pass the smell test.
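A possible sanity check for the sum-versus-average issue, under one reading of the method: each lineup's rating is the sum (or mean) of its five players' ratings, and lineups are weighted by possessions when correlating against net differential. The `weighted_pearson` helper and the numbers are assumptions for illustration, not the calculation actually used in the spreadsheet.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y with observation weights w."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y)
              for wi, xi, yi in zip(w, x, y)) / total
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x)) / total
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(var_x * var_y)


# Made-up lineups: summed player ratings, actual net per 100, possessions.
rating_sum = [2.5, -1.0, 4.0, 0.5]
net_per_100 = [3.0, -2.0, 6.0, 1.0]
possessions = [520, 610, 900, 750]

# With five players in every lineup, the average is just the sum / 5.
rating_avg = [s / 5 for s in rating_sum]

r_sum = weighted_pearson(rating_sum, net_per_100, possessions)
r_avg = weighted_pearson(rating_avg, net_per_100, possessions)
```

Correlation does not change when one variable is multiplied by a positive constant, so under this reading `r_sum` and `r_avg` should match for every stat. If they differ for one stat only, it may be worth checking whether that column's sum and average were built from the same five values per lineup.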
Re: Predictive test using lineups (Updated with results)
With those R values none of the player based metrics are doing much good predicting lineup productivity, so perhaps lineups are worth more analysis directly to understand their unit behavior. It's lineup productivity that wins games.
Or perhaps lineup productivity is too chaotic and/or hindered by smallish samples (even at 500 possessions) and the focus should be back on sum of players, linear or close to it. Got to test to have a basis for deciding.
How much stronger are the correlations at 1000 or 1500 possessions? I know few lineups get that high in a season but if they are more predictable, some teams might benefit from going with more time in such (good) lineups over rolling dice. Rolling up lineups into multi-season bundles would be taking the research in a different direction but is probably worthwhile.
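The threshold question could be checked with something like the sketch below, which recomputes R for each minimum-possession cutoff and reports how many lineups survive it. The tuple layout `(possessions, predicted, actual)` and the sample rows are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson R; returns None when it is undefined."""
    n = len(x)
    if n < 2:
        return None
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def r_by_threshold(lineups, thresholds):
    """Map each cutoff to (number of lineups kept, R among them).

    lineups is a list of (possessions, predicted, actual) tuples.
    """
    out = {}
    for t in thresholds:
        kept = [(p, a) for poss, p, a in lineups if poss >= t]
        xs = [p for p, _ in kept]
        ys = [a for _, a in kept]
        out[t] = (len(kept), pearson(xs, ys))
    return out


# Made-up rows: (possessions, predicted rating, actual net per 100).
sample = [(300, 1.0, 2.0), (600, 2.0, 1.0), (800, 3.0, 3.0),
          (1200, 4.0, 5.0), (1600, 5.0, 6.0)]
results = r_by_threshold(sample, (250, 1000, 2000))
```

Reporting the lineup count next to each R matters here, since very few lineups reach 1000 or 1500 possessions and an R from a handful of lineups is not very informative.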
Re: Predictive test using lineups (Updated with results)
For the record, my player productivity findings were, in short:
1. xRAPM
2. BPM
3. NPI-RAPM
4. WS
5. MPG
Tested years didn't have any impact on the order.