RAPM request thread

A Gravity Well
Posts: 24
Joined: Mon Jun 23, 2014 1:38 am

Re: RAPM request thread

Post by A Gravity Well »

J.E. wrote:
A Gravity Well wrote:I had then used a number of different columns representing different score differentials (hometeamup4, hometeamup9, etc)
This should work. One column for each possible score differential, although you might want to put all situations with e.g. score_diff>30 into one bucket
Makes sense. With a big enough data set, could probably combine a few columns into different buckets (14-17, 18-20, etc) if their values are roughly equal.
For HCA one column should be enough, which is simply switched on during home team offensive possessions
Is there no HCA influence on defensive possessions?
Alternatively you could adjust the results vector Y by 'average_home_eff_per_possession' and 'average_away_eff_per_possession'.
So, if the home team scores 3 points, the result turns into, say, 3-1.07 = 1.93.
If the away team scores 3 points, the result turns into 3-1.04 = 1.96

Having a centered Y is good to have, anyway
Hmm. I'll put some thought into that. I never considered doing it like that.
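
For reference, a minimal sketch of the two options discussed above: indicator columns for score differential plus a single HCA column, or a centered Y. The column names, the cap of 30, and the helper functions are made up for illustration; the 1.07 and 1.04 averages are just the example values from the quote, and score_diff is assumed to be the home team's lead.

```python
def score_diff_column(score_diff, cap=30):
    """Name of the indicator column for a given score differential.

    Differentials beyond +/-cap share one bucket per side (assumed cutoff).
    """
    if score_diff > cap:
        return f"diff_gt{cap}"
    if score_diff < -cap:
        return f"diff_lt-{cap}"
    return f"diff_{score_diff:+d}"


def context_columns(home_on_offense, score_diff, cap=30):
    """Non-player columns switched on for one possession, as {name: value}."""
    cols = {score_diff_column(score_diff, cap): 1.0}
    if home_on_offense:
        cols["hca"] = 1.0  # single HCA column, on for home offensive possessions
    return cols


def center_points(points, home_on_offense, avg_home=1.07, avg_away=1.04):
    """Alternative to the HCA column: subtract the average points per possession."""
    return points - (avg_home if home_on_offense else avg_away)


print(context_columns(True, 4))    # home offense, home team up 4
print(context_columns(False, 45))  # away offense, blowout bucket
print(round(center_points(3, True), 2))
```

These dicts would be merged with the usual player columns when building each row of the design matrix.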
Nate
Posts: 132
Joined: Tue Feb 24, 2015 2:35 pm

Re: RAPM request thread

Post by Nate »

Let's take a step back for a moment.

When you talk about 'adding HCA' are you talking about estimating how individual players' performance changes with home/away games, or trying to produce RAPM numbers that are 'controlled for' home court effects?
This should work. One column for each possible score differential, although you might want to put all situations with e.g. score_diff>30 into one bucket
So the assumption is that every team is affected by point differentials in the same way?
A Gravity Well
Posts: 24
Joined: Mon Jun 23, 2014 1:38 am

Re: RAPM request thread

Post by A Gravity Well »

Nate wrote:Let's take a step back for a moment.

When you talk about 'adding HCA' are you talking about estimating how individual players' performance changes with home/away games, or trying to produce RAPM numbers that are 'controlled for' home court effects?
The latter. Home court effects, score effects, perhaps even time effects.
Nate
Posts: 132
Joined: Tue Feb 24, 2015 2:35 pm

Re: RAPM request thread

Post by Nate »

A Gravity Well wrote:
Nate wrote: When you talk about 'adding HCA' are you talking about estimating how individual players' performance changes with home/away games, or trying to produce RAPM numbers that are 'controlled for' home court effects?
The latter. Home court effects, score effects, perhaps even time effects.
Ah, then I misunderstood what you were asking for. Having one variable per score differential does make sense in that case, though you should probably look at those coefficients to see if anything unexpected is going on there.
A Gravity Well
Posts: 24
Joined: Mon Jun 23, 2014 1:38 am

Re: RAPM request thread

Post by A Gravity Well »

Priors. The problem is priors.

Apparently there's no way to use a prior if you're using glmnet? Well, that's not entirely true -- apparently you can use a prior by including more rows -- from the previous season or seasons, or of a pre-baked SPM -- and weight them differently? How does that look?

Say a player plays 3000 offensive possessions in a season. Each of these is weighted equally -- or perhaps not, with considerations for recency if you're building a model. But for our purposes right now, let's say they are weighted equally. Where every other possession is weighted 1, would the SPM row then have a weight of 3000 for a 50/50 split? 3000 each for a player's SPM and a player's cumulative possessions?

This strikes me as equally clunky and elegant. A patchwork solution, the sum of hard-pressed creativity.

Is there another way to do this? Any examples?
Nate
Posts: 132
Joined: Tue Feb 24, 2015 2:35 pm

Re: RAPM request thread

Post by Nate »

A Gravity Well wrote:Priors. The problem is priors.

Apparently there's no way to use a prior if you're using glmnet? Well, that's not entirely true -- apparently you can use a prior by including more rows -- from the previous season or seasons, or of a pre-baked SPM -- and weight them differently? How does that look?
...
Is there another way to do this? Any examples?
Ridge regression incorporates some types of assumptions: it basically pulls everyone 'toward the average'. You can also take the result of the linear regression and feed it into some secondary process. Another option is to use Bayesian inference.
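
A rough sketch of how the 'extra rows' idea from the question relates to ridge, assuming numpy and synthetic data (none of this is glmnet's API, and the function names are made up). Adding one pseudo-row per player, with that player's column set to 1, the prior value as the result, and a weight w, gives the same answer as a ridge penalty of w that pulls toward the prior instead of toward zero:

```python
import numpy as np


def fit_with_prior_rows(X, y, prior, prior_weight):
    """Weighted least squares with one pseudo-row per player encoding the prior."""
    n_players = X.shape[1]
    X_aug = np.vstack([X, np.eye(n_players)])   # pseudo-row j has a 1 for player j
    y_aug = np.concatenate([y, prior])          # its "result" is the prior value
    w = np.concatenate([np.ones(len(y)), np.full(n_players, prior_weight)])
    sw = np.sqrt(w)                             # WLS via row scaling by sqrt(weight)
    beta, *_ = np.linalg.lstsq(X_aug * sw[:, None], y_aug * sw, rcond=None)
    return beta


def fit_ridge_toward_prior(X, y, prior, lam):
    """Closed-form ridge that shrinks toward `prior` instead of zero."""
    n_players = X.shape[1]
    A = X.T @ X + lam * np.eye(n_players)
    return np.linalg.solve(A, X.T @ y + lam * prior)


# Synthetic possessions: +1 offense, -1 defense, 0 off the floor (illustrative only).
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 0.0, 1.0], size=(200, 6))
true_beta = np.array([3.0, 1.0, 0.0, -1.0, -2.0, 0.5])
y = X @ true_beta + rng.normal(scale=1.0, size=200)
prior = np.array([2.0, 0.0, 0.0, 0.0, -1.0, 0.0])  # stand-in for an SPM-style prior

beta_rows = fit_with_prior_rows(X, y, prior, prior_weight=50.0)
beta_ridge = fit_ridge_toward_prior(X, y, prior, lam=50.0)
print(np.round(beta_rows, 3))
print(np.round(beta_ridge, 3))
```

On the 50/50 question: in the simplest case, where a player's 0/1 column is uncorrelated with every other column, a prior weight equal to his possession count would give an even split between prior and data. With the correlated columns you get in real lineup data, the split is only approximate.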
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: RAPM request thread

Post by DSMok1 »

Nate wrote: Ridge regression incorporates some types of assumptions: it basically pulls everyone 'toward the average'. You can also take the result of the linear regression and feed it into some secondary process.
Unfortunately, when you're trying to solve multicollinearity issues, this second method really doesn't help at all. There will still be the major random errors.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
fpliii
Posts: 85
Joined: Fri May 10, 2013 1:38 pm

Re: RAPM request thread

Post by fpliii »

J.E. - It's probably too early for a complete 15-16 dataset (I think last year the first full set was through the all-star break), but how are these guys doing in overall NPI RAPM for the season:

Russell Westbrook
Stephen Curry
Kawhi Leonard
LeBron James
Draymond Green
Kevin Durant
James Harden
Chris Paul
Anthony Davis

?

Also, how about Tim Duncan, DeAndre Jordan, Andre Drummond, Pau Gasol, Hassan Whiteside in single season DRAPM?

Thanks as always for your work.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: RAPM request thread

Post by J.E. »

What's a "full"/"complete" dataset? Isn't a season only complete once the season is over?

Anyway, I posted this https://t.co/C3sHzpJokl on Twitter a couple of hours ago
fpliii
Posts: 85
Joined: Fri May 10, 2013 1:38 pm

Re: RAPM request thread

Post by fpliii »

J.E. wrote:What's a "full"/"complete" dataset? Isn't a season only complete once the season is over?

Anyway, I posted this https://t.co/C3sHzpJokl on Twitter a couple of hours ago
Sorry, I meant full/complete as in for all players in the league (as opposed to a handful).

Thanks, I hadn't checked your twitter. Appreciate it.
Crow
Posts: 10533
Joined: Thu Apr 14, 2011 11:10 pm

Re: RAPM request thread

Post by Crow »

So Westbrook is estimated at plus 3.2 on NPI RAPM and plus 11 on RPM??


NPI RAPM has none estimated over plus 6, only 4 over plus 4 and 17 over plus 3. RPM has 9 over plus 6, 21 over plus 4 and 40 over plus 3.

In comparing the two, it sorta looks like RPM is reversing the impact of simple regularization. But J.E., what is your explanation or commentary?
Crow
Posts: 10533
Joined: Thu Apr 14, 2011 11:10 pm

Re: RAPM request thread

Post by Crow »

I am starting to check actual lineup performance vs. sum of player estimates for each metric. I would think a comprehensive comparison would be useful. Has this been done or does anyone with the chops to organize it want to check the correlations? What are the arguments against this test or considering it a "validity" test? I know lineups will have both signal and noise. Would a comprehensive test allow us to compute estimates of average signal and noise strength, overall and by lineup minute size? If yes, those could perhaps help in the guesswork study of lineup performance, at least for the brave / eager / willing to try (vs. not trying)?
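
A minimal sketch of what such a check could look like, assuming the simplest possible version: predict each lineup as the plain sum of its players' estimates and correlate that with observed lineup net rating, weighting by possessions. The player labels, estimates, and lineup numbers below are hypothetical placeholders, not real data, and a fuller test would also account for the opposing lineups.

```python
from math import sqrt


def predicted_lineup_rating(lineup, player_estimates):
    """Sum of individual player estimates for the players in a lineup."""
    return sum(player_estimates[p] for p in lineup)


def weighted_corr(xs, ys, ws):
    """Pearson correlation with per-observation weights (e.g. possessions)."""
    total = sum(ws)
    mx = sum(w * x for x, w in zip(xs, ws)) / total
    my = sum(w * y for y, w in zip(ys, ws)) / total
    cov = sum(w * (x - mx) * (y - my) for x, y, w in zip(xs, ys, ws))
    vx = sum(w * (x - mx) ** 2 for x, w in zip(xs, ws))
    vy = sum(w * (y - my) ** 2 for y, w in zip(ys, ws))
    return cov / sqrt(vx * vy)


# Hypothetical per-player estimates (points per 100 possessions).
player_estimates = {"P1": 3.0, "P2": 1.5, "P3": -0.5, "P4": 0.0,
                    "P5": -2.0, "P6": 1.0, "P7": -1.0}

# Hypothetical lineups: (players, observed net rating, possessions played).
lineups = [
    (("P1", "P2", "P3", "P4", "P5"), 4.1, 600),
    (("P1", "P2", "P4", "P6", "P7"), 2.0, 250),
    (("P3", "P4", "P5", "P6", "P7"), -6.3, 120),
    (("P2", "P3", "P5", "P6", "P7"), 1.2, 80),
]

predicted = [predicted_lineup_rating(players, player_estimates) for players, _, _ in lineups]
observed = [net for _, net, _ in lineups]
possessions = [poss for _, _, poss in lineups]

print(weighted_corr(predicted, observed, possessions))
```

Running the same comparison within bands of lineup possessions would be one way to see how the agreement changes with sample size.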
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: RAPM request thread

Post by J.E. »

Crow wrote:In comparing the two, it sorta looks like RPM is reversing the impact of simple regularization. But J.E., what is your explanation or commentary?
This is not a case of reversing the impact of regularization. RAPM with less regularization (lower lambda) would lead to significantly different (worse, obviously) results. You'd see low MP players getting "crazy" estimates, for one thing

The estimates in single year RAPM are low because there's not a lot of data - at this point of the season, anyway.

(Single year) RAPM is a significantly worse metric than RPM. I just post results because some people want to see it
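
To illustrate the lambda point, here is a small synthetic sketch, assuming numpy and a plain closed-form ridge (not the actual RAPM pipeline). Every player's true impact is set to zero and the last player is on the floor for only 20 possessions, so any nonzero estimate is noise. The lambda values are arbitrary.

```python
import numpy as np


def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^-1 X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)


rng = np.random.default_rng(1)
n_poss, n_players = 2000, 8

# Each player is on the floor for roughly half of the possessions...
X = (rng.random((n_poss, n_players)) < 0.5).astype(float)
# ...except the last one, who only appears in the first 20.
X[:, -1] = 0.0
X[:20, -1] = 1.0

# True impact is zero for everyone, so the results are pure noise.
y = rng.normal(scale=1.1, size=n_poss)

lambdas = [1.0, 100.0, 2000.0]
fits = [ridge(X, y, lam) for lam in lambdas]
norms = [float(np.linalg.norm(b)) for b in fits]

for lam, b, norm in zip(lambdas, fits, norms):
    print(f"lambda={lam:7.1f}  low-possession player: {b[-1]:+.3f}  norm: {norm:.3f}")
```

The overall size of the coefficient vector shrinks as lambda grows, which is the sense in which early-season single year estimates come out low across the board.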
Crow
Posts: 10533
Joined: Thu Apr 14, 2011 11:10 pm

Re: RAPM request thread

Post by Crow »

Just one case, but Westbrook's estimates somewhat confuse / concern me. By NPI RAPM his estimates are virtually unchanged, going from plus 2.9 last season to plus 3.2 this season. But RPM jumps from plus 7 to plus 11. NPI RAPM and RPM differ by about 8 points this season; being early in the season may affect that, but there was still a fairly large 4-point difference in the full-season numbers last season. His prior is probably more helpful this season than last, and his box score is better, but is his team impact better this year? If it is, why isn't the NPI RAPM significantly better? If it is the small sample, does that imply we should expect Westbrook's NPI RAPM to go up as this season progresses? If it does, will his RPM go even higher too?

More thinking / checking to do.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: RAPM request thread

Post by J.E. »

Somewhat relevant: It may be important to note that the extremely good/bad players are those that NPI RAPM probably has the most trouble getting right

In a way, it doesn't expect players to be "that good", because, *generally*, they aren't. But sometimes, a player really is that good