RAPM request thread

A Gravity Well · Post by **A Gravity Well** » Fri Oct 16, 2015 10:35 pm

J.E. wrote:
A Gravity Well wrote:I had then used a number of different columns representing different score differentials (hometeamup4, hometeamup9, etc)
This should work. One column for each possible score differential, although you might want to put all situations with e.g. score_diff>30 into one bucket

Makes sense. With a big enough data set, you could probably combine a few columns into buckets (14-17, 18-20, etc.) if their values are roughly equal.
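A minimal sketch of the bucketing idea discussed above, assuming one indicator column per observed score differential and a single shared bucket beyond a cap. The function names and the cap of 30 are illustrative choices, not anything taken from J.E.'s actual pipeline.

```python
import numpy as np

def score_diff_bucket(diff, cap=30):
    """Clamp a score differential so everything beyond +/-cap shares one bucket."""
    return max(-cap, min(cap, diff))

def one_hot_score_diff(diffs, cap=30):
    """Return (matrix, labels): one indicator column per clamped differential."""
    clamped = [score_diff_bucket(d, cap) for d in diffs]
    labels = sorted(set(clamped))
    col = {value: j for j, value in enumerate(labels)}
    X = np.zeros((len(diffs), len(labels)))
    for i, value in enumerate(clamped):
        X[i, col[value]] = 1.0
    return X, labels

# Hypothetical possessions: home-team lead at the start of each possession.
diffs = [4, 9, -3, 35, 42, 4]
X, labels = one_hot_score_diff(diffs)
print(labels)         # bucket label for each column
print(X.sum(axis=0))  # number of possessions falling in each bucket
```

These columns would sit next to the player columns in the design matrix. Merging neighbouring buckets (14-17, 18-20, and so on) amounts to mapping several clamped values to the same column index.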

For HCA one column should be enough, which is simply switched on during home team offensive possessions

Is there no HCA influence on defensive possessions?

Alternatively you could adjust the results vector Y by 'average_home_eff_per_possession' and 'average_away_eff_per_possession'.
So, if the home team scores 3 points, the result turns into, say, 3-1.07 = 1.93.
If the away team scores 3 points, the result turns into 3-1.04 = 1.96

Having a centered Y is good to have, anyway

Hmm. I'll put some thought into that. I never considered doing it like that.
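The centering J.E. describes can be sketched as follows. The two averages are the example figures from his post (1.07 and 1.04), not measured league values, and `center_result` is a hypothetical helper name.

```python
# Example per-possession averages quoted in the post (illustrative only).
AVG_HOME_EFF = 1.07
AVG_AWAY_EFF = 1.04

def center_result(points, home_on_offense,
                  home_avg=AVG_HOME_EFF, away_avg=AVG_AWAY_EFF):
    """Subtract the average efficiency of the side that has the ball."""
    return points - (home_avg if home_on_offense else away_avg)

home = center_result(3, home_on_offense=True)    # 3 - 1.07
away = center_result(3, home_on_offense=False)   # 3 - 1.04
print(round(home, 2), round(away, 2))
```

With Y centered this way the home/away gap is absorbed before the regression runs, so no separate HCA column is needed for it.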

Nate · Post by **Nate** » Sat Oct 17, 2015 12:14 am

Let's take a step back for a moment.

When you talk about 'adding HCA' are you talking about estimating how individual players' performance changes with home/away games, or trying to produce RAPM numbers that are 'controlled for' home court effects?

This should work. One column for each possible score differential, although you might want to put all situations with e.g. score_diff>30 into one bucket

So the assumption is that every team is affected by point differentials in the same way?

A Gravity Well · Post by **A Gravity Well** » Sat Oct 17, 2015 3:09 am

Nate wrote:Let's take a step back for a moment.

When you talk about 'adding HCA' are you talking about estimating how individual players' performance changes with home/away games, or trying to produce RAPM numbers that are 'controlled for' home court effects?

The latter. Home court effects, score effects, perhaps even time effects.

Nate · Post by **Nate** » Sat Oct 17, 2015 2:30 pm

A Gravity Well wrote:
Nate wrote: When you talk about 'adding HCA' are you talking about estimating how individual players' performance changes with home/away games, or trying to produce RAPM numbers that are 'controlled for' home court effects?
The latter. Home court effects, score effects, perhaps even time effects.

Ah, then I misunderstood what you were asking for. Having one variable per score differential does make sense in that case, though you should probably look at those coefficients to see if anything unexpected is going on there.

A Gravity Well · Post by **A Gravity Well** » Mon Oct 19, 2015 4:15 am

Priors. The problem is priors.

Apparently there's no way to use a prior if you're using glmnet? Well, that's not entirely true -- apparently you can use a prior by including more rows -- from the previous season or seasons, or of a pre-baked SPM -- and weight them differently? How does that look?

Say a player plays 3000 offensive possessions in a season. Each of these is weighted equally -- or perhaps not, with considerations for recency if you're building a model. But for our purposes right now, let's say they are weighted equally. If every possession is weighted 1, would the SPM row then need a weight of 3000 for a 50/50 split? 3000 for the player's SPM and 3000 for the player's cumulative possessions?

This strikes me as equally clunky and elegant. A patchwork solution, the sum of hard-pressed creativity.

Is there another way to do this? Any examples?

Nate · Post by **Nate** » Mon Oct 19, 2015 5:03 pm

A Gravity Well wrote:Priors. The problem is priors.

Apparently there's no way to use a prior if you're using glmnet? Well, that's not entirely true -- apparently you can use a prior by including more rows -- from the previous season or seasons, or of a pre-baked SPM -- and weight them differently? How does that look?
...
Is there another way to do this? Any examples?

Ridge regression incorporates some types of assumptions: it basically pulls everyone 'toward the average'. You can also take the result of the linear regression and feed it into some secondary process. Another option is to use Bayesian inference.
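One way to picture the "extra rows" idea from the question alongside ridge's pull toward the average: shrinking toward a prior instead of toward zero can be done either by fitting the residual `y - X @ prior` and adding the prior back, or by appending one pseudo-row per player weighted by `sqrt(lam)`. The sketch below uses plain NumPy on random made-up data and checks that the two routes agree. It is not glmnet code, and all names are illustrative.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge: argmin ||y - Xb||^2 + lam * ||b||^2."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def ridge_with_prior(X, y, lam, prior):
    """Shrink toward `prior` by fitting the residual y - X @ prior."""
    return prior + ridge(X, y - X @ prior, lam)

def ridge_via_pseudo_rows(X, y, lam, prior):
    """Same estimate via least squares on data augmented with one
    pseudo-row per player, scaled by sqrt(lam) and targeting the prior."""
    p = X.shape[1]
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    y_aug = np.concatenate([y, np.sqrt(lam) * prior])
    return np.linalg.lstsq(X_aug, y_aug, rcond=None)[0]

# Made-up data: 200 possessions, 5 players, a hypothetical SPM-style prior.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 5)).astype(float)
y = rng.normal(size=200)
prior = np.array([2.0, 0.0, -1.0, 0.5, 0.0])

a = ridge_with_prior(X, y, lam=50.0, prior=prior)
b = ridge_via_pseudo_rows(X, y, lam=50.0, prior=prior)
print(np.allclose(a, b))
```

In the simplified case of a single player with no teammates or opponents in the model, the estimate reduces to `(n * raw + lam * prior) / (n + lam)`, so a prior weight of 3000 against 3000 possessions would indeed be a 50/50 blend. With real, collinear lineup data the blend is not that clean.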

DSMok1 · Post by **DSMok1** » Thu Oct 22, 2015 1:01 pm

Nate wrote: Ridge regression incorporates some types of assumptions: it basically pulls everyone 'toward the average'. You can also take the result of the linear regression and feed it into some secondary process.

Unfortunately, when you're trying to solve multicollinearity issues, this second method really doesn't help at all. There will still be the major random errors.

fpliii · Post by **fpliii** » Mon Jan 04, 2016 3:52 pm

J.E. - It's probably too early for a complete 15-16 dataset (I think last year the first full set was through the all-star break), but how are these guys doing in overall NPI RAPM for the season:

Russell Westbrook
Stephen Curry
Kawhi Leonard
LeBron James
Draymond Green
Kevin Durant
James Harden
Chris Paul
Anthony Davis

?

Also, how about Tim Duncan, DeAndre Jordan, Andre Drummond, Pau Gasol, Hassan Whiteside in single season DRAPM?

Thanks as always for your work.

J.E. · Post by **J.E.** » Mon Jan 04, 2016 4:13 pm

What's a "full"/"complete" dataset? Isn't a season only complete when the season is over?

Anyway, I posted this https://t.co/C3sHzpJokl on Twitter a couple of hours ago

fpliii · Post by **fpliii** » Mon Jan 04, 2016 4:14 pm

J.E. wrote:What's a "full"/"complete" dataset? Isn't a season only complete when the season is over?

Anyway, I posted this https://t.co/C3sHzpJokl on Twitter a couple of hours ago

Sorry, I meant full/complete as in for all players in the league (as opposed to just a handful).

Thanks, I hadn't checked your twitter. Appreciate it.

Crow · Post by **Crow** » Mon Jan 04, 2016 6:26 pm

So Westbrook is estimated at plus 3.2 on NPI RAPM and plus 11 on RPM??

NPI RAPM has none estimated over plus 6, only 4 over plus 4 and 17 over plus 3. RPM has 9 over plus 6, 21 over plus 4 and 40 over plus 3.

In comparing the two, it sorta looks like RPM is reversing the impact of simple regularization. But JE what is your explanation or commentary?

Crow · Post by **Crow** » Mon Jan 04, 2016 7:09 pm

I am starting to check actual lineup performance vs. sum of player estimates for each metric. I would think a comprehensive comparison would be useful. Has this been done or does anyone with the chops to organize it want to check the correlations? What are the arguments against this test or considering it a "validity" test? I know lineups will have both signal and noise. Would a comprehensive test allow us to compute estimates of average signal and noise strength, overall and by lineup minute size? If yes, those could perhaps help in the guesswork study of lineup performance, at least for the brave / eager / willing to try (vs. not trying)?
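A rough sketch of how such a comparison could be organized: predict each lineup's net rating as the sum of its five players' estimates, then correlate prediction with observed performance, weighting by possessions so small-sample lineups count for less. All ratings, lineups, and numbers below are made up for illustration, and opponent strength is ignored.

```python
import numpy as np

def lineup_prediction(player_ratings, lineup):
    """Predicted lineup net rating: sum of the individual player estimates."""
    return sum(player_ratings[p] for p in lineup)

def weighted_corr(pred, actual, weights):
    """Pearson correlation with per-lineup weights (e.g. possessions)."""
    pred, actual, w = (np.asarray(a, dtype=float) for a in (pred, actual, weights))
    w = w / w.sum()
    mp, ma = np.sum(w * pred), np.sum(w * actual)
    cov = np.sum(w * (pred - mp) * (actual - ma))
    var_p = np.sum(w * (pred - mp) ** 2)
    var_a = np.sum(w * (actual - ma) ** 2)
    return cov / np.sqrt(var_p * var_a)

# Hypothetical player estimates (points per 100 possessions).
ratings = {"A": 3.0, "B": 1.0, "C": -2.0, "D": 0.5, "E": -1.0, "F": 2.0}

# Hypothetical lineups: (players, observed net rating, possessions played).
lineups = [
    (("A", "B", "C", "D", "E"), 4.0, 300),
    (("A", "B", "C", "D", "F"), 6.5, 150),
    (("B", "C", "D", "E", "F"), -1.0, 80),
]

pred = [lineup_prediction(ratings, players) for players, _, _ in lineups]
actual = [net for _, net, _ in lineups]
poss = [n for _, _, n in lineups]
r = weighted_corr(pred, actual, poss)
print(pred, r)
```

For this to say anything about validity, the lineup results should come from games the player estimates were not fitted on. Otherwise the regression is partly being checked against its own inputs.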

J.E. · Post by **J.E.** » Mon Jan 04, 2016 7:50 pm

Crow wrote:In comparing the two, it sorta looks like RPM is reversing the impact of simple regularization. But JE what is your explanation or commentary?

This is not a case of reversing the impact of regularization. RAPM with less regularization (lower lambda) would lead to significantly different (worse, obviously) results. You'd see low MP players getting "crazy" estimates, for one thing

The estimates in single-year RAPM are low because there's not a lot of data - at this point of the season, anyway.

(Single year) RAPM is a significantly worse metric than RPM. I just post results because some people want to see it
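Both effects described above (small samples producing low estimates, and a lower lambda producing "crazy" low-MP estimates) show up in the simplest possible case: a single player whose column is 1 on each of his `n` possessions. The ridge estimate is then the raw average scaled by `n / (n + lam)`. The numbers below are invented for illustration, and real RAPM, with many collinear players, will not follow this formula exactly.

```python
def shrunk_estimate(raw_impact, n_possessions, lam):
    """One-player ridge estimate: raw average scaled by n / (n + lam)."""
    return raw_impact * n_possessions / (n_possessions + lam)

# Same hypothetical +6 player, early in the season vs. a full season.
early = shrunk_estimate(6.0, n_possessions=2000, lam=3000)
full = shrunk_estimate(6.0, n_possessions=6000, lam=3000)

# A low-minute player with a noisy raw +20 over 100 possessions.
low_mp_strong = shrunk_estimate(20.0, n_possessions=100, lam=3000)
low_mp_weak = shrunk_estimate(20.0, n_possessions=100, lam=10)

print(early, full)                 # same player, more data, less shrinkage
print(low_mp_strong, low_mp_weak)  # weak regularization lets the noise through
```

Under these assumptions the same player's estimate rises as possessions accumulate even if his true impact never changes, which is one reading of why mid-season NPI numbers look compressed.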

Crow · Post by **Crow** » Mon Jan 04, 2016 8:20 pm

Just one case, but Westbrook's estimates somewhat confuse / concern me. By NPI RAPM his estimates are virtually unchanged, just going from plus 2.9 last season to 3.2 this season. But RPM jumps from plus 7 to 11. NPI RAPM and RPM differ by 8 pts this season, and it being early in the season may affect that, but there was still a fairly large 4 pt difference for the full-season numbers last season. His prior is probably more helpful this season compared to last and his box score is better, but is his team impact better this year? If it is, why isn't the NPI RAPM significantly better? If it is the small sample, does that imply we should expect to see Westbrook's NPI RAPM go up as this season progresses? If it does, will his RPM go even higher too?

More thinking / checking to do.

J.E. · Post by **J.E.** » Mon Jan 04, 2016 10:58 pm

Somewhat relevant: It may be important to note that the extremely good/bad players are those that NPI RAPM probably has the most trouble getting right

In a way, it doesn't expect players to be "that good", because, *generally*, they aren't. But sometimes, a player really is that good

APBRmetrics
