How do you adjust multi-year RAPM for age?

azuko
Posts: 5
Joined: Mon Aug 02, 2021 1:01 am

How do you adjust multi-year RAPM for age?

Post by azuko »

I've seen examples of age-adjusted RAPM such as this 14 year RAPM dataset but I don't understand how it's calculated.

I understand how to calculate vanilla RAPM, but how do you factor in age adjustments if it's a multi-year regression? I just don't get it at all tbh
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

Re: How do you adjust multi-year RAPM for age?

Post by DSMok1 »

It's all in the pre-processing and post-processing. Essentially, each observation is adjusted from the player's actual age to his age-27 (prime) level. So if the player is 20, then several points/100 possessions would be added to the observation to shift that player to age 27. This is done for all 10 players on the court.

That would yield "Age 27" ratings for all players from the RAPM.

Then, in post-processing, the age adjustment is subtracted back out, to yield the RAPM's best estimate of what that player actually did, averaged over the entire RAPM period.
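
A minimal sketch of the pre- and post-processing described above. The aging curve here is a made-up parabola, and the function names, the home-minus-away margin convention, and the possession weighting in the last step are assumptions for illustration, not the method behind any published dataset.

```python
# Hypothetical aging curve: expected rating (points per 100 possessions)
# relative to the age-27 peak. The coefficient is an illustrative assumption.
def aging_curve(age):
    """Expected rating at `age` minus expected rating at 27 (0.0 at 27)."""
    return -0.05 * (age - 27) ** 2


def shift_to_27(age):
    """Points/100 to add so a player of `age` looks like his age-27 self."""
    return -aging_curve(age)


def preprocess_stint(margin_per_100, home_ages, away_ages):
    """Adjust one stint's home-minus-away margin as if all 10 players were 27."""
    home_shift = sum(shift_to_27(a) for a in home_ages)
    away_shift = sum(shift_to_27(a) for a in away_ages)
    # Boosting the home five raises the margin; boosting the away five lowers it.
    return margin_per_100 + home_shift - away_shift


def postprocess_rating(age27_rating, ages_and_possessions):
    """Turn an "Age 27" rating back into the player's average actual level.

    ages_and_possessions: list of (age, possessions played at that age).
    """
    total_poss = sum(p for _, p in ages_and_possessions)
    avg_shift = sum(shift_to_27(a) * p for a, p in ages_and_possessions) / total_poss
    return age27_rating - avg_shift
```

The RAPM regression itself would run on the outputs of `preprocess_stint`; `postprocess_rating` is then applied to each player's coefficient.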
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
azuko
Posts: 5
Joined: Mon Aug 02, 2021 1:01 am

Re: How do you adjust multi-year RAPM for age?

Post by azuko »

Sorry I’m still a bit confused. What do you mean by observation? Like a stint? If for one stint you have the 10 players with a +/- of +5, for example, you’re changing that +/- based on the age of each player on the court?

And how exactly would you subtract it out? Because it would include a variety of different ages so how would that calculation work?

Sorry for all the questions, thanks for the help!
azuko
Posts: 5
Joined: Mon Aug 02, 2021 1:01 am

Re: How do you adjust multi-year RAPM for age?

Post by azuko »

viewtopic.php?p=33133&sid=53d4e7c394d1f ... 666#p33133

I just found this old explanation. Is this the same process, except with the prior = Age 27 value - Age X value?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

Re: How do you adjust multi-year RAPM for age?

Post by DSMok1 »

Yes, that is the same process.

Now, since players age differently (and may have injuries, etc.), I prefer a Bayesian RAPM approach. I like to use something like playing time (and quality of team) as a prior, rather than assuming the same aging curve for everyone.

And there are more sophisticated approaches than that.

To be clear: the RAPM generates exactly one value for the player for the whole period. Assigning that value to individual years within the period is highly inaccurate, even if you use the aging curve or Bayesian prior to try to split it up.
azuko
Posts: 5
Joined: Mon Aug 02, 2021 1:01 am

Re: How do you adjust multi-year RAPM for age?

Post by azuko »

Thanks!

Just to be clear, when you say playing time and quality of team, are those two separate priors or one that is a function of both variables?

And I don't really understand your last sentence. Are you saying that using the aging curve or Bayesian prior is ineffective for single seasons and best fit for a multi-year sample? I tried calculating something like ~20 yr NPI RAPM but some of the results seemed off (even relative to RAPM results I found online, like the one I originally linked) which is why I thought I should incorporate some type of adjustment.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine
Contact:

Re: How do you adjust multi-year RAPM for age?

Post by DSMok1 »

When I built the prior from playing time and quality of team, it was a single function of both variables. The slope depends on team quality: low-minutes players are similar on bad and good teams, but the line relating player quality to playing time is steeper on good teams than on bad teams. I.e., a 0 MPG player is the same on bad and good teams, but a 36 MPG player is much better on a good team than on a bad one.
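
One way to picture a prior of that shape, as a sketch only: a shared intercept plus a per-minute slope that grows with team quality. The function name, the use of team net rating as the quality measure, and every coefficient below are placeholder assumptions, not fitted values.

```python
def prior_rating(mpg, team_quality, base=-2.0, slope=0.08, quality_slope=0.005):
    """Hypothetical prior (points/100) as one function of both variables.

    mpg: minutes per game.
    team_quality: team net rating in points/100; positive means a good team.
    """
    # Same intercept for everyone: a 0 MPG player gets `base` on any team.
    # The per-minute slope increases with team quality, so the gap between
    # good-team and bad-team players widens as minutes go up.
    return base + (slope + quality_slope * team_quality) * mpg


# A 0 MPG player gets the same prior on a +8 team and a -8 team.
print(prior_rating(0, 8.0), prior_rating(0, -8.0))
# A 36 MPG player gets a higher prior on the +8 team than on the -8 team.
print(prior_rating(36, 8.0) > prior_rating(36, -8.0))
```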

My last sentence: with a very long-term RAPM, your output is still just one single value for the player, e.g. Kobe = +3.0. That number says very little about any individual season within the long-term RAPM. He wasn't +3.0 for the whole term; he averaged +3.0 (in the eyes of the RAPM) over that term.

The reason using a prior or an age adjustment is so important in long-term RAPM: if the RAPM has a lot of evidence that LeBron is a +7.0 player, that will really skew the RAPM's perception of his teammates in his rookie season, when he was not actually +7.0. To explain the results those lineups actually got, it will conclude that his teammates were really, really terrible.
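
A toy version of that arithmetic, assuming a lineup's margin is simply the sum of its players' ratings and ignoring opponents. The lineup margin and the +2.0 rookie-year value are hypothetical numbers chosen only to show the effect.

```python
def implied_teammates(lineup_margin, star_rating):
    """Combined rating the other four players must have under an additive model."""
    return lineup_margin - star_rating


rookie_lineup_margin = -1.0  # hypothetical observed margin, points/100

# Long-run +7.0 estimate applied to the rookie season: teammates look terrible.
print(implied_teammates(rookie_lineup_margin, 7.0))  # -8.0
# Hypothetical age adjustment that treats the rookie as +2.0 that year.
print(implied_teammates(rookie_lineup_margin, 2.0))  # -3.0
```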

A final reason for using the prior: vanilla RAPM pulls low-minutes players toward 0. That makes no sense, and it ends up skewing all of the results. Overrating the low-minutes cadre biases everyone else's results downward, especially the moderate-minutes players.
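
For illustration, here is one common way to make a ridge regression shrink toward a prior instead of toward 0: fit the part of the data the prior does not explain, then add the prior back. This is a generic sketch using NumPy, not necessarily how any particular Bayesian RAPM is built; `ridge_with_prior` and its arguments are names made up for this example.

```python
import numpy as np


def ridge_with_prior(X, y, prior, lam):
    """Ridge regression that shrinks coefficients toward `prior`, not 0.

    Minimizes ||y - X b||^2 + lam * ||b - prior||^2.
    X: stints x players matrix (+1 home, -1 away, 0 off the court).
    y: stint margins in points/100.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    prior = np.asarray(prior, dtype=float)
    # Residual after the prior has explained what it can.
    resid = y - X @ prior
    # Closed form: b = prior + (X'X + lam*I)^-1 X'(y - X prior).
    A = X.T @ X + lam * np.eye(X.shape[1])
    return prior + np.linalg.solve(A, X.T @ resid)
```

With `prior` set to all zeros this reduces to vanilla RAPM. With a large `lam`, or for a player with very little data, the estimate stays close to that player's prior rather than drifting to 0.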
azuko
Posts: 5
Joined: Mon Aug 02, 2021 1:01 am

Re: How do you adjust multi-year RAPM for age?

Post by azuko »

Awesome, I understand now. Thank you for your help!