APBRmetrics

Posted: **Fri Mar 14, 2025 3:35 pm**

Broadly speaking the model reflects, in my opinion, the biases of the underlying era it is built on. I build on a broad era for my day-to-day use, but with a bias towards more recent data. This is because I perceive, and I think it is trivial to observe, that player assessment, or perhaps assessment of the underlying dynamics of basketball and which players suit them, has improved over time. The market efficiency of minutes allocation in the NBA has, I think, improved.

On an individual basis the model will tend to overpredict minutes for players who dominate their teammates in terms of on-paper production, and vice versa. Due to being fit relative to the team around them, however, this issue is not so large as it would be were you not adjusting for teammates.

Posted: **Fri Mar 14, 2025 4:30 pm**

https://www.apbr.org/metrics/viewtopic.php?t=10160
v-zero:
How do your correlations compare to these?

Posted: **Fri Mar 14, 2025 5:50 pm**

Hi everyone,

Sorry for the 2 week delay in checking this thread.

I was spending last week slandering Jokic's 30/20/20 since the statline didn't show the defense.

I'll catch up on all the replies and respond accordingly over the weekend.

To rephrase my argument, I don't think RAPM is complete garbage, just that it doesn't meet the standards for ranking players.

- Essentially any stat that tries to predict +/- or RAPM is destined for failure since the inherent noise makes it so that the upper bound on how much of the variance can be predicted is something like 50%.

- *Correction: 70%, not 50%, for predicting multi-year RAPM. I was going off memory, but I assume the 50% figure will hold if you're looking at single-year rankings, which are necessary/relevant. Single game is also useful/important.
--
see the tweet for the full context: https://x.com/sportsandmath1/status/190 ... 82117?s=46

Generally (note I'm making assumptions) Dean Oliver agrees with my sentiment about RAPM and his Net PTS stat follows the general idea of ignoring RAPM for a box score centric approach that uses the +/- for context.

The main difference (I assume) is that his coefficients are handcrafted from his decades of experience whereas mine are based on linear regression to fit the opinion of 10-15 people. I'm not qualified to tune the coefficients (when I tried before I posted anything online c. 2019 the results were mixed)
--
Since there's an upper limit on how well RAPM or +/- can be predicted (let's say ~50% for single season, ~70% for multiseason), we can say that RAPM or +/- data in general is made up of ~70% data that's useful for player rankings and ~30% "inherent noise" that's related to lineup effects, team context, Off-Court +/-, or randomness. My initial post calling it all Off-Court +/- was reductive.

When EPM, LEBRON, DARKO, BPM, xRAPM, etc. are trying to predict (long term) RAPM or +/-, the RAPM serves as a useful guide to get up to ~70% of the answer (long term; ~50% for a single season), but the other 30-50% of the answer simply doesn't exist in +/- data. Trying to predict player rankings better than 70% via RAPM can be counterproductive, since these metrics are then fitting to data that is unrelated to player rankings.
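The ceiling argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 70/30 split is the multi-season figure quoted in the post, and the Gaussian `skill` and `noise` terms are hypothetical stand-ins, not real RAPM data. It shows that if a target is 70% signal and 30% independent noise, even a predictor that knows the signal exactly tops out at an R^2 of roughly 0.7.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by y_pred."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def simulate_ceiling(signal_share=0.7, n_players=200_000, seed=0):
    """R^2 of a perfect skill estimate against a noisy target.

    Assumption: target = skill + noise, with the two parts independent and
    their variances in the ratio signal_share : (1 - signal_share).
    """
    rng = np.random.default_rng(seed)
    skill = rng.normal(0.0, np.sqrt(signal_share), n_players)
    noise = rng.normal(0.0, np.sqrt(1.0 - signal_share), n_players)
    target = skill + noise
    return r_squared(target, skill)

ceiling_multi_season = simulate_ceiling(0.7)   # assumed multi-season split
ceiling_single_season = simulate_ceiling(0.5)  # assumed single-season split
print(round(ceiling_multi_season, 2), round(ceiling_single_season, 2))
```

A model scoring above that ceiling on the noisy target would have to be fitting the noise term, which is the concern raised in the post.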

---------

Sorry if this isn't as polished as I hoped - I had this written out a bit better but the draft got deleted while switching tabs so I just did my best to rewrite what I had earlier.

Anyways, I'm looking forward to reading all the replies!

Posted: **Mon Mar 17, 2025 12:20 am**

TeemoTeejay wrote: ↑Sun Mar 02, 2025 12:20 pm
sportsandmath1 wrote: ↑Sat Mar 01, 2025 2:49 pm -
what essentially is a linear model that just replicates how human beings see a box score lol

Ok but at least it succeeds at something (btw I've used them - converted to APM) to predict future NBA Games with a log loss matching Vegas (~0.59) solely based on the players who played 5+ minutes (no need for projecting the minutes allocation).
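For readers unfamiliar with the benchmark mentioned above, here is a minimal sketch of how log loss on game predictions is computed. The probabilities and outcomes are made up for illustration; nothing here reproduces the poster's model or Vegas lines.

```python
import math

def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood of binary outcomes.

    probs: predicted probability that the home team wins each game.
    outcomes: 1 if the home team won, 0 otherwise.
    """
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# Hypothetical four-game slate.
outcomes = [1, 1, 0, 0]
coin_flip = log_loss([0.5, 0.5, 0.5, 0.5], outcomes)  # always 50/50
informed = log_loss([0.7, 0.7, 0.3, 0.7], outcomes)   # three good calls, one miss
print(round(coin_flip, 3), round(informed, 3))
```

Always predicting 50% gives ln 2, about 0.693, so a figure near 0.59 means the forecasts carry real information beyond a coin flip. Lower is better.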

All I see from stats aiming to align with (long term) RAPM are unnecessarily complicated methodologies with results that are subpar (compared to human judgement or Vegas).

Player Ranking is a field where we should expect to explain 90% of the variance very quickly. In contrast, any RAPM approach is inherently limited to explaining 70% of the variance after having 3 years of data. That's why I'm frustrated and think the current work out there is "subpar" when a simple formula (not a black box) just is better.

Fundamentally, basketball is largely linear (I do use a nonlinear function to convert Score% to APM which can be easily added to predict games or measure team strength).

This doesn't seem that productive of a thread after skimming the 60+ messages since no one's engaging with the core point that the goal should shift from predicting RAPM to OOS games.

The season-long contest is somewhat beside the point since 9 months is a long span for injuries, trades, and improvement; we need to shift to a game-by-game approach.

The way to achieve that is to improve upon the minimal amount of work I did as a hobbyist (still far better results, unfortunately): switch from per possession to per game and from nonlinear to linear, and include +/- as part of the box score, not separate from it. Every box score stat is context dependent; +/- is not unique in that.

Please ignore my tone and focus on the actual points since apparently the tone is causing y'all to revert to low level thinking where you view it as adversarial instead of constructive, so there's not much to glean when no new points are being made.

Posted: **Mon Mar 17, 2025 12:24 am**

TeemoTeejay wrote: ↑Sun Mar 09, 2025 7:55 pm
Mike G wrote: ↑Sun Mar 09, 2025 7:18 pm

I’m not against the idea that it being an OT game might have inflated the totals a tad compared to what you’d expect, but the dude said Jalen Williams having 16-6-6 was the best performance that night because okc win by a lot or something it was very funny

Nah man you purposely being dense. It's just accounting for defense. Jokic gave up 119 points on 67% TS, Jaylin Williams gave up 49 points on 32% TS. That's 70 points and a 35% gap in opponent TS that doesn't show up in your standard box score.

Dean Oliver's net pts (via ESPN) has the same conclusion, actually had Jokic 6th.
https://espnanalytics.com/nba-daily-sum ... e=20250307

Note that Jokic would be first if you combine the stats to measure total production (matching what you expect)
https://x.com/SportsandMath1/status/1898838621408383478

However, the inflation adjustment simply accounts for OT/pace/defense/league environment, which is necessary (it ensures all the scores in a game sum to a constant 200% rather than 290 in a 149-141 game and 175 in a 107-89 game) but counterintuitive to people who just skim box scores.
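One simple way to picture the adjustment described above is a proportional rescaling so that every game's player scores sum to 200. This is an assumption about the mechanics, not the poster's actual formula, and the player names and raw scores below are hypothetical.

```python
def normalize_game(raw_scores, target_total=200.0):
    """Rescale one game's player scores so they sum to target_total.

    Assumption: a single multiplicative factor per game; the real
    adjustment may treat pace, OT, and defense separately.
    """
    total = sum(raw_scores.values())
    if total <= 0:
        raise ValueError("raw scores must sum to a positive number")
    scale = target_total / total
    return {player: score * scale for player, score in raw_scores.items()}

# Hypothetical raw totals: 290 in a high-scoring OT game, 175 in a low-scoring one.
high_scoring = {"Player A": 60.0, "Player B": 50.0, "Everyone else": 180.0}
low_scoring = {"Player C": 40.0, "Player D": 30.0, "Everyone else": 105.0}

adj_high = normalize_game(high_scoring)
adj_low = normalize_game(low_scoring)
print(round(adj_high["Player A"], 1), round(adj_low["Player C"], 1))
```

Under this sketch a big raw line in an inflated game is scaled down and a modest line in a grind-it-out game is scaled up, which is why the adjusted ranking can differ from a box-score skim.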

Posted: **Mon Mar 17, 2025 12:27 am**

I’m not against the idea that RAPM might not be an ideal target, I think there are some interesting ideas there, nor am I against the idea that the statistical achievement may be slightly overinflated because it was a high-scoring OT performance, but doubling down on Jaylin Williams having a better game is legit absolute comedy lol

Like fundamentally the issue isn’t that the stat says X therefore it’s flawed. My take on it remains the same: while it may work better as a raw player ranking, it simply isn’t something that provides any meaningful insight. The issue is much more that it’s fine if some weird results happen on the game level, but that doesn’t mean the statistic is reality lol. It should be pretty obvious, for a stat that relies on something like on-court plus-minus, that the idea that it’s an absolute measure of reality at single-game sample sizes is simply nonsense, and that’s kind of the crux of my issue here.

I think there is an interesting discussion to have over the caveats of RAPM and RAPM as a target metric, I do think on-court rating is far more important than off-court, and v-zero has some pretty interesting thoughts in this thread. But the original post’s alternative of only using on-court plus-minus is nonsensical as well: the idea that you can parse individual defensive performance at a single-game sample size, at the confidence level you’ve shown in that Twitter exchange, largely from on-court plus-minus is kind of absurd

Posted: **Mon Mar 17, 2025 1:12 am**

v-zero wrote: ↑Tue Mar 11, 2025 11:44 am Without wading in to all this discussion of luck and other such things, I feel the need to point out that the initial post in this thread smacks of a failure to understand what RAPM is. Statements such as the idea that RAPM is regularised in order to predict out of sample RAPM are entirely incorrect on their face. RAPM is regularised to best predict out of sample stint data using cross-validation, which isn't the underlying player RAPM values, it is the actual results of the stints. Anybody with even a cursory knowledge and understanding of the space understands this fact trivially.
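As a rough illustration of the distinction drawn above, the sketch below fits a ridge regression on stint rows and picks the penalty by how well held-out stint margins are predicted, not by how well held-out player ratings are reproduced. Everything here is a toy assumption: the stints are synthetic, the noise level and penalty grid are arbitrary, and real RAPM implementations typically add possession weights, offense/defense splits, and more.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^-1 X'y."""
    n_players = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_players), X.T @ y)

def cv_stint_error(X, y, lam, n_folds=5, seed=0):
    """Mean squared error on held-out STINT margins (not on ratings)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(y)), held_out)
        beta = fit_ridge(X[train], y[train], lam)
        errors.append(np.mean((y[held_out] - X[held_out] @ beta) ** 2))
    return float(np.mean(errors))

def make_synthetic_stints(n_players=30, n_stints=2000, seed=1):
    """Toy data: +1 for the five 'home' players, -1 for the five 'away' players."""
    rng = np.random.default_rng(seed)
    true_impact = rng.normal(0.0, 2.0, n_players)
    X = np.zeros((n_stints, n_players))
    for row in X:  # each row is a view, so assignments modify X in place
        on_court = rng.choice(n_players, size=10, replace=False)
        row[on_court[:5]] = 1.0
        row[on_court[5:]] = -1.0
    y = X @ true_impact + rng.normal(0.0, 12.0, n_stints)  # noisy stint margins
    return X, y, true_impact

X, y, true_impact = make_synthetic_stints()
lambdas = [1.0, 10.0, 100.0, 1000.0, 10000.0]
cv_errors = {lam: cv_stint_error(X, y, lam) for lam in lambdas}
best_lam = min(cv_errors, key=cv_errors.get)
rapm = fit_ridge(X, y, best_lam)
print(best_lam, round(float(np.corrcoef(rapm, true_impact)[0, 1]), 2))
```

Even in this toy setup most of the stint-level variance is noise by construction, so held-out stint error stays high while the recovered ratings can still track the simulated impacts reasonably well.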

Now, I am not a huge RAPM fan, I have made that pretty obvious. I believe it is overused as the dependent variable in order to try to come up with AIO metrics.

Now, what does have merit but isn't how I'd do it, is this idea of a player metric which is pointing at something else, whether that be a player ranking or otherwise.

I posited an idea along these lines around a decade ago. I got pretty roundly ignored, but the crux of the idea was simple:

Use teammate-marginal player per100 box score stats to predict the teammate-marginal percentage of game time a player is allocated by a coach.

Why is this maybe a good idea? Well coaches and their staff inarguably see more of what their players are, than we ever will. Coaches must allocate minutes in an attempt to maximise team success, and so this marginal allocation of time would seem to be very likely to correlate with actual ability, or at least contextual impact.

Now, there are some issues here: what about tanking teams? What about poorly managed teams? What about the fact that the box score is missing a lot of the game?

Well, for that last part, raw plus-minus is available for use within the teammate-marginal metric, and in the minds of a coach and their staff it is likely a better refined and understood thing, even if only in the abstract, than it is on its own sat there in the box score, propping up players riding the coattails of others.

As for poorly managed teams? Much like an ensemble metric, the coefficients found by such a regression (or whatever statistical puzzle mapper you choose to use) are taking the ensemble over time of a great many opinions, of a great many teammate combinations, and so we can hope they represent bias only in the sense that coaches, on average, likely do have a bias of some sort, toward a certain type of player. Over the span of time we can hope that this averages out tidily enough.

Finally, tanking teams? Well, one simple solution in all of this is to look only at the top-half of the league to perform this analysis. Where is the real competition for playing time? Where are the most difficult decisions being made? In the front offices where winning now matters.

So, if you want to build a metric along the idealised lines of the original post, I would strongly seek to persuade you to have a look in that dimly lit corner.
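The minutes-allocation idea quoted above can be sketched roughly as follows. This is one possible reading, not v-zero's actual method: "teammate-marginal" is interpreted here as subtracting the team average from each player's value, the model is plain least squares, and the data are synthetic. The function names are hypothetical.

```python
import numpy as np

def teammate_marginal(values, team_ids):
    """Subtract each team's average from its players' values (assumed definition)."""
    values = np.asarray(values, dtype=float)
    team_ids = np.asarray(team_ids)
    out = np.empty_like(values)
    for team in np.unique(team_ids):
        mask = team_ids == team
        out[mask] = values[mask] - values[mask].mean(axis=0)
    return out

def fit_minutes_model(per100_stats, minutes_share, team_ids):
    """Regress marginal minutes share on marginal per-100 box score stats."""
    X = teammate_marginal(per100_stats, team_ids)
    y = teammate_marginal(minutes_share, team_ids)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic league: 20 teams of 10 players, three made-up per-100 stats.
rng = np.random.default_rng(0)
team_ids = np.repeat(np.arange(20), 10)
per100_stats = rng.normal(0.0, 1.0, size=(200, 3))
true_coef = np.array([0.02, 0.01, -0.015])  # hypothetical coach preferences
minutes_share = 0.5 + per100_stats @ true_coef + rng.normal(0.0, 0.01, 200)

coef = fit_minutes_model(per100_stats, minutes_share, team_ids)
print(np.round(coef, 3))
```

Restricting `team_ids` to top-half teams before fitting, as the quoted post suggests for handling tanking, would be a one-line filter on the input arrays.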

+/- and RAPM are not much different. You're missing the forest (the larger point that solely relying on +/- data only gets you 50-70% of the way) for the trees (a quick dunk to hinder my credibility). And on that point I was referring to how Engelmann calculated the confidence interval for RAPM (this relates to my point that RAPM being N^3 computations and biased with no error bars makes it less useful than +/-).

If RAPM is based on the ability to predict +/- for stints (impossible to have an R^2 above maybe 10 or 20%), it really is fitting to the 80% of inherent variance in stint-level plus-minus, aka luck.

That's why I suggested predicting actual games using RAPM as just one piece of the puzzle, since the box score is definitely useful (+/- is part of the box score btw). This gives a clear baseline: Vegas. Focusing on stint data just doubles down on the collinearity issues by asking for 10 variables to predict the plus-minus of a (5?) minute stint, which is largely noise and needs to be adjusted for the rubber band effect and many more issues.

So maybe training RAPM on OOS RAPM would be better since it has less noise, but doing so would be too computationally expensive and just would prove my point that RAPM is half related to player rankings and half about lineup effects, team context and plagued with noise and collinearity issues.

It's not idealized. They simply work at the goal: predicting >90% of the variance in player rankings and predicting NBA games on par with Vegas.

You need to focus on being constructive (that is ask for clarifications rather than assume I'm an idiot because I have a different conclusion than you) instead of destructive.

The second half about AIO (what does that stand for? Ok, I guess it's all-in-one), teammate marginal? Coach minute allocation? Only half of the teams? Sounds like word salad to me, man. If you're gonna hate (you didn't, just showed that you misunderstood) at least propose something that makes sense.

Ultimately, just realize, it really isn't that hard. We need to be constructive and build upon the simple but strong baseline I proposed instead of working on things that are unnecessarily complex for subpar results.

Posted: **Mon Mar 17, 2025 1:21 am**

sportsandmath1 wrote: ↑Mon Mar 17, 2025 12:20 am
TeemoTeejay wrote: ↑Sun Mar 02, 2025 12:20 pm
sportsandmath1 wrote: ↑Sat Mar 01, 2025 2:49 pm -
what essentially is a linear model that just replicates how human beings see a box score lol
-

But the issue isn’t necessarily with the points you made

I broadly agree with these statements
- rapm isn’t a perfect target to test on
I do generally agree with this, if the only way you’re measuring, let’s say a prior, is by RAPM fit, you likely miss information
- on court rating is more important than off court rating
Yes, I get what you’re saying here and while I wouldn’t go as extreme as you did I do agree that there’s much truth to that

The entire “I made this metric that does X and Y and it’s better than anything out there!” is where it gets much murkier. Fundamentally, what you did didn’t provide true insight. I agree that over a 9-month sample many confounding factors can occur, so something doing well or not isn’t really a game changer. But it’s far more that, without actually showing it’s good beyond simply being a model that mimics how a human being perceives a box score and how important team winning is, it simply isn’t all too impressive or notable, in my opinion

There’s also a clear misunderstanding of data here, because multiple authors of those metrics have explicitly said they’re not meant to be player rankings. But the main issue with the weird tone you’ve kind of had is that, while I don’t broadly disagree with some of the points you’ve made, I don’t think what you did is particularly impressive either, nor deserving of much of a look at all

Posted: **Mon Mar 17, 2025 1:23 am**

AIO I presume means All In One metric.

Lineup effects and team context as separate from player ratings is a view but not the only perspective. Players help create lineup effects and team context at least to some degree, imo, with their affect.

Your perspective comes thru as reasonable. In the heat of first take, I might have been too defensive of RAPM and not fully open to the new contender.

Boxscore prior-informed RAPM and statistical plus minus + raw plus minus are very different in approach but both recognize the need for awareness of the 2 dimensions of data.

I'd take an explicit blend of the two metrics / approaches as preferable to just one.

Lead on your method.

RAPM and variations will be there for those inclined to produce and use them, alternatively or in addition to your direction. I don't use RAPM to the exclusion of other metrics or approaches, but it might sometimes be the first or only metric mentioned at a particular moment as a quick draw with some potential insights beyond the boxscore. Especially knowing how much most statistical metrics leave out or perhaps twist in a not completely settled way.

The level of engagement can vary moment to moment, year to year.

All you can do is your best and stand ready to discuss as opportunities present.

Posted: **Mon Mar 17, 2025 1:34 am**

TeemoTeejay wrote: ↑Mon Mar 17, 2025 12:27 am it should be pretty obvious for a stat that relies on something like on court plus minus that the idea that it’s an absolute measure of reality on a single game sample sizes is simply nonsense, and that’s kind of the crux of my issue here, I think there is an interesting discussion to have over the caveats of rapm and rapm as a target metric, and I do think on court rating is far more important than off the court, and vzero has some pretty interesting thoughts in this thread, but like the original post’s alternative to only use on court plus minus is nonsensical as well, the idea that you can parse through individual defensive performance at a single game sample size, at the confidence level you’ve done in that Twitter exchange largely from on court plus minus is kind of absurd

I think you're missing my point.

Single game +/- just provides necessary context

I had a whole thread (https://x.com/SportsandMath1/status/1821000888691089833) on it that I referenced earlier (Bruno Caboclo 1st in GameScore in the USA vs BRA blowout, 8th after including +/-). The short of it is that a +/- of 20 vs a +/- of -20, cet. par., is definitely useful information, especially in the context of linear regression.

The player with 30 points with +20 is like 2016 Curry or 2025 Shai, winning in a blowout only playing 3Q. They need a boost to their statline.

The player with 30 points with -20 would be a garbage time merchant like Cam Thomas, Jordan Poole, or Kyle Kuzma and need a negative term added to their statline.

even if you don't buy that (https://x.com/SportsandMath1/status/1824482334252601455) all I'm saying is that +/- is the most important stat outside of scoring since it captures everything that happens outside the traditional box score. Ignoring it would be a grave mistake.

Including +/- (remember the coefficient is just 0.34) in player rankings also ensures that adding up the ratings of all individual players correlates better with the result since the sum of +/- is just 5x the MOV (https://x.com/SportsandMath1/status/1737606014957158846 - sum of Hollinger GameScore or PIR (Euroleague) correlates 0.956 and 0.92 respectively with MOV whereas mine correlates 0.98 due to the inclusion of +/-)
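The bookkeeping identity mentioned above (a team's player +/- values sum to 5x its margin) is easy to check with a toy example. The stints and player labels below are hypothetical; only the 0.34 coefficient comes from the post, and the rest of the rating formula is not reproduced here.

```python
PLUS_MINUS_COEF = 0.34  # coefficient quoted in the post

def player_plus_minus(stints):
    """Sum each player's on-court margin over the stints they played."""
    totals = {}
    for lineup, margin in stints:
        for player in lineup:
            totals[player] = totals.get(player, 0) + margin
    return totals

# Hypothetical stints for one team: (five players on court, team margin in that stint).
stints = [
    (("A", "B", "C", "D", "E"), +6),
    (("A", "B", "C", "F", "G"), -2),
    (("F", "G", "H", "D", "E"), +5),
]

plus_minus = player_plus_minus(stints)
margin_of_victory = sum(margin for _, margin in stints)
plus_minus_total = sum(plus_minus.values())

# Each stint's margin is credited to exactly five players,
# so the team total is always 5x the final margin.
team_bonus = PLUS_MINUS_COEF * plus_minus_total
print(plus_minus_total, margin_of_victory, round(team_bonus, 2))
```

Because of this identity, adding 0.34 x (+/-) to every player's rating adds 0.34 x 5 = 1.7 x MOV to the team's summed rating, which is one mechanical reason the summed ratings would correlate more strongly with the final margin.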

Posted: **Mon Mar 17, 2025 1:43 am**

sportsandmath1 wrote: ↑Mon Mar 17, 2025 1:34 am
TeemoTeejay wrote: ↑Mon Mar 17, 2025 12:27 am league) correlates 0.956 and 0.92 respectively with MOV whereas mine correlates 0.98 due to the inclusion of +/-)

So like, you go into these conversations with a weird a priori belief that no one understands what you are saying when what you are saying is extremely rudimentary

I don’t have much issue with saying under specific goals and constraints raw +/- has value

It’s an absurd jump to argue that you can clearly quantify that it objectively means Jokic was the 5th best player and was far worse at basketball that day than Jaylin williams

Like no one is particularly missing the point of what you’re saying. You aren’t really saying anything that isn’t common sense or particularly groundbreaking, and there is truth to some points, but it’s like you’re playing a game of golf where you hit a few birdies and on some holes ur like +72 lmao. You have some interesting (but not novel) points, but the absolute pure absurdity of believing in raw +/- on a SINGLE GAME LEVEL so hard that you are dying on the hill that Jokic was the 5th to 6th best player that night he had 30-20-20, and Jaylin Williams was the best player of the night solely because it captures non box score impact and therefore the two of them (Jaylen and joe) must have had historic defensive performances, is absurd

Posted: **Mon Mar 17, 2025 1:54 am**

TeemoTeejay wrote: ↑Mon Mar 17, 2025 1:43 am
sportsandmath1 wrote: ↑Mon Mar 17, 2025 1:34 am
TeemoTeejay wrote: ↑Mon Mar 17, 2025 12:27 am league) correlates 0.956 and 0.92 respectively with MOV whereas mine correlates 0.98 due to the inclusion of +/-)

So like, you go into these conversations with a weird apriori belief that no one understands what you are saying when what you are saying is extremely rudimentary

I don’t have much issue with saying under specific goals and constraints raw +/- has value

It’s an absurd jump to argue that you can clearly quantify that it objectively means Jokic was the 5th best player and was far worse at basketball that day than Jaylin williams

Like no one is particularly missing the point of what you’re saying, you aren’t really saying anything that isn’t common sense or particularly groundbreaking, and there is truth to some points, but it’s like your playing a game of golf you hit a few birdies and on some holes ur like +72 lmao, you have some interesting (but not novel) points, but the absolute pure absurdity of believing in raw +/- on a SINGLE GAME LEVEL so hard that you are dying on the hill that Jokic was the 5th to 6th best player that night solely because it captures non box score impact, is absurd

OK if it's so rudimentary why can't we build upon it? What I want is for there to be mainstream metrics that produce better player rankings than what I post on Twitter weekly/monthly.

https://x.com/SportsandMath1/status/1901021190451614087

Stop bringing up the Jaylin Williams > Jokic in one game sample if you didn't see that game. Overall, Jokic is 1st or 2nd and JayWill is roughly 180th. At least my measure has a std error per game of +/- 8.5%, so when the overall rankings use the optimal weighted average (every game, accounting for recency since talent changes over time) with an effective sample size of 40 games, the error (due to randomness) in this measure is 1.4% (for reference, the gap between Jokic/Shai and #3 Giannis is ~2x the std error). That's what my goal is - player rankings that work on the correct time scales and have the right amount of uncertainty. RAPM fails on both those measures.
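The arithmetic behind those figures can be sketched as below, assuming independent games with the same per-game standard deviation. The effective sample size uses the standard Kish formula for a weighted mean; the exponential recency weights are a hypothetical choice, since the post does not specify the weighting scheme.

```python
import math

def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

def weighted_mean_std_error(per_game_sd, weights):
    """Std error of a weighted mean of independent, equal-variance games."""
    return per_game_sd / math.sqrt(effective_sample_size(weights))

# 40 equally weighted games with the quoted 8.5% per-game std error.
se_40 = weighted_mean_std_error(8.5, [1.0] * 40)

# Hypothetical recency weighting over an 82-game season (decay rate is made up).
recency_weights = [0.975 ** games_ago for games_ago in range(82)]
n_eff = effective_sample_size(recency_weights)
se_recency = weighted_mean_std_error(8.5, recency_weights)
print(round(se_40, 2), round(n_eff, 1), round(se_recency, 2))
```

With 40 equally weighted games the result is 8.5 / sqrt(40), a little over 1.3%, which is in line with the roughly 1.4% quoted above.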

Quit being a hater and let me know if there's a viable alternative to the player rankings I post (I haven't seen any). Therefore there needs to be more work done on improving player rankings

Posted: **Mon Mar 17, 2025 2:16 am**

sportsandmath1 wrote: ↑Mon Mar 17, 2025 1:54 am A

I’m only bringing that game up because you keep doubling down on it for no reason and it’s very funny

The issue is you’re kind of talking apples and oranges here, no one is confusing RAPM or AIO metrics as player rankings, because they aren’t player rankings. Even as someone that generally finds them a tad overhyped, even I would say that lol

Posted: **Tue Mar 18, 2025 1:48 am**

Feel free to enter this contest next season, if inclined, for a public test in context of performance of many other approaches.

A "pure RAPM" entry could be constructed and entered if any wanted to. Perhaps with a similar minutes set as one or more of the other entries.

Hybrid RAPM metrics have been included and may be added to this year's run.

I don't recall if pure RAPM has been entered previously.

Posted: **Tue Mar 18, 2025 2:41 am**

Any comments about how the stat coefficients in your metric compared with BPM or gmBPM?

https://www.basketball-reference.com/about/bpm2.html

How much of the average differences come from differences in the stat coefficients vs. your inclusion of raw plus minus data?

What would BPM or gmBPM look like with the addition of raw plus minus as you do it?

Which of these approaches retrospectively explains better? Would predict next season better, even if not designed to do so?

Any interest in applying your metric to college data?

Any active consideration of training RAPM on OOS RAPM for comparison to conventional RAPM or use in your metric, instead of raw +/- or somehow mated to it? How much do the two types of plus minus vary on average on their own and in impacts on your estimates if both were tracked?

Anybody else currently adding RAPM to a statistical metric instead of the other way around?

Any comments looking back at Dan Rosenbaum's original work and how "overall plus/minus ratings" were done at that time?

https://www.82games.com/comm30.htm

Anybody have comments about what would / might be gained and / or lost by going back to that approach over pure RAPM or boxscore prior informed RAPM?

Constructive discussion re: RAPM