Constructive discussion re: RAPM

sportsandmath1 · Post by **sportsandmath1** » Sat Mar 01, 2025 2:49 pm

Thought I'd create an account here for more meaningful discussions.

I've laid out all my thoughts on RAPM here (https://x.com/sportsandmath1/status/189 ... 52605?s=46) and

@Crow/bballstrategy suggested to "take your case directly to one or more authors of RAPM or hybrids, if they are willing to spend the time (maybe not)"

I know I might come off as arrogant, but that's just the result of frustration with the current state of RAPM (and related measures). I'm not claiming to be an oracle, just airing my grievances and suggesting a better path that clearly hasn't been explored enough if I'm the only one in the area (as it appears).

I'd prefer responses from people who have worked extensively on RAPM (Jeremias Engelmann) or an SPM (e.g. RPM - Steve Ilardi, BPM - Daniel Myers, DARKO - Kostya Medvedovsky, LEBRON - Krishna Narsu, EPM - Taylor Snarr, etc.), so that we have a set of rankings using your methodology to reference. Ideally, if you have points of agreement with me, you can incorporate them in the next iteration of your player rankings.

I'm not going to rehash the entire Twitter discussion (you can check for yourself).

But here's a quick summary of the key points of contention I'd like to discuss on here:

Start of discussion:
1. https://x.com/sportsandmath1/status/189 ... 64322?s=46

In response to a recent article on ESPN by Zach Kram, with a headline suggesting LeBron is "unlucky" this season,

I noted that I don't put much credence in On/Off. I do value On-Court +/- (the 3rd most impactful stat in my player rankings, after 1. PTS (positive) and 2. FGA (negative)), but Off-Court +/- is largely noise that should be considered circumstantial evidence at best.

I quoted a tweet with my player rankings for this season (https://x.com/sportsandmath1/status/189 ... 90177?s=46) using the formula which measures % of production per game to rank players (quite well imho).

[/begin tangent]

Source for stat hierarchy (initially made as a response to the opposite critique that +/- is useless): https://x.com/sportsandmath1/status/182 ... 01455?s=46

Here's the thread where I discuss the importance of +/- and how it is the key box score stat missing from a stat like GameScore. It references how Bruno Caboclo was 1st in GameScore vs Team USA in a blowout loss, and how including +/- drops him to the 8th best player in the game: https://x.com/sportsandmath1/status/182 ... 89833?s=46

[/end of tangent]

As an example, I noted that A'ja Wilson won the 2024 WNBA MVP unanimously with an On-Off of -2.7. Her On-Court +/- was +6.5

This is evidence in support of my claim that On/Off is quite useless, but +/- is definitely relevant.

[/discussion about a reply]
Crow replied with an irrelevant comment using highly filtered lineup data (low sample size) and implied that one lineup with LeBron being +13 is somehow bad because that lineup without LeBron was +42. This comment reinforced my sentiment that Off-Court +/- is useless.

https://x.com/bballstrategy/status/1895 ... 73909?s=46

I replied by repeating that Off-Court +/- is largely irrelevant, and that combining On-Court +/- (useful) with Off-Court +/- (useless) creates circumstantial evidence (LeBron's RAPM is negative) that is relatively meaningless when determining the veracity of the claim that LeBron is an All-NBA caliber player at age 40 (which should be quite obvious to NBA fans who watch the games).

[/end of discussion about a reply]

Beginning of my 7-tweet thread (please read it in its entirety before responding):

https://x.com/sportsandmath1/status/189 ... 52605?s=46

"""

Quick rant on why RAPM is inherently plagued with noise:

It weighs Off-Court +/- (doesn't correlate to impact aka noise) quite heavily alongside On-Court +/- (does correlate to impact)

while ignoring PTS, FGA, and other stats that actually are related to impact.

--

When the "gold standard" of RAPM with box prior (EPM) says DFS > LeBron,

the problem isn't luck, it's that RAPM has nothing to do with impact.

It's just the answer to a linear algebra problem requiring 500³ operations, which intrigues mathematicians but isn't useful in practice

--

So to all my fellow analytics nerds,

Please stop viewing RAPM as some elegant approach to solve basketball.

It's just mixing quality data (On-Court +/-) with unrelated garbage (Off-Court +/-)

with a veneer of sophistication.

There's no needle in the haystack (of +/- data).

--

ChatGPT explanation:

- Basically, if you're still a believer in RAPM, the regularization needs to be turned up way higher to tune out the noise, which brings the values much closer to zero.

Then it may be marginally more useful than +/-.

--

The high regularization version of RAPM would essentially be a scaled down version of +/-

that still needs to be combined with stats related to impact (not RAPM) like PTS, FGA, etc.

--

RAPM is not a valid substitute for Impact.

That's why trying to predict RAPM (less related to Impact) using box score stats (more related to Impact) isn't a smart idea.

--

So if you want to train an RAPM model, the optimal regularization parameter is not the one that best predicts RAPM on unseen data

but rather something that's more related to Player Impact like unseen game results when combined with the box score.

----

End of thread
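
For anyone who wants to experiment with the regularization argument in the thread above, here is a minimal sketch, not anyone's production RAPM. It assumes synthetic stint data, a plain ridge regression on a +1/-1 on-court design matrix, and hypothetical helper names (`make_stints`, `ridge_rapm`, `pick_lambda`). As a simplification, lambda is picked by error on held-out stint margins only; combining with the box score is left out. The one thing it shows that holds for any ridge fit: as lambda grows, the coefficients approach X'y / lambda, i.e. each player's raw on-court +/- scaled down.

```python
import numpy as np

def make_stints(n_players=30, n_stints=2000, seed=0):
    """Synthetic stints: 5 players per side, +1 for team A, -1 for team B."""
    rng = np.random.default_rng(seed)
    true_impact = rng.normal(0.0, 2.0, n_players)  # hypothetical "true" values
    X = np.zeros((n_stints, n_players))
    for i in range(n_stints):
        on_court = rng.choice(n_players, size=10, replace=False)
        X[i, on_court[:5]] = 1.0
        X[i, on_court[5:]] = -1.0
    y = X @ true_impact + rng.normal(0.0, 12.0, n_stints)  # noisy stint margin
    return X, y

def ridge_rapm(X, y, lam):
    """Closed-form ridge solution: (X'X + lam*I)^-1 X'y."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def pick_lambda(X, y, lambdas, train_frac=0.7):
    """Choose lambda by squared error on held-out stints (not on RAPM itself)."""
    cut = int(len(y) * train_frac)
    errors = []
    for lam in lambdas:
        beta = ridge_rapm(X[:cut], y[:cut], lam)
        errors.append(np.mean((X[cut:] @ beta - y[cut:]) ** 2))
    return lambdas[int(np.argmin(errors))], errors

X, y = make_stints()
raw_plus_minus = X.T @ y  # each player's summed on-court margin
lambdas = [1.0, 10.0, 100.0, 1000.0, 1e6]
best_lam, errors = pick_lambda(X, y, lambdas)
beta_big = ridge_rapm(X, y, 1e6)
print("best lambda on held-out stints:", best_lam)
print("corr(high-lambda RAPM, raw +/-):", np.corrcoef(beta_big, raw_plus_minus)[0, 1])
```

Which lambda wins on held-out data depends entirely on the synthetic noise level chosen here, so the printed "best lambda" says nothing about real NBA data.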

Follow-up reply chain

Final tweet, then scroll up:
https://x.com/sportsandmath1/status/189 ... 57309?s=46

Key points:
1. My player rankings explain over 90% of variance in player rankings. This approach can be iterated to hopefully approach 95%

To clarify, I'm referencing the NBA Math Crystal Basketball rankings from 2018 (train), 2019 (train), 2020 (test), and 2021 (test), which ranked all ~500 players on a 1-12 scale by aggregating the opinions of 10-15 people (such as Ben Taylor).

2. No one can predict RAPM reliably since it's plagued with noise.

3. RAPM isn't meaningful on its own (see thread for why).
It needs to be combined alongside the box score (most notably INDIVIDUAL PTS and FGA)
to best predict impact (again RAPM ≠ Impact)

4. Training the box score to predict RAPM is foolish. It's like going into a room blindfolded and with earplugs, and trying to imagine what the room looks like based solely on distance data.

That's why you get nonsense like DFS > LeBron

5. It's all in the thread. Your comments come across as if you're ignoring the points in the thread and repeating your talking points.

I've suggested how to improve RAPM in the thread:

Namely to use it in ensemble with the box score to predict OOS games (rather than OOS RAPM)

I assume an approach based on empiricism would yield a much higher regularization (lambda) parameter, which would tune out most of the off-court noise that's currently added to RAPM and make it much more similar to +/-.

6. The main reasons I prefer +/- to RAPM

a. It's a box score statistic capturing everything in and outside the individual stats.

b. It's additive and not biased (the regularization in RAPM introduces bias)

c. It doesn't claim to be an all-in-one metric the way RAPM does despite the inherent limitation of solely relying on +/- data (just one of 10+ factors).

d. It's plainly obvious that ranking N=500 players shouldn't require O(N³) = 125 million computations.

That's what makes it obvious that RAPM is made by/for linear algebra nerds not NBA nerds.

It got repackaged into NBA All-in-one metrics that are quite meaningless.

If you have the time, you should be able to use my ranking engine (https://sportsandmath1.github.io/RankingEngine/) to rank a random subset of 50 NBA players in 5-10 minutes.

Then try comparing the correlation of the resulting Elo ratings with my player rankings vs some form of RAPM. It's almost certain they'll correlate more highly with mine for 99% of NBA fans/analysts.

The 1% whose rankings correlate better with RAPM are fully on the RAPM Kool-Aid, and I'd respect that we simply have a difference of opinion.

But for the other 99% there's some inherent dissonance between how they rank players and how RAPM does.
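
The comparison described above could be sketched like this. It's a hypothetical illustration: every number is made up, and `ranks` and `spearman` are small helpers written here (assuming no tied values), not part of the ranking engine.

```python
def ranks(values):
    """Rank values from 1 (largest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out

def spearman(a, b):
    """Spearman rank correlation via the no-ties formula 1 - 6*sum(d^2)/(n(n^2-1))."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Made-up example values for five hypothetical players (not real data).
elo_from_votes = [1620, 1580, 1510, 1470, 1400]
metric_a = [9.1, 8.7, 6.0, 6.5, 4.2]   # e.g. one set of player rankings
metric_b = [2.0, 3.5, 3.1, -0.5, 1.0]  # e.g. some RAPM variant

print("Elo vs metric A:", spearman(elo_from_votes, metric_a))
print("Elo vs metric B:", spearman(elo_from_votes, metric_b))
```

A rank correlation is used here because Elo ratings and player metrics sit on different scales; only the ordering is compared.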

Quoted tweet re the ranking engine:

https://x.com/sportsandmath1/status/189 ... 19467?s=46

Hopefully that was enough to understand my perspective on RAPM and start a discussion on how we can improve the future of NBA Player Rankings by shifting away from viewing RAPM as a magical oracle back to what it is: a tool to adjust raw +/- data that should be added to our toolbox but not worshipped as a measure of the ground truth.

Crow · Post by **Crow** » Sat Mar 01, 2025 3:53 pm

Will watch & read. Not planning on saying more here than I already did there, at this time. A few more twitter comments to catch up with overnight responses.

I do encourage RAPM authors and serious analysts of it to engage at some level. On specific technical or philosophical points or in general.

Mike G · Post by **Mike G** » Sun Mar 02, 2025 12:09 am

You might browse this forum and see what others have already revealed about their systems.
I'm not sure how many will go to X, and I will not.

I checked your ranking optimizer, entered "LeBron, DFS", and it tells me they are equal at 1500.
It doesn't tell me what that number indicates or who DFS is.

Crow · Post by **Crow** » Sun Mar 02, 2025 1:15 am

DFS is Dorian Finney-Smith.

To my understanding:

1500 is just an arbitrary starting point for player quality.

Use the slider at the bottom to say whether, in your opinion, the player on the left is better (>) or worse (<) than the other, by a little or a lot.

Over time the score of the player will change based on the user votes by this method.
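
A minimal sketch of how such a rating could move with votes, assuming a standard Elo update. The K-factor of 32, the 400-point scale, and the mapping of slider positions to scores between 0 and 1 are assumptions for illustration; the ranking engine's actual update rule is not stated in this thread.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A is preferred over B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated (A, B) ratings after one vote.

    score_a is 1.0 if A is much better, 0.5 for a toss-up, 0.0 if B is much
    better; in-between values (e.g. 0.75) stand in for "a little better".
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta

# Both players start at the arbitrary 1500 baseline.
lebron, dfs = 1500.0, 1500.0
lebron, dfs = elo_update(lebron, dfs, score_a=1.0)  # one vote: left player much better
print(lebron, dfs)  # prints 1516.0 1484.0
```

With equal starting ratings the expected score is 0.5, so a single decisive vote moves each player by half of K.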

TeemoTeejay · Post by **TeemoTeejay** » Sun Mar 02, 2025 12:20 pm

sportsandmath1 wrote: ↑Sat Mar 01, 2025 2:49 pm -

I posted a long response while sleep deprived and the site ate it because I forgot to save it as a draft, so I'll just do a fuller response in a few days instead. I somewhat get where you're coming from, even though I generally think you're pretty off base with a lot of your criticisms; it's far more an issue of these numbers being interpreted as something they are not. But you're being way too mean on Twitter to Crow for what essentially is a linear model that just replicates how human beings see a box score lol

Crow · Post by **Crow** » Mon Mar 03, 2025 5:30 pm

Next season, maybe LeBron, Darko or some form of RAPM will be added to the team win projection contest. Room for new statistically based metric entries as well.

Projection is one test. Side by side retrodiction another. For those interested in either.

A few retrodictions have been done in past, mostly elsewhere but might be one or more in a back thread.

The projection contest records are here, such as they are.

Crow · Post by **Crow** » Mon Mar 03, 2025 5:41 pm

Unfortunately, how much RAPM authors read here and respond is uncertain.

TeemoTeejay · Post by **TeemoTeejay** » Mon Mar 03, 2025 5:57 pm

Crow wrote: ↑Mon Mar 03, 2025 5:30 pm Next season, maybe LeBron, Darko or some form of RAPM will be added to the team win projection contest. Room for new statistically based metric entries as well.

Projection is one test. Side by side retrodiction another. For those interested in either.

A few retrodictions have been done in past, mostly elsewhere but might be one or more in a back thread.

The projection contest records are here, such as they are.

DARKO would be in first place, and LEBRON/EPM would be in 5th or 6th iirc. LEBRON and EPM were "predictive versions", so multiyear predictive with age adjustments, as components of a greater model iirc. For DARKO I'm not sure if it was purely the metric or a larger model, but there were preseason adjustments.

Crow · Post by **Crow** » Mon Mar 03, 2025 6:32 pm

You've repeated these expectations.

If they don't exist to present, I am not buying it, yet.

I'll then say: it would be possible to run using published data at NBARAPM.com and some minutes set. If you want to try to prove or disprove the claims.

The minutes set is a major element, so ideally several would be tested to judge impact with these metrics and others.

BPM has been included in the contest. Would / will Darko or LeBron beat BPM with same / Pelton's minute set? I don't think they've all been scored in "the contest" in the same season before.

Test theories / claims.

Same suggestions for several versions of RAPM.

If interested enough.

And then do tests for a half dozen seasons or more.

One strong performance is not enough to "prove" anything. Darko appeared to win once, though not as an official entry. And was awful the other time.

A variety of entries have won, metric based and not, once or in several cases multiple times.

Darko could indeed do well again but it is not certain imo. It won its own metric comparison test at its beginning. Same for EPM. Would find a new neutral comparison worthwhile. There was a neutral study years ago, but perhaps it was before these 2 arrived. The Sports Skeptic or Skeptical Sports or whatever. Not looking it up at the moment, though I and perhaps others "should".

My 3 wins came from entering (since the beginning) and using a multi-projection blend (though history suggests I should have been broader, at least in most seasons) and then active subjective adjustment (better some seasons than others, but trying pretty hard to be different). Beyond wins I had a good number of close to top or top thirds. And a few clunkers. There will be variance even with reasonable approaches. Seasons have some to many surprises.

TeemoTeejay · Post by **TeemoTeejay** » Mon Mar 03, 2025 11:43 pm

I'm not allowed to post MAMBA anymore, I believe, or at least I don't think I'm allowed to update it. But I do wonder how it would have done if I had either done a real projection, like some of the other metric-based predictions do, with age adjustments and wins a bit more regressed to the mean, or made the RAPM itself better.

Mike G · Post by **Mike G** » Tue Mar 04, 2025 1:18 am

... a recent article on ESPN by Zach Kram, with a headline suggesting LeBron is "unlucky" this season,

I saw the headline but not the article. Can someone summarize?

Crow · Post by **Crow** » Tue Mar 04, 2025 1:28 am

Team and opponent 3pt success on / off hurting LeBron on team results. Call it all random variance or only partial or just unknown.

Calling it noise is speculation. Likely some or quite a bit to it but unproven speculation.

3pt "bad luck" is unprecedented in LeBron's career and almost unprecedented for anyone... but it has been this bad for somebody.

No solid understanding how much is actually him doing anything less well than others at this time. It can't be ruled out.

Mike G · Post by **Mike G** » Tue Mar 04, 2025 8:17 pm

Maybe he doesn't cover the arc so well, and he also sets the tone for others?
Maybe when AD is not on the floor, LeBron becomes the man in the middle?

Where does one find 3FG% on and off?
I find the LAL 'lineups' page completely refutes my 2nd guess above.
Lakers-minus-opponent 3fg% with LeBron; with and without Anthony Davis; and with Finney-Smith (briefly):

Code: Select all

5-man   min   3fg%
w AD    636  -.022
wo AD   208  +.023
DFS      56  +.015
tot     844  -.011

4-man   min   3fg%
w AD   1749  -.028
wo?AD   842  +.009
DFS     129  +.084
tot    2591  -.016

3-man   min   3fg%
w AD   2202  -.037
wo?AD  2852  -.010
DFS     268  +.048
tot    5054  -.022

https://www.basketball-reference.com/te ... 5/lineups/
DFS appears twice in the most-used 5-man lineups w LeBron; once each in the 4- and 3-man.
For whatever reason, they did not have good 3-point results with LeBron and AD. So they went out and got a good perimeter defender, who can also make that shot.

? - The 4-man and (esp.) 3-man "wo?AD" simply means he's not listed, so he may or may not have played any given minutes with the others. These lineups overlap a good deal (see total min.) and the w-wo difference diminishes from .045 to .037 to .027 as the separation is muddled.
LeBron has played 1911 minutes to date.
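
The "tot" rows in the 5-man and 4-man tables are consistent with a minutes-weighted average of the "w AD" and "wo AD" rows. A quick check using only the numbers posted above (that the totals were computed this way is an assumption):

```python
def weighted_diff(rows):
    """Minutes-weighted average of (minutes, 3fg% differential) pairs."""
    total_min = sum(m for m, _ in rows)
    return total_min, sum(m * d for m, d in rows) / total_min

five_man = [(636, -0.022), (208, 0.023)]   # w AD, wo AD
four_man = [(1749, -0.028), (842, 0.009)]  # w AD, wo?AD

for label, rows in (("5-man", five_man), ("4-man", four_man)):
    minutes, diff = weighted_diff(rows)
    print(f"{label}: {minutes} min, {diff:+.3f}")
```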

TeemoTeejay · Post by **TeemoTeejay** » Wed Mar 05, 2025 8:15 am

LeBron was just unserious on defense till the vacation and ramped it up when AD got hurt vs Philly. There's some noise outside of that, but he did get a lot better defensively as the year went on and now is probably among the team's best on that end. He is also 7 billion years old, so it makes sense he wasn't exerting himself there with AD.

Mike G · Post by **Mike G** » Wed Mar 05, 2025 7:15 pm

LeBron has missed 4 games.

Code: Select all

date    opp  3fg 3fga  3fg%       Lakers   3fg%
12.08   Por   9   36   .250      13   34   .382
12.13   Min   9   32   .281      10   35   .286
12.28   Sac  11   40   .275      14   26   .538
2.08    Ind  12   38   .316      11   33   .333
totals:      41  146   .281      48  128   .375

This represents .067 of the season so far. The 3fg% advantage to LA in this sample is .094
Multiply those and it's .006 for the season.
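
The arithmetic above can be reproduced from the table. In this sketch the 0.067 share of the season is taken as stated in the post rather than derived:

```python
opp_made, opp_att = 41, 146   # opponents' 3-pointers in the 4 games LeBron missed
lal_made, lal_att = 48, 128   # Lakers' 3-pointers in the same games

opp_pct = opp_made / opp_att
lal_pct = lal_made / lal_att
advantage = lal_pct - opp_pct            # Lakers 3fg% minus opponent 3fg%

season_share = 0.067                     # fraction of the season, as stated in the post
season_effect = season_share * advantage

print(f"opp {opp_pct:.3f}, LAL {lal_pct:.3f}, advantage {advantage:.3f}")
print(f"season-level effect: {season_effect:.3f}")
```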

APBRmetrics
