Continuation of one metric prediction test discussion

Crow
Posts: 6250
Joined: Thu Apr 14, 2011 11:10 pm

Continuation of one metric prediction test discussion

Post by Crow » Sun Dec 07, 2014 7:55 pm

Can use this if you want, or not.

talkingpractice
Posts: 194
Joined: Tue Oct 30, 2012 6:58 pm
Location: The Alpha Quadrant

Re: Continuation of one metric prediction test discussion

Post by talkingpractice » Sun Dec 07, 2014 8:49 pm

my two cents, then ill again shush and get out of this continuing conversation.

there's only one correct way to do this ->

1, on day x, anyone participating would submit player values for day x+1.
2, these values would then be used to predict the games of day x+1. since the point is to compare metrics, youd just use HCA=3 for all teams, and ignore rest day effects. actual realized minutes played from day x+1 would be used.
3, repeat for all remaining days in season.
4, lowest RMSE vs the actual final scores is the best metric.

simple as that.
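
in code, steps 2 and 4 might look something like this (a minimal sketch, not anyone's actual implementation; `values`, `minutes`, and the per-100-possession scaling are assumptions i'm making for illustration):

```python
import math

def predict_margin(home_players, away_players, values, minutes, hca=3.0):
    """Step 2: predict one game's home margin from the day-x player
    values, the actual realized minutes from day x+1, and a flat HCA of 3."""
    def team_net(players):
        # minutes-weighted sum over the 240 player-minutes in a game;
        # assumes each value is an on-court impact rating per ~100 poss.
        return sum(values[p] * minutes[p] for p in players) / 240.0
    return team_net(home_players) - team_net(away_players) + hca

def season_rmse(predicted, actual):
    """Step 4: lowest RMSE vs. the actual final margins wins."""
    sq_errs = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(sq_errs) / len(sq_errs))
```

step 1 just means a fresh `values` set gets submitted every day, and step 3 accumulates `predicted` and `actual` across the remaining schedule.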

this ignores tons of important issues like time of season. obv a pure in-season NPI RAPM model would do awful in predicting games on the 3rd day of the season. obv it would do very well at the end of the season. etc.

i dont know why this has been argued in a weird back-and-forth bash for over a month, when the correct way in which to do this test is rather clear. also, of the 5 fully disclosed public domain metrics, "everyone already knows" that RPM will be best, BPM/EZPM will do well, WS will do meh, WP will do awful, etc. presumably, guys with random/nondisclosed blends and whatnot wouldnt be allowed to participate, as that would make no sense.

permaximum
Posts: 413
Joined: Tue Nov 27, 2012 7:04 pm

Re: Continuation of one metric prediction test discussion

Post by permaximum » Sun Dec 07, 2014 9:55 pm

talkingpractice wrote:my two cents, then ill again shush and get out of this continuing conversation.

there's only one correct way to do this ->

1, on day x, anyone participating would submit player values for day x+1.
2, these values would then be used to predict the games of day x+1. since the point is to compare metrics, youd just use HCA=3 for all teams, and ignore rest day effects. actual realized minutes played from day x+1 would be used.
3, repeat for all remaining days in season.
4, lowest RMSE vs the actual final scores is the best metric.

simple as that.
Completely agreed.
talkingpractice wrote:this ignores tons of important issues like time of season. obv a pure in-season NPI RAPM model would do awful in predicting games on the 3rd day of the season. obv it would do very well at the end of the season. etc.

i dont know why this has been argued in a weird back-and-forth bash for over a month, when the correct way in which to do this test is rather clear. also, of the 5 fully disclosed public domain metrics, "everyone already knows" that RPM will be best, BPM/EZPM will do well, WS will do meh, WP will do awful, etc. presumably, guys with random/nondisclosed blends and whatnot wouldnt be allowed to participate, as that would make no sense.
Completely disagreed.

1. The test was supposed to be about pure box-score metrics. I don't get why you gave the NPI RAPM example. You don't need a full season to run prediction tests on a box-score metric, though waiting for the full season is optimal.

2. If everyone knows that RPM is the best, BPM does well, and WP is awful, why do I still see WS and other metrics when someone tries to back up his opinions on players?

3. Blends are not allowed to participate because... they win by default, eh? If the answer is yes, why don't we use them? Thank god last year's prediction contest winner used a blending approach, and this year's winner is going to be Crow, who uses a blend just like last year's winner. So I don't need to prove my words, because that's what I always try to do.

Those 4 steps for finding the best metric are great. We don't need any restrictions. Use anything this universe offers you, whether it's Vegas odds, other people's predictions, or your own eyes. Let's even forget about the box-score-only restriction and just focus on prediction.

Think about this for a second. I'm going to bet on a team tonight... Which metric/blend/rating (whatever you call it) should I use? We should be looking for that answer.

talkingpractice
Posts: 194
Joined: Tue Oct 30, 2012 6:58 pm
Location: The Alpha Quadrant

Re: Continuation of one metric prediction test discussion

Post by talkingpractice » Mon Dec 08, 2014 12:20 am

i sorta agree with some of your comments in the 2nd part here.

i think one issue we still disagree on is that if non public or blended metrics are used, its inevitable that people will be doing things to improve predictions. iow, its really simple to take a metric and make a 'blend' metric by simply qualitatively adjusting some guys that appear to be off ->

1, brow's RPM coming into the season was rather obviously too low. so people using some sort of RPM/etc blend could probably have known to give a qualitative bump to brow's rating from day 1.
2, you could do similar things by removing much of the 2014 season data for guys like shaqlemore, schroder, etc who all have obv gotten much better this season.
3, indiana plays atlanta tomorrow. based on matchups, id argue that hibbert may be a bit worse tmrw than normal (see last years round 1 of the playoffs). so if i was doing this with 'hidden' numbers, id take some off of roy for tomorrow, and then add it back for their next game.

iow, if people are submitting 'blind' numbers (ie blends, or private metrics), then it would be impossible to know what sort of qualitative adjustments are going into the numbers. so even if someone was willing to run this contest (the correct way) for everyone, the moderator would need to basically be calcing the numbers on his own, or pulling them from a site (bbref for dsmok, espn for je) where it would be clear that qualitative adjustments arent being used.

i also need to quickly say (and ive said this ad infinitum on this site over the years) that player values are a very small part of the wagering battle. everyone should feel free to enter the market and try to make profits based on 'better' player values than RPM or BPM. that is highly likely to not end well for said person.

one other thing is that 1230 games isnt enough for this test, so youd have to do this for 2-3 years to really know which metric is best.
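
a toy simulation shows why (assuming per-game prediction errors are roughly normal with an SD around 11 points, which is in the ballpark of typical results vs. the spread; all the specific numbers here are illustrative):

```python
import math
import random

def one_season_rmse(error_sd, n_games=1230):
    # RMSE over one simulated season for a metric whose per-game
    # prediction errors are drawn from N(0, error_sd)
    errs = [random.gauss(0, error_sd) for _ in range(n_games)]
    return math.sqrt(sum(e * e for e in errs) / n_games)

random.seed(0)
trials = 2000
# metric A is genuinely better (11.0 vs 11.3 points of error), but...
a_wins = sum(one_season_rmse(11.0) < one_season_rmse(11.3)
             for _ in range(trials))
print(a_wins / trials)  # typically ~0.8: the worse metric still
                        # "wins" a full 1230-game season fairly often
```

this treats the two metrics' errors as independent, which they wouldn't be on the same slate of games, so if anything it overstates the noise; but the order of magnitude is the point.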

xkonk
Posts: 294
Joined: Fri Apr 15, 2011 12:37 am

Re: Continuation of one metric prediction test discussion

Post by xkonk » Mon Dec 08, 2014 1:00 am

I'm not sure I see what the problem is with blends or people who nudge their numbers. As long as the predictions are made completely out of sample, why couldn't we run the contest and find out that "talkingpractice's intuition" is the best player ranking system?

talkingpractice
Posts: 194
Joined: Tue Oct 30, 2012 6:58 pm
Location: The Alpha Quadrant

Re: Continuation of one metric prediction test discussion

Post by talkingpractice » Mon Dec 08, 2014 1:20 am

xkonk wrote:I'm not sure I see what the problem is with blends or people who nudge their numbers. As long as the predictions are made completely out of sample, why couldn't we run the contest and find out that "talkingpractice's intuition" is the best player ranking system?
i agree with this. but then it wouldnt be a contest between metrics anymore.

i guess my feeling is that in essence, this is a "try to beat RPM or BPM" contest. and nearly everyone who claims they have a private "metric" is really going to be using "metric plus intuition" if they simply submit numbers. so itd be unfair to compare people doing that to RPM or BPM.

DSMok1
Posts: 905
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Continuation of one metric prediction test discussion

Post by DSMok1 » Mon Dec 08, 2014 2:38 am

I agree with everything talkingpractice is saying here.

I would be interested in seeing how much better blends could be, though, as long as the blends were being tested on years they were not trained on.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1

Mike G
Posts: 4429
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Continuation of one metric prediction test discussion

Post by Mike G » Mon Dec 08, 2014 11:04 am

If the reigning chess champion is a computer, do all competitors have to be computers? We don't know what goes into a human's reactions, so are they not valid for competition?

Whoever or whatever wins a competition, that entity may then be evaluated for how and why it/he won. To some of us it's more interesting if there are more competitors. Conversely, if a more organic "metric" predicts better than a purely hypothetical one, it's still going to be better whether or not it was included in your strict definition of "valid entries".

There's not much motivation to win a contest, and then have it discovered that you fudged the numbers -- is there?
talkingpractice wrote: my two cents, then ill again shush and get out of this continuing conversation.
You may as well not say this any more. :)

talkingpractice
Posts: 194
Joined: Tue Oct 30, 2012 6:58 pm
Location: The Alpha Quadrant

Re: Continuation of one metric prediction test discussion

Post by talkingpractice » Mon Dec 08, 2014 4:57 pm

Mike G wrote:
my two cents, then ill again shush and get out of this continuing conversation.
You may as well not say this any more. :)
:lol:

Crow
Posts: 6250
Joined: Thu Apr 14, 2011 11:10 pm

Re: Continuation of one metric prediction test discussion

Post by Crow » Mon Dec 08, 2014 5:16 pm

Adjusting words spoken, blending positions and intent, is sometimes what happens. Again and again. Linear single-mindedness may often make sense, but it is not the only option.

sndesai1
Posts: 133
Joined: Fri Mar 08, 2013 10:00 pm

Re: Continuation of one metric prediction test discussion

Post by sndesai1 » Mon Dec 08, 2014 5:54 pm

I don't see an issue with using strict percentage blends in this test, as long as the percentages are disclosed.

However, I don't yet understand the purpose of using values that have completely subjective adjustments made to individual players here. We already have the annual predictions contest (and can do a retrodiction contest in the same vein) where that is a great and often winning approach.

I always thought the purpose of Neil's test was to rank methodologies that can be widely applied, not to just be able to recognize that a metric may undervalue a particular player and simply add 0.6 to their rating in order to "beat" that metric.

willguo
Posts: 26
Joined: Mon Nov 03, 2014 7:18 am

Re: Continuation of one metric prediction test discussion

Post by willguo » Tue Dec 09, 2014 9:22 am

talkingpractice wrote: 4, lowest RMSE vs the actual final scores is the best metric.
Why RMSE? If you frame it as a game prediction exercise, then the obvious answer is to score against the spread. That is actually relevant to something, whereas lowest RMSE is not.

You are removing a huge element of these one-number metrics: standard error. Knowing a number without a confidence interval is useless.

You also need to be punished for crossing 0: thinking A wins by 1 when they win by 4 is not nearly as damaging as thinking A wins by 1 when B wins by 1 instead.
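
One arbitrary way to encode that asymmetry, for concreteness (the flat `cross_penalty` value is purely illustrative, not calibrated):

```python
def sign_aware_loss(pred_margin, actual_margin, cross_penalty=10.0):
    """Squared error plus a flat penalty whenever the prediction
    picks the wrong winner (predicted and actual margins differ in sign)."""
    loss = (pred_margin - actual_margin) ** 2
    if pred_margin * actual_margin < 0:  # crossed 0: wrong team picked
        loss += cross_penalty
    return loss

# "A by 1" when A wins by 4 -> 9.0; "A by 1" when B wins by 1 -> 14.0,
# so the sign flip costs more despite the smaller absolute miss
print(sign_aware_loss(1, 4), sign_aware_loss(1, -1))
```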
permaximum wrote: Think about this for a second. I'm going to bet on a team tonight... Which metric/blend/rating (whatever you call it) should I use? We should be looking for that answer.
That would include interactions between teammates and opponent matchups, and is not the same thing at all as a one-metric prediction contest.
talkingpractice wrote: this ignores tons of important issues like time of season. obv a pure in-season NPI RAPM model would do awful in predicting games on the 3rd day of the season. obv it would do very well at the end of the season. etc.
The correct blend would be dependent on time of season, and that has to be factored in (assuming the use of priors is not allowed).
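
A sketch of that idea; the logistic ramp, the 41-game midpoint, and the 10-game scale are all arbitrary assumptions, not a fitted schedule:

```python
import math

def in_season_weight(games_played, midpoint=41, scale=10.0):
    """Weight on the in-season (e.g. NPI RAPM-style) rating: near 0 on
    day 3 of the season, near 1 by the end. Parameters are assumptions."""
    return 1.0 / (1.0 + math.exp(-(games_played - midpoint) / scale))

def blended_rating(stable_value, in_season_value, games_played):
    """Shift weight from a stable (box-score) rating toward the
    in-season rating as the sample grows."""
    w = in_season_weight(games_played)
    return (1 - w) * stable_value + w * in_season_value
```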

permaximum
Posts: 413
Joined: Tue Nov 27, 2012 7:04 pm

Re: Continuation of one metric prediction test discussion

Post by permaximum » Tue Dec 09, 2014 1:08 pm

willguo wrote:
permaximum wrote: Think about this for a second. I'm going to bet on a team tonight... Which metric/blend/rating (whatever you call it) should I use? We should be looking for that answer.
That would include interactions between teammates and opponent matchups, and is not the same thing at all as a one-metric prediction contest.
Although you're actually right, one metric will be the winner one way or another. And that will probably mean it's less dependent on teammates and opponents. We should be looking for that factor too.

Anyway, I'm curious: what do you suggest I use when I bet on tonight's games? Besides my gut feeling, you know. That's the ultimate question that interests me. I have a feeling you know a few things about this.

Edit:

IMO the best metric should help me beat the Vegas odds when combined with HCA + rest (+ a coach factor, though I'm still not sure about that one).

AcrossTheCourt
Posts: 237
Joined: Sat Feb 16, 2013 11:56 am

Re: Continuation of one metric prediction test discussion

Post by AcrossTheCourt » Tue Dec 09, 2014 6:00 pm

Here's one issue I actually have: using a metric trained on 2001-2014 data to tell you the value of certain stats decades earlier. Given all the huge changes the league has seen (the way teams are defended, handchecking, outside shooting, the lesser importance of post scoring, etc.), I don't think that's ideal. So while everyone is complaining about out-of-sample vs. in-sample testing, I think there's still an issue with the sample being too far removed or different. Of course, a test like this can help identify these problems, but not in detail. It'd be interesting to see which players in the past are being systematically underrated because their type is more valuable.

Also, here's why we shouldn't use blends for this: we already know blends do better than individual metrics. This is not new information. Thus, before a blend is even built, we should test which metrics do best on their own, to see, for example, which ones are worth collecting.

talkingpractice
Posts: 194
Joined: Tue Oct 30, 2012 6:58 pm
Location: The Alpha Quadrant

Re: Continuation of one metric prediction test discussion

Post by talkingpractice » Tue Dec 09, 2014 6:48 pm

AcrossTheCourt wrote:Here's one issue I actually have: using a metric trained on 2001-2014 data to tell you the value of certain stats decades earlier. Given all the huge changes the league has seen (the way teams are defended, handchecking, outside shooting, the lesser importance of post scoring, etc.), I don't think that's ideal.


agreed. we dont go back anywhere near 2001 on any of our stuff. it just doesnt apply much at all to tomorrow.
AcrossTheCourt wrote:Also, here's why we shouldn't use blends for this: we already know blends do better than separate metrics. This is not new information. Thus, before a blend is even built, we should see which metrics do best on their own separately to see, for example, which ones are worth collecting.
agreed.
willguo wrote:
talkingpractice wrote: 4, lowest RMSE vs the actual final scores is the best metric.
Why RMSE? If you frame it as a game prediction exercise, then the obvious answer is to score against the spread. That is actually relevant to something, whereas lowest RMSE is not.
You are removing a huge element of these one-number metrics: standard error. Knowing a number without a confidence interval is useless.
You also need to be punished for crossing 0: thinking A wins by 1 when they win by 4 is not nearly as damaging as thinking A wins by 1 when B wins by 1 instead.
i 'think' that i agree with most of this. but i think the RMSE test will pick the same "winner", per se, in 99% of cases. you may be right, tho.
