Shots at RPM

permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Shots at RPM

Post by permaximum »

@mystic

Show me some data and its ridge regression results, then we'll see what's wrong here. Something has to be wrong: either the data or the calculation process. I'm pretty sure there's nothing wrong in my calculation process.
mystic
Posts: 470
Joined: Mon Apr 18, 2011 10:09 am

Re: Shots at RPM

Post by mystic »

Uh? You said you are using glmnet, which means you can test that yourself. Artificially increase the number of possessions for a low-possession player, either via weights or by adding the same lineup data to the raw data a couple of times. You can also plug in the lambda manually: use a higher value (like 100) and a lower value (like 10), then compare the standard deviations of the coefficients.
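Both checks can be tried without any real data. Below is a minimal sketch in base R, assuming a plain weighted ridge in closed form (no intercept, no standardization). This is not glmnet's exact parametrization, so the lambda values are not on glmnet's scale, and the toy design matrix, `ridge_coef` and all variable names are made up for illustration.

```r
# Weighted ridge via the closed form: beta = (X'WX + lambda*I)^-1 X'Wy
# (illustrative helper; NOT how glmnet scales or standardizes things)
ridge_coef <- function(x, y, w, lambda) {
  xtwx <- crossprod(x, w * x)   # X' W X (w multiplies each row of x)
  xtwy <- crossprod(x, w * y)   # X' W y
  drop(solve(xtwx + lambda * diag(ncol(x)), xtwy))
}

set.seed(1)
n_stints <- 400
n_players <- 20
# Toy design: 1 / -1 / 0 entries standing in for on-court (either side) / off-court
x <- matrix(sample(c(-1, 0, 1), n_stints * n_players, replace = TRUE),
            nrow = n_stints)
true_beta <- rnorm(n_players, sd = 3)
y <- drop(x %*% true_beta) + rnorm(n_stints, sd = 10)
w <- sample(1:20, n_stints, replace = TRUE)   # possessions per stint

# 1) Compare the spread of the coefficients for a low and a high lambda
sd_low_lambda  <- sd(ridge_coef(x, y, w, lambda = 10))
sd_high_lambda <- sd(ridge_coef(x, y, w, lambda = 100))
print(sd_low_lambda > sd_high_lambda)

# 2) Doubling one stint's weight vs. adding the same row a second time
w2 <- w
w2[1] <- 2 * w[1]
by_weight <- ridge_coef(x, y, w2, lambda = 10)
by_dup    <- ridge_coef(rbind(x, x[1, ]), c(y, y[1]), c(w, w[1]), lambda = 10)
print(all.equal(by_weight, by_dup))
```

In this simplified setup the two ways of adding possessions give the same coefficients, because both add the same term to X'WX and X'Wy.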
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Shots at RPM

Post by permaximum »

No significant changes. It's not too hard or time-consuming to calculate 2007-08 RAPM, or the following seasons', with basketballvalue.com data. Just do it and let's see what's wrong here.
mystic
Posts: 470
Joined: Mon Apr 18, 2011 10:09 am
Contact:

Re: Shots at RPM

Post by mystic »

https://yourlogicalfallacyis.com/burden-of-proof

I'll leave it at that, because I can't take seriously someone who basically wants to argue that the weights and lambda have no effect on the result of a ridge regression.
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Shots at RPM

Post by permaximum »

Calm down, pal. I didn't say weights or lambda have no effect on the result of a ridge regression. That's demagoguery. I'm just saying players with few possessions can be at the top of the results. If there's something wrong in the process, I'm saying let's find out.

However, not in theory but in practice. If you don't have the time for that, fine. Still, I suggest you let your work do the talking, especially in a forum called APBRmetrics.

While we are talking about letting work do the talking, why didn't the keeper of RPM participate in the 2014-15 prediction contest?

Edit: I'm insisting on this because I saw one case of yours here where you calculated the lambda wrong and skewed the ridge regression results.

For this data: viewtopic.php?t=7919#p11777

You calculated it here: viewtopic.php?t=7919#p11782

So there is only one instance showing your ability to calculate a ridge regression, and you did it wrong. I'm an amateur but you're a professional. You're not supposed to make those kinds of mistakes.
mystic
Posts: 470
Joined: Mon Apr 18, 2011 10:09 am
Contact:

Re: Shots at RPM

Post by mystic »

permaximum wrote: I'm just saying players with few possessions can be at the top of the results. If there's something wrong in the process, I'm saying let's find out.
Nothing is wrong with the process if such a thing happens. Maybe we misunderstood each other here, because I did not argue that such players can't have extreme values.
permaximum wrote: However, not in theory but in practice. If you don't have the time for that, fine. Still, I suggest you let your work do the talking, especially in a forum called APBRmetrics.
Not quite sure what you are even talking about, but well, let us start with you uploading the raw data you used and the R script, in order to check whether "no significant changes" is a truthful answer. Shall we?
permaximum wrote: While we are talking about letting work do the talking, why didn't the keeper of RPM participate in the 2014-15 prediction contest?
No idea, but I also don't see why that should matter.
permaximum wrote: Edit: I'm insisting on this because I saw one case of yours here where you calculated the lambda wrong and skewed the ridge regression results.

So there is only one instance showing your ability to calculate a ridge regression, and you did it wrong. I'm an amateur but you're a professional. You're not supposed to make those kinds of mistakes.
:lol:

Sorry, but that is so much bs, seriously, how would you know that the lambda was "calculated wrong"? I didn't use glmnet for that example, just saying ...
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Shots at RPM

Post by permaximum »

Because when you use the lambda that gives the minimum CV error, you should find coefficients around these values.

Intercept: 8.668964
Fig1 : 0.1984727
Fig2 : 0.1969857
Fig3 : 0.4239485

Let's look at your findings.

(Intercept) 20.2519966
Fig1 0.1828746
Fig2 0.1930801
Fig3 0.2523306

I didn't use glmnet here either, BTW.
colts18
Posts: 313
Joined: Fri Aug 31, 2012 1:52 am

Re: Shots at RPM

Post by colts18 »

J.E., in your statistical plus/minus that is used for RPM, do you split turnovers into live-ball (steals) and dead-ball (all other turnovers)? That should give slightly more credit to players whose turnovers are dead-ball (allowing the defense to recover) rather than stolen (resulting in an easier shot for the opponent).
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Shots at RPM

Post by permaximum »

colts18 wrote: J.E., in your statistical plus/minus that is used for RPM, do you split turnovers into live-ball (steals) and dead-ball (all other turnovers)? That should give slightly more credit to players whose turnovers are dead-ball (allowing the defense to recover) rather than stolen (resulting in an easier shot for the opponent).
I once split them too, along with blocked misses, but found that the difference wasn't worth the trouble. However, it's probably not a "trouble" for him since he always uses PBP data.
mystic
Posts: 470
Joined: Mon Apr 18, 2011 10:09 am
Contact:

Re: Shots at RPM

Post by mystic »

permaximum wrote: Because when you use the lambda that gives the minimum CV error, you should find coefficients around these values.
I just ran the regression on those numbers and found the lambda to be 1.8, and subsequently coefficients close to yours. No idea how I arrived at "17" 2.5 years ago; maybe the result of the CV was 1.7 and I misread it, then plugged in the wrong lambda manually? The old results of the regression are based on a lambda of exactly 17 (which points to that). As you see, even "pros" can make mistakes, and it would be silly to assume that somebody, even well-trained in a subject, would be infallible, especially if something like that is done as quickly as I did it back in 2012.

Anyway, that does not address the issue at hand, where you claimed that a different number of possessions or a change of the lambda wouldn't have an effect on the results of the low-possession players. Are you willing to upload the raw data you used (meaning the already prepared matrix for the regression) as well as your R script, so that I or others can check whether the results would indeed be no different?
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Shots at RPM

Post by permaximum »

Here's the data. And the R script. It is so simple...

Code:

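# Raise the memory cap (Windows only)
memory.limit(6051)
library(doParallel)  # parallel backend for cv.glmnet
library(glmnet)
# Semicolon-separated stint data; the path is elided as posted
rapm <- read.table("c:/users/****/yyyyy/rapm2008.csv", sep=";", header=TRUE)
registerDoParallel(3)  # use 3 worker processes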
x <- as.matrix(rapm[c("O1","O2","O3","O4","O5","O6","O7","O8","O10","O11","O12","O14","O15","O16","O17","O19","O20","O21","O22","O23","O24","O25","O26","O28","O29","O30","O31","O32","O33","O34","O35","O36","O37","O39","O40","O41","O42","O44","O45","O46","O47","O48","O49","O50","O51","O52","O54","O55","O56","O59","O60","O61","O62","O63","O64","O65","O66","O67","O68","O69","O70","O71","O72","O73","O74","O75","O76","O77","O78","O79","O80","O83","O85","O86","O87","O88","O89","O90","O91","O92","O93","O94","O95","O96","O97","O98","O100","O101","O103","O109","O110","O111","O112","O113","O114","O115","O116","O117","O118","O119","O121","O122","O123","O124","O125","O126","O127","O128","O130","O131","O132","O133","O136","O137","O138","O139","O140","O141","O143","O144","O145","O146","O147","O148","O149","O150","O151","O152","O154","O155","O156","O158","O159","O161","O163","O164","O165","O166","O167","O168","O169","O170","O171","O172","O173","O174","O176","O180","O181","O185","O186","O187","O188","O189","O190","O191","O193","O194","O196","O197","O198","O199","O200","O201","O202","O203","O204","O205","O206","O209","O210","O211","O212","O214","O215","O218","O220","O221","O222","O223","O224","O226","O231","O235","O236","O237","O238","O239","O240","O241","O243","O244","O245","O246","O247","O249","O250","O251","O252","O253","O254","O255","O257","O258","O260","O261","O262","O263","O264","O265","O266","O269","O270","O271","O272","O273","O274","O275","O276","O277","O278","O279","O280","O281","O283","O284","O285","O286","O287","O288","O289","O292","O293","O294","O295","O296","O297","O298","O300","O301","O302","O304","O305","O306","O307","O308","O309","O311","O312","O313","O315","O316","O317","O318","O319","O320","O321","O322","O323","O324","O326","O327","O328","O329","O330","O331","O332","O337","O344","O348","O349","O353","O354","O355","O356","O357","O362","O366","O367","O372","O373","O377","O382","O383","O389","O390","O392","O401","O403","O408","O409","O410","O411","O412","O413","O416",
"O417","O418","O419","O426","O430","O433","O436","O439","O444","O450","O453","O454","O458","O459","O460","O467","O468","O470","O472","O477","O483","O484","O486","O488","O489","O492","O501","O505","O508","O510","O514","O515","O516","O517","O518","O525","O531","O537","O541","O544","O550","O551","O552","O553","O555","O556","O557","O558","O559","O560","O561","O562","O563","O565","O567","O569","O572","O573","O574","O575","O577","O578","O579","O580","O581","O585","O588","O589","O592","O593","O594","O595","O596","O597","O598","O599","O600","O601","O602","O605","O606","O607","O608","O609","O610","O612","O614","O617","O622","O629","O636","O640","O643","O644","O646","O650","O653","O655","O659","O660","O661","O662","O663","O664","O665","O666","O667","O668","O669","O670","O671","O672","O673","O674","O675","O676","O677","O678","O679","O680","O681","O682","O683","O684","O685","O686","O687","O688","O689","O690","O691","O692","O693","O694","O695","O696","O697","O698","O699","O700","O701","O702","O703","O704","O705","O706","O707","O708","O709","O710","O711","O712","O713","O714","O715","O716","O717","O718","O719","O720","O721","O722","O723","O724","O725","O726","O727","O728","O729","O730","O731","O732","O801","O802","O803","O804","O805","O806","O807","O808","O809","O810","O811","O812","O813","O814","O815","O816","O817","O818","O819","O820","O821","O822","O823","O824","O825","O826","O827","O828","O829","O830","O831","O832","O833","O834","O835","O836","O837","O838","O839","O840","O841","O842","O843","O844","O845","O846","O847","O848","O849",

y <- as.matrix(rapm[c("Rating")])
fit <- cv.glmnet(x,y,alpha=0,weights=rapm$Poss,nfolds=50,parallel=TRUE)
coef(fit)
Lambda.min = 71.88326
Crow
Posts: 10533
Joined: Thu Apr 14, 2011 11:10 pm

Re: Shots at RPM

Post by Crow »

Lots of choices with RAPM. One could emphasize minimizing errors of the top performers, top salaries, next to be free agents, one conference, one team, one position, offense or defense...
mystic
Posts: 470
Joined: Mon Apr 18, 2011 10:09 am
Contact:

Re: Shots at RPM

Post by mystic »

permaximum wrote:Here's the data. And the R script. It is so simple...
The data link tells me in Turkish that the file has been removed or is currently unavailable.

Also, I'm not exactly sure we are talking about the same issue here, because I wanted to see the script which you used to conclude this:
mystic wrote: You can also plug in the lambda manually: use a higher value (like 100) and a lower value (like 10), then compare the standard deviations of the coefficients.
permaximum wrote: No significant changes.
I'm pretty sure that the posted script is not doing that. Btw, the coefficients you get from coef(fit) are based on lambda.1se, not on lambda.min. Another hint, to get rid of that awfully long part that converts the player columns into the data matrix: you can use brackets if you know how many columns the data contains. Or, if you want to use the script for different files without knowing the exact number of players included, always structure the raw data so that the last two columns contain Poss and Rating; then you can use something like this:

Code:

n<-length(rapm)-2
x<-rapm[1:n]
x<-data.matrix(x)
To check the standard deviation of the coefficient sets calculated with different lambdas, you should have something like this (if you included the n<-length(rapm)-2):

Code:

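# Lower lambda: less shrinkage, so the coefficient SD should be higher
highfit<-glmnet(x,y,weights=rapm$Poss,alpha=0,lambda=10)
highsd<-coef(highfit)
highsd<-sd(highsd[2:(n+1)])  # skip the intercept; 2:n+1 would parse as (2:n)+1
# Higher lambda: more shrinkage, so the coefficient SD should be lower
lowfit<-glmnet(x,y,weights=rapm$Poss,alpha=0,lambda=100)
lowsd<-coef(lowfit)
lowsd<-sd(lowsd[2:(n+1)])
print(highsd>lowsd)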
If on your R console TRUE shows up, this
permaximum wrote: In theory, yes. In practice, no.
is proven wrong. And that's what we are actually talking about, not whether I can correctly run a ridge regression using R.
permaximum
Posts: 416
Joined: Tue Nov 27, 2012 7:04 pm

Re: Shots at RPM

Post by permaximum »

1. The data should be downloadable now.

2. Yes, the script is for calculating RAPM 07-08. However increasing the lambda is so simple. Try 700, 7000, 70000 for lambda.min instead of 70.

3. Yes, I typed the script from memory except for the "x" part. The last line should be coef(fit, s="lambda.min") instead of coef(fit).

4. Thanks for the tip about shortening that line. However, I sometimes remove some players in the code. I don't want to work on the data.

5. Here's your ridge-regression-ready data. Show me those in practice and let me see Kevin Garnett on top with a different lambda instead of those players who have significantly fewer possessions. And I'll admit that you're right... Even if you can do that, with the best lambda you get these results, which have a few low-possession players on top.
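On point 4, leaving players out does not require typing the full column list. Here is a minimal sketch in base R, assuming the player columns are everything except Poss and Rating; the toy data frame, its column names and `drop_players` are hypothetical stand-ins, not the real file.

```r
# Toy stand-in for the stint table (hypothetical values and column names)
rapm <- data.frame(Poss = c(12, 8, 15), Rating = c(110, 95, 102),
                   O1 = c(1, 0, 1), O2 = c(0, 1, 1), O3 = c(1, 1, 0),
                   D1 = c(-1, 0, -1))

drop_players <- c("O2")   # players to leave out of this run

# Keep every column that is neither bookkeeping nor an excluded player
player_cols <- setdiff(names(rapm), c("Poss", "Rating", drop_players))
x <- data.matrix(rapm[player_cols])
print(colnames(x))
```

The data file stays untouched; only `drop_players` changes from run to run.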
mystic
Posts: 470
Joined: Mon Apr 18, 2011 10:09 am
Contact:

Re: Shots at RPM

Post by mystic »

1. I got it.

2. Well, I showed the code in my previous post ...

3. Fine.

4. Well, you could remove the particular players via setting the respective column-name to NULL.

5. It still seems that you don't understand what I actually said and are bringing up a strawman instead. I wrote that the maths tells us that increasing lambda makes it more difficult for the coefficients to get away from 0. We can check that by calculating the respective SD of the coefficient sets for a smaller and a bigger lambda value. If what I say is true, the SD for the set calculated with the lower lambda should be higher than that of the set calculated with the higher lambda.
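The maths behind point 5 can be written out for a plain weighted ridge. This is a sketch under simplifying assumptions (no intercept, no standardization); glmnet standardizes and scales its objective differently, so its lambda is not on this scale, and the symbols below are introduced here for illustration only.

```latex
\hat\beta_\lambda
  = \arg\min_\beta \sum_{i=1}^{m} w_i \,(y_i - x_i^\top \beta)^2
    + \lambda \lVert \beta \rVert_2^2
  = (X^\top W X + \lambda I)^{-1} X^\top W y,
\qquad W = \operatorname{diag}(w_1, \dots, w_m).

% With the thin SVD  W^{1/2} X = U D V^\top  (singular values d_j, left vectors u_j):
\lVert \hat\beta_\lambda \rVert_2^2
  = \sum_j \left( \frac{d_j}{d_j^2 + \lambda} \right)^{2}
    \left( u_j^\top W^{1/2} y \right)^{2}.
```

Every term of the sum decreases as lambda grows, so the coefficient vector as a whole is pulled toward 0. A larger weight w_i (more possessions) increases the d_j for the directions that stint touches, so the same lambda shrinks those directions less. Strictly speaking this is a statement about the norm of the coefficients; the SD comparison is a practical proxy for it.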

Here is the script I just used on your dataset:

Code:

library(glmnet)
matchups<-read.csv("rapm2008.csv",sep=";")
Weights<-matchups$Poss
Weights<-data.matrix(Weights)
y<-matchups$Rating
y<-data.matrix(y)
n<-length(matchups)
x<-matchups[3:n]
x<-data.matrix(x)
highfit<-glmnet(x,y,weights=Weights,alpha=0,lambda=10)
highsd<-coef(highfit)
highsd<-sd(highsd[2:n+1],na.rm=TRUE)
lowfit<-glmnet(x,y,weights=Weights,alpha=0,lambda=100)
lowsd<-coef(lowfit)
lowsd<-sd(lowsd[2:n+1],na.rm=TRUE)
print(highsd>lowsd)
That is the answer on my console:

Code:

[1] TRUE
and then this:

Code:

> highsd
[1] 5.6968
> lowsd
[1] 2.43546
Let us compare that with your statement:
permaximum wrote:No significant changes.
Well, I would say the difference between 5.7 and 2.4 is significant.

Btw, I have no idea which of those IDs is Kevin Garnett or any other player for that matter.